Table of Contents
1 Introduction
1.1 Scope
1.2 Abbreviations
1.3 References
1.4 Open Formats
2 XML and other related standards
2.1 XML
2.2 Office Open XML – ISO/IEC DIS 29500
2.3 ODF – ISO/IEC 26300
3 Discussions on ODF and Open XML
3.1 ODF
3.2 Open XML
4 Conclusion
1. Introduction
1.1 Scope
This report has been prepared to comment on the draft standard ISO/IEC DIS 29500 [6], whose voting process ends at the beginning of September 2007. The standard describes office documents in XML, a standard specification language. This standard and similar ones are particularly important for interoperability between organizations. The report was prepared by the ICT Expert Group of ISO.
In the report, the concepts of open formats, XML, ODF and Office Open XML are explained first; the discussions on the above-mentioned standard and the rationale for the vote are then presented.
1.2 Abbreviations
ANSI | : American National Standards Institute |
AVI | : Audio Video Interleave, a multimedia container format |
ICT | : Information and Communication Technology |
BMP | : Bitmap, an image file format |
ECMA | : European Computer Manufacturers Association |
GML | : Generalized Markup Language |
HTML | : HyperText Markup Language |
IEC | : International Electrotechnical Commission |
IPR | : Intellectual Property Rights |
ISO | : International Organization for Standardization |
ODF | : Open Document Format |
OOXML | : Office Open XML |
SGML | : Standard Generalized Markup Language |
TSO | : Turkish Standards Organization |
W3C | : World Wide Web Consortium |
XML | : Extensible Markup Language |
Y2K | : year 2000 problem |
ZIP | : A file compression and archive format |
1.3 References
1 | A Brief History of the Development of SGML, http://www.sgmlsource.com/history/sgmlhist.htm |
2 | Ecma International, TC45, Explanatory Report on Office Open XML Standard (ECMA-376) Submitted to JTC 1 for Fast-Track |
3 | http://en.wikipedia.org/wiki/Office_Open_XML |
4 | http://www.forumex.net/asp-perl-php-html/35649-xml-ve-xml-uygulamalari.html |
5 | ISO/IEC 26300:2006 Information technology -- Open Document Format for Office Applications (OpenDocument) v1.0 |
6 | ISO/IEC DIS 29500 (ECMA 376:2006 Office Open XML File Formats) |
7 | Mark Johnson, XML for the absolute beginner, http://www.javaworld.com/javaworld/jw-04-1999/jw-04-xml.html |
8 | Michael Morrison et al., XML Unleashed, Sams Publishing, 1999. |
9 | Ralf I. Pfeiffer, Tutorial 1: Overview of XML, http://www4.ibm.com/software/developer/education/tutorial-prog/overview.html |
10 | Richard Anderson et al., Professional XML, Wrox Press Ltd., 2000. |
1.4 Open Formats
At the beginning of the millennium, the countries that had solved their Y2K problems accelerated their e-transformation processes. This resulted in extensive use of the internet all over the world. Over the next decade, commerce is expected to move onto the internet, and the world is rapidly becoming dependent on ICT. The countries that are well developed in ICT usage are also turning their economies into “knowledge economies”. Societies are becoming knowledge societies as classical economies transform into knowledge economies.
Undoubtedly, the basic infrastructure for e-transformation is ICT. Governments and organizations should expand their ICT usage as much as they can and reorganize their business processes accordingly. Business environments are changing to “paperless offices”, while archives are being replaced by their electronic counterparts. Office software such as word processors, spreadsheets, presentation and database applications is becoming a natural part of daily business life.
Today, for all ICT applications, it is almost compulsory
This obligation has resulted in the development of XML¹ (Extensible Markup Language), which has been accepted by all parties without hesitation. XML, which was developed by the W3C (World Wide Web Consortium), became a common language for data interchange and replaced classical EDI. Today, the information and services offered on web sites are largely provided via XML.
Open XML formats need to be standardized for the following major reasons:
2. XML and related standards
2.1 XML
XML stands for Extensible Markup Language. It is a simple and flexible text formatting technology that can be used as a strategic tool in electronic commerce, electronic data interchange, data management and search engines. The structure, content and concepts of data can be represented in a self-contained way, without depending on a platform, a company or a language.
With the development of print publishing, the notes and special symbols that publishers added to instruct the printing presses came to be called “markup”. Markup is the process of marking certain parts of a text to emphasise or distinguish them. The set of marks, rules and grammar used for this purpose is defined as a “markup language”.
Word processors embed many marks in the text to distinguish types, parts and styles. Programming languages use symbols and marks to distinguish functions, data structures and data. Without such delimiters, marks or sets of tags, no application could be developed [4].
The first markup language, GML (Generalized Markup Language), was first used at the end of the 1960s to transfer, share and process texts and documents, as a result of research at IBM. GML was then improved by a group set up within ANSI (American National Standards Institute) in 1978 and was accepted by ISO (International Organization for Standardization) in 1986 as a Standard named SGML (Standard Generalized Markup Language). SGML is a language for defining the semantics and grammar of the languages used in sets of texts and documents. SGML is used as a documentation Standard in US government organisations and in the avionics, automotive and publishing industries. Although it is a very powerful language, SGML has not been widely adopted because of its high development and application cost and its highly complicated structure [1].
In 1989, Tim Berners-Lee and Anders Berglund developed HTML (Hypertext Markup Language), one of the fundamental elements of web applications, to make it easy to share documents over the internet. HTML was developed as an application of SGML; in other words, the structure of the HTML language is defined in SGML. HTML is a language for displaying and formatting information such as headings, fonts, pictures and tables in a Standard form on a computer. The presentation of a document is achieved through the use of marks called tags. The main purpose of the language is to display and present a document in a Standard form. Because HTML was developed only for web browsers, and because of many other restrictions, the XML language was developed afterwards [7].
In 1996, the World Wide Web Consortium (W3C, http://www.w3.org) started to design XML as a simple markup language that would retain the strength and flexibility of SGML. In February 1998, XML 1.0 was published by the W3C as a Standard. XML is a simplified language that includes many features of SGML and is a subset of it. Like SGML, XML is a meta language; in other words, it is used to define the structures of other languages [9].
Like HTML, XML is a language that uses tags. The main difference between HTML and XML is that in XML the tags are used to describe the content of the information rather than its presentation. XML is a meta language: it is used to define other markup languages.
Using XML, an application-specific markup language can be defined for any application to express the content of data and data types. Metadata, or meta information, is information about data. XML tags define metadata about the data they enclose [8].
XML provides an appropriate medium to define and represent various data, concepts and content. For this reason, XML is rapidly becoming widespread as a strategic tool for the definition and transfer of application data in various areas, independent of vendor, language and platform. Some of the main application areas in which XML has been and will be used are given below [10]:
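As a minimal sketch of how tags can carry information about the data they enclose, the following Python snippet parses a small XML fragment with the standard library. The `invoice` vocabulary, its element and attribute names, and the helper `invoice_total` are hypothetical and invented here for illustration only; they are not taken from any of the standards discussed in this report.

```python
import xml.etree.ElementTree as ET

# Hypothetical application-specific markup: the tag names describe what
# each value means, not how it should be displayed.
INVOICE_XML = """<invoice number="2007-001">
  <customer>Example Ltd.</customer>
  <item quantity="3" unit-price="12.50">Toner cartridge</item>
  <item quantity="1" unit-price="99.00">Laser printer</item>
</invoice>"""


def invoice_total(xml_text: str) -> float:
    """Sum quantity * unit price over all <item> elements."""
    root = ET.fromstring(xml_text)
    return sum(
        int(item.get("quantity")) * float(item.get("unit-price"))
        for item in root.findall("item")
    )


root = ET.fromstring(INVOICE_XML)
customer = root.findtext("customer")  # text content of <customer>
total = invoice_total(INVOICE_XML)
print(customer, total)
```

Because the tags name the meaning of each value, a program can locate the customer or compute a total without knowing anything about how the document would be rendered.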
2.2 Office Open XML - ISO/IEC DIS 29500
Office Open XML (commonly abbreviated as OOXML) is a file format specification for electronic documents such as word processing documents, presentations, charts, spreadsheets, books and reports. Open XML is a draft open standard and can be freely implemented by multiple applications on multiple platforms. The major benefit of publishing this standard is stated to be the provision of a common platform for all organizations developing application software, for the entities using that software, and for the educators and authors who teach the format [3].
The work to standardize Open XML has been carried out by Ecma International via its Technical Committee 45 (TC45), which includes representatives from Apple, Barclays Capital, BP, The British Library, Essilor, Intel, Microsoft, NextPage, Novell, Statoil, Toshiba, and the United States Library of Congress [3].
The Office Open XML format uses a ZIP container for packaging XML and other data files. The main advantage of Open XML is backward compatibility: it supports files created before the Open XML format. It has been declared that the Open XML draft standard meets the European Union definition of an Open Standard.
An Open XML document is a package, similar to a file system, that contains the individual files forming the document. In addition to XML files, the ZIP package can also include binary files in formats such as PNG, BMP, AVI or PDF.
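The packaging idea described above can be sketched with Python's standard `zipfile` module. This is only an illustration of a ZIP container holding XML and binary parts side by side: the part names, their contents, and the helper names `build_package` and `list_parts` are hypothetical, and the resulting archive is not a valid Open XML document, which requires additional parts defined by the specification.

```python
import io
import zipfile


def build_package(parts: dict[str, bytes]) -> bytes:
    """Pack named parts into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in parts.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_parts(package: bytes) -> dict[str, int]:
    """Return each part name in the archive with its uncompressed size."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {info.filename: info.file_size for info in archive.infolist()}


# Hypothetical part names and contents, chosen for illustration only.
example_parts = {
    "content/body.xml": b"<body><p>Hello</p></body>",
    "media/logo.png": b"\x89PNG placeholder bytes",
}
package_bytes = build_package(example_parts)
parts_found = list_parts(package_bytes)
print(parts_found)
```

Any ZIP-aware tool can open such a package and read the XML parts directly, which is one reason a ZIP container suits a format intended to be implemented by many applications.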
Open XML was designed from the start to be capable of faithfully representing the pre-existing corpus of word processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft Corporation. The standardization process consisted of mirroring in XML the capabilities required to represent the existing corpus, extending them, providing detailed documentation, and enabling interoperability. At the time of writing, more than 400 million users generate documents in the binary formats, with estimates exceeding 40 billion documents and billions more being created each year [6].
As markets have diversified, a new range of applications not originally contemplated by document editing programs has been introduced. These new applications include ones that:
It is declared that this draft standard is capable of long-term preservation. In parallel with developments in science and related fields, we have learned to create exponentially increasing amounts of information. That information has been encoded using digital representations that are deeply coupled with the programs that created it. After a decade or two, such documents routinely become extremely difficult to read without significant loss. Preserving the financial, scientific, intellectual and other related investment in those documents (both existing and new) has become a pressing priority [2].
It is declared that there are four main reasons for introducing the Open XML draft standard: extremely broad adoption of the binary formats, market forces that demand diverse applications, technological advances, and the increasing difficulty of long-term preservation. These reasons motivate not only the development of the draft but also the migration of billions of documents to it with as little loss as possible. In addition, standardizing the Open XML format and maintaining it over time creates an environment in which any organization can safely rely on the ongoing stability of the specification, confident that further evolution will enjoy the checks and balances afforded by a standards process [2].
Various document standards and specifications exist; these include HTML, XHTML, PDF and its subsets, ODF, DocBook, DITA, and RTF. Like the numerous standards that represent bitmapped images, including TIFF/IT, TIFF/EP, JPEG 2000, and PNG, each was created for a different set of purposes. Open XML addresses the need for a standard that covers the features represented in the existing document corpus. It is declared that it is the only XML document format that supports every feature in the binary formats [3].
OpenXML defines the formats for word processing, presentation and spreadsheet documents. Each document type is specified by one of the WordprocessingML, PresentationML or SpreadsheetML markup languages.
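The relationship between document type and markup language can be sketched as a simple lookup. The part names below are the conventional main-part names written by common implementations; using them as a lookup key is an assumption of this sketch, since the specification itself locates the main part through package relationships. The function name `guess_markup_language` is hypothetical.

```python
# Conventional main-part names mapped to the markup language that describes
# them. This table is an illustrative assumption, not quoted from the draft.
MAIN_PART_TO_LANGUAGE = {
    "word/document.xml": "WordprocessingML",
    "ppt/presentation.xml": "PresentationML",
    "xl/workbook.xml": "SpreadsheetML",
}


def guess_markup_language(part_names):
    """Return the markup language suggested by the package's part names,
    or None when no known main part is present."""
    for main_part, language in MAIN_PART_TO_LANGUAGE.items():
        if main_part in part_names:
            return language
    return None


print(guess_markup_language(["[Content_Types].xml", "xl/workbook.xml"]))
```

A robust reader would follow the package relationships instead of relying on fixed names; the lookup only illustrates that each document type maps to exactly one of the three languages.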
Some Features of the Open XML Draft Standard
In this section some of the important features of OpenXML are given.
2.3 ODF - ISO/IEC 26300
The Standard “ISO/IEC 26300: Information technology – Open Document Format for Office Applications (OpenDocument) v1.0”, which concerns open formats, was published by ISO in 2006. OpenDocument is short for OASIS Open Document Format for Office Applications and is also known as ODF. ODF is a document file format used to represent notes, reports, books, spreadsheets, charts and word processor files. The Standard was developed by a Technical Committee of the consortium “Organization for the Advancement of Structured Information Standards” (OASIS) and is based on the XML format first developed and implemented by the OpenOffice.org office application suite. The ODF Standard can be freely obtained and used. It therefore satisfies all the fundamental requirements of an open Standard: a software developer can learn the details of the format and develop application software that reads and produces files in it. Applications such as OpenOffice.org 2.0, KOffice 1.5, StarOffice 8 and IBM Workplace use this file format.
ISO/IEC 26300, published by ISO (International Organization for Standardization) on 1 May 2006, established a file format that can be used worldwide to store files produced by office applications. It is also the first Standard in the world in its area. Users are guaranteed that they can access their data now and in the future using an appropriate software package, because any application compatible with the open Standard can use it.
The Standard ISO/IEC 26300:2006 defines XML schemas and semantics for office applications. The schema can be applied to office documents such as text documents, spreadsheets, charts, drawings and presentations, but its area of application is not limited to these examples.
The Standard ISO/IEC 26300:2006 provides high-level information for the organisation of documents. It also defines appropriate XML structures for office documents and is suited to conversions performed with XML-based or similar tools.
The Standard ISO/IEC 26300:2006 first provides introductory information on the OpenDocument format and explains the structure of documents that conform to the OpenDocument specification. It also presents the metadata of these documents, the meta information that can be included in them, and their paragraph and text content.
ISO/IEC 26300:2006 defines the table content of a document in the OpenDocument format, as well as its graphical content, chart content and content format. It also defines content information common to all documents [5].
3. Discussions related to ODF and Open XML
3.1 ODF
The clearest evidence that open document format standards continue to evolve is that they are regularly criticized and that the need to renew and update them is apparent. It is also obvious that criticism of these formats will continue for a long time. Looking at the ODF standard, we see the following basic criticisms:
3.2 Open XML
For a group of documents with different requirements, for which ODF cannot provide a solution, another open document format standard (Open XML) is being prepared. ISO/IEC DIS 29500 is still a draft standard, and there are many ongoing discussions and criticisms related to this work.
Criticism of Open XML has reached a point where technical issues are less important than commercial and political ones. The following are some of the important criticisms of this draft standard:
4. Conclusion
ISO/IEC DIS 29500 is evaluated as POSITIVE for acceptance as a Standard, and a positive country vote is recommended.
¹ www.xml.org, www.w3.org/TR/REC-xml/