GROKLAW
When you want to know more...
Countries' Comments on MS OOXML - How You Can Help
Tuesday, September 11 2007 @ 09:34 AM EDT

I think I see a way we could be really helpful to the ISO folks having to sort through all the 10,000 comments the various countries filed with their votes on MS OOXML.

The comments have been officially published, although as .doc files, sigh. Here's the zip file to download. But I thought I'd make them available to you as HTML also, which is how the members got them, both to make sure everyone has access and because of my idea. I gather someone had to process all the comments to put them into .doc format, so one help would be to make sure nothing was overlooked. Other tasks might be to see that duplicates are noted, and that the comments are sorted into various categories, like technical or not, and then subcategorized, etc. I think that might prove helpful too in making sure everything is addressed.

But the real help is this: Alex Brown has written on his weblog about another sorting he'd find helpful as he tries to get the comments properly sorted:

One curiosity of the ballot results is the degree of skepticism accompanying the votes of approval. Normally an approval vote in an ISO ballot means that the technical content has been approved. However, some of the comments accompanying approval votes look to me like they crave resolution. Indeed, Greece has gone so far as to accompany its approval vote with the following statement:
"If the Ballot Resolution Group fails to resolve satisfactorily the issues, then ELOT will reconsider its position and may cast a vote of disapproval during the BRG meeting(s) according to article 13.8 of the JTC1 directives, or may even appeal to the final adoption of the Standard."

This introduces a complication for the BRM. As convenor, one of my responsibilities is to run the meeting in such a way that it maximises the chances of approving a text. One natural way of doing this is to de-prioritise comments that accompanied an approval vote, on the basis that those countries are already happy with the text. However, for Greece this evidently isn't an accurate assumption and the same may be true of other countries too. I need to find out which...

It occurs to me that this is exactly the kind of task that a computer might be helpful in achieving, so I'm throwing the idea out there in hopes that you guys might figure out a way to help him out. So, with that goal in mind, here is the zip file you can download to view the comments in the original HTML. The zip file is almost 3 MB, and the files are categorized by name of the organization, not by country. By that I mean, France's comments are categorized as AFNOR. So that's another sorting job.

It will also make it possible for the public to follow along as they address all the technical and other issues that blocked approval at the meeting in February and thereafter. Here's the process, explaining what happens in February, so you understand the importance of getting the sorting done well before the meeting begins.


  


Countries' Comments on MS OOXML - How You Can Help | 226 comments
Corrections thread
Authored by: lannet on Tuesday, September 11 2007 @ 09:49 AM EDT
Corrections in the title line please

---
When you want a computer system that works, just choose Linux.
When you want a computer system that works, just, choose Microsoft.


OT thread
Authored by: lannet on Tuesday, September 11 2007 @ 09:51 AM EDT
A summary in the title line please and hyperlinks that read sensibly


Newspick comment thread
Authored by: lannet on Tuesday, September 11 2007 @ 09:52 AM EDT
Please make it clear which News Pick you are addressing


Microsoft Response ? to this suggestion
Authored by: Anonymous on Tuesday, September 11 2007 @ 10:01 AM EDT
I wonder what Microsoft will say about this suggestion?

With these sortings you could detect boilerplate answers and comments
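A minimal sketch of how boilerplate detection could work, assuming the comments have already been extracted as (member body, text) pairs. The sample data and the member body names `AAA`, `BBB`, and `CCC` are hypothetical, and exact matching after normalization is only one possible approach; near-duplicates with reworded sentences would need something fuzzier.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so trivial edits do not hide a copy."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def find_boilerplate(comments):
    """Group (member_body, text) pairs by normalized text; return groups with more than one member body."""
    groups = defaultdict(list)
    for member_body, text in comments:
        digest = hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(member_body)
    return [bodies for bodies in groups.values() if len(bodies) > 1]


# Hypothetical sample data, not taken from the real ballot comments.
sample = [
    ("AAA", "The date format is ambiguous."),
    ("BBB", "the date  format is ambiguous"),
    ("CCC", "Section numbering is inconsistent."),
]
print(find_boilerplate(sample))
```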


Countries' Comments on MS OOXML - How You Can Help
Authored by: Anonymous on Tuesday, September 11 2007 @ 10:09 AM EDT
Interesting that it was released as a .doc file. Isn't that what we are all
trying to change? Ironic.

Why wasn't it released as Micro$oft OOXML, or as an ISO-approved document
standard (OpenDocument)?


Not our problem
Authored by: Anonymous on Tuesday, September 11 2007 @ 10:20 AM EDT
It's ECMA's job along with Microsoft... ISO should just dump the lot back on
them... if they can't resolve it in the time they have left before the next
vote, then the "standard" has to be rejected and removed from the fast
track.

So my stance here is that we should not be helping Microsoft and ECMA to resolve
the comments.

just my tuppence worth...


Let people help organize the information on Groklaw
Authored by: Anonymous on Tuesday, September 11 2007 @ 10:25 AM EDT
It would be great if people's help could be enabled right here at Groklaw,
perhaps in a wiki environment (similar to the BSI but allowing help from the
public). Imagine if Groklaw itself became the premier source of information on
OOXML comments, containing source-text, links, categories, etc.


Ideas for how to go about this.
Authored by: Sean DALY on Tuesday, September 11 2007 @ 11:21 AM EDT
I have reviewed some of the HTML files and most information is in tables.

Of these, most table rows refer to a specific portion of the proposed spec.

Although the table columns were meant to be "normalized", each Member
Body (MB) presented the same information in slightly different ways; for
example, "Part 4, Section 3.17.4" or "Part 4: 3.17.4.", with
embedded newlines and illegal HTML character encodings (byproduct of Word?).
Some lines do refer to more than one section, and would need to be
"deconsolidated" in order to refer to each section.

Several files contain copious footnotes or general comments which would need to
be integrated into the table (cf. AFNOR).

All that said, there are not a huge number of MBs, so I think the manual work of
extracting the tables and normalizing the columns is not excessive.

Useful CLI tools I can think of to do this are htmltidy and gawk.

Each table could be transposed to a normalized table, then all the tables
combined in an XML file, presentable as a simple spreadsheet list.

The CLI toolkit xmlstarlet could be used to query the super-table, by document
section. Or more simply, a report could be generated with document section as
the key and ISO 3166-1 alpha-2 country code (MB) as the secondary key. A first report
could indicate the count of comments by section, indicating where to prioritize
manual inspection.
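The normalization and first report described above could be sketched roughly as follows. This is Python rather than gawk or xmlstarlet, the regular expression only covers the two reference styles quoted above, and the rows and member body names are hypothetical; real files would certainly need more patterns.

```python
import re
from collections import Counter

# Matches "Part 4, Section 3.17.4" as well as "Part 4: 3.17.4." (trailing dot ignored).
REF = re.compile(
    r"Part\s*(\d+)\s*[,:]?\s*(?:Section\s*)?(\d+(?:\.\d+)*)",
    re.IGNORECASE,
)


def normalize_ref(raw):
    """Return a canonical 'Part N, Section X.Y.Z' string, or None if no reference is found."""
    cleaned = " ".join(raw.split())  # removes embedded newlines and repeated spaces
    match = REF.search(cleaned)
    if match is None:
        return None
    return f"Part {match.group(1)}, Section {match.group(2)}"


def count_by_section(rows):
    """Count comments per normalized section, most commented sections first."""
    counts = Counter()
    for _member_body, raw_ref in rows:
        ref = normalize_ref(raw_ref)
        if ref is not None:
            counts[ref] += 1
    return counts.most_common()


# Hypothetical rows: (member body, reference exactly as it appeared in the table).
rows = [
    ("AAA", "Part 4, Section 3.17.4"),
    ("BBB", "Part 4: 3.17.4."),
    ("CCC", "Part 1,\nSection 2.3"),
]
for section, count in count_by_section(rows):
    print(count, section)
```

Rows that refer to more than one section would still have to be split ("deconsolidated") before being passed in; this sketch only picks up the first reference it finds in each row.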


Member Organizations associated by country
Authored by: mdarmistead on Tuesday, September 11 2007 @ 12:18 PM EDT
Countries, organizations, and votes, for votes with comments
Sorted by organization
Extracted from: ISO Members

ABNT - BRAZIL - Disapproval
AENOR - SPAIN - Abstain with Comments
AFNOR - FRANCE - Disapproval
ANSI - UNITED STATES - Approval with Comments
BDS - BULGARIA - Approval with Comments
BIS - INDIA - Disapproval
BPS - PHILIPPINES - Disapproval
BSI - GREAT BRITAIN - Disapproval
CNI - CZECH REPUBLIC - Disapproval
DGN - MEXICO - Abstain with Comments
DIN - GERMANY - Approval with Comments
DS - DENMARK - Disapproval
DSM - MALAYSIA - Abstain with Comments
ELOT - GREECE - Approval with Comments
FONDONORMA - VENEZUELA - Approval with Comments
GSB - GHANA - Approval with Comments
ICONTEC - COLOMBIA - Approval with Comments
INDECOPI - PERU - Abstain with Comments
INEN - ECUADOR - Disapproval
INN - CHILE - Abstain with Comments
INNORPI - TUNISIA - Approval with Comments
IPQ - PORTUGAL - Approval with Comments
IRAM - ARGENTINA - Abstain with Comments
ISIRI - IRAN - Disapproval
JISC - JAPAN - Disapproval
JISM - JORDAN - Approval with Comments
KATS - KOREA, REPUBLIC OF - Disapproval
KEBS - KENYA - Approval with Comments
MSA - MALTA - Approval with Comments
NBN - BELGIUM - Abstain with Comments
NSAI - IRELAND - Disapproval
ON - AUSTRIA - Approval with Comments
PKN - POLAND - Approval with Comments
SA - AUSTRALIA - Abstain with Comments
SABS - SOUTH AFRICA - Disapproval
SAC - CHINA - Disapproval
SCC - CANADA - Disapproval
SFS - FINLAND - Abstain with Comments
SII - ISRAEL - Abstain with Comments
SN - NORWAY - Disapproval
SNV - SWITZERLAND - Approval with Comments
SNZ - NEW ZEALAND - Disapproval
SPRING SG - SINGAPORE - Approval with Comments
TISI - THAILAND - Disapproval
TSE - TURKEY - Approval with Comments
UNI - ITALY - Abstain with Comments
UNIT - URUGUAY - Approval with Comments
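
A list in this "ORGANIZATION - COUNTRY - Vote" shape is easy to load into a program, which helps with the sorting job of mapping file names such as AFNOR back to countries and with pulling out the approvals that came with comments. A minimal sketch, assuming the three fields are always separated by " - "; the function names are made up for illustration and only four of the lines above are used as sample input.

```python
from collections import defaultdict


def parse_votes(lines):
    """Map organization -> (country, vote) from 'ORG - COUNTRY - Vote' lines."""
    votes = {}
    for line in lines:
        parts = [part.strip() for part in line.split(" - ")]
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        organization, country, vote = parts
        votes[organization] = (country, vote)
    return votes


def group_by_vote(votes):
    """Map vote -> list of countries, in the order they were listed."""
    groups = defaultdict(list)
    for _organization, (country, vote) in votes.items():
        groups[vote].append(country)
    return dict(groups)


listing = """\
AFNOR - FRANCE - Disapproval
ELOT - GREECE - Approval with Comments
DIN - GERMANY - Approval with Comments
KATS - KOREA, REPUBLIC OF - Disapproval
"""

votes = parse_votes(listing.splitlines())
print(votes["AFNOR"][0])
print(group_by_vote(votes)["Approval with Comments"])
```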


OpenISO.org
Authored by: nb on Tuesday, September 11 2007 @ 01:58 PM EDT
Another initiative which addresses OOXML (besides also aiming to build up a truly open international standardization organization) is OpenISO.org.

What OpenISO.org intends to produce in regard to OOXML is a report that will be titled "OI PR-F29500 OpenISO.org Problem Report about OOXML". This will list and explain only the serious "show-stopper" issues about OOXML.

I honestly believe that Microsoft is extremely unlikely to ever fix the big issues that determine the difference between an honest effort at developing a good standard (which truly everyone can use) on one hand, and an anti-competitive weapon for the standards war on the other hand. Helping Microsoft with improving OOXML in regard to the technical details (without insisting that the big issues should be addressed first) is only going to help Microsoft turn OOXML into a more effective anti-competitive weapon.


No subsequent P last-minute-upgrades??
Authored by: Anonymous on Tuesday, September 11 2007 @ 02:11 PM EDT
from the link PJ posted:
"P-member and O-member status

"Since ballot resolution is an extension of an existing ballot in which
countries have voted with a certain status, for the purposes of the BRM P-Member
and O-member ISO status is counted as at the close of the five-month ballot on 2
September i.e., any subsequent status changes are discounted for voting
purposes."

Is this so?

nachokb


Countries' Comments on MS OOXML - How You Can Help
Authored by: kyfung on Tuesday, September 11 2007 @ 02:57 PM EDT
I went through a few documents. It appears to me that the Word documents are the
original. The HTML files seem to be generated from the Word files by something
(LiveLink?). For instance, the AFNOR submission has a nice diagram in the Word
file, but in the HTML version, it is totally screwed up.

I wrote a little Java program that extracts out the content of the table in the
HTML files, and converts the content into a simple XML document. It seems to
work okay for the files that I checked.

I have no idea how to handle the preamble, the footnotes, the references,
annexes. What should be done with them?

I am going to try to make sense of the references now; I wonder how many
different grammars I will need to write.
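
For anyone who wants to try the same table extraction without Java, here is a rough Python sketch using only the standard library. It is not the program described above, just an illustration of the idea: the class name, the `comments`/`comment`/`col` element names, and the `XX` member body are all hypothetical, and it ignores nested tables, preambles, and footnotes.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TableExtractor(HTMLParser):
    """Collect the text of every table cell, one list of strings per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse embedded newlines and runs of spaces inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_to_xml(member_body, rows):
    """Serialize the rows as a simple XML document, one <comment> per table row."""
    root = ET.Element("comments", {"mb": member_body})
    for row in rows:
        comment = ET.SubElement(root, "comment")
        for index, text in enumerate(row):
            ET.SubElement(comment, "col", {"n": str(index)}).text = text
    return ET.tostring(root, encoding="unicode")


# Hypothetical input; the real files are much messier.
html = "<table><tr><td>Part 4,\n3.17.4</td><td>Clarify the date format</td></tr></table>"
extractor = TableExtractor()
extractor.feed(html)
print(rows_to_xml("XX", extractor.rows))
```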


US Foreign Policy (sanctions)
Authored by: Anonymous on Tuesday, September 11 2007 @ 03:23 PM EDT
ISO members (Iran) that are under sanctions of the US Gov.
cannot implement any standard that requires a patent
deal with an American company (Microsoft).
ISO must be seen as independent of US foreign policy.


html version here - Countries' Comments on MS OOXML - How You Can Help
Authored by: Anonymous on Tuesday, September 11 2007 @ 10:34 PM EDT
You can find an .html version of the comments here:

http://www.iserv.net/~gporos/ooxml_comments/comment_list.html

I will update the time and date stamp as I go along.


Document Index by Country Code
Authored by: Anonymous on Wednesday, September 12 2007 @ 12:00 AM EDT
The following is an index of the MS-Word documents by country code. So, for example if you are looking for the Canadian comments, they are found in J1N8726-38.doc

Country Document

  • AR - 24
  • AT - 33
  • AU - 35
  • BE - 31
  • BG - 5
  • BR - 1
  • CA - 38
  • CH - 42
  • CL - 21
  • CN - 37
  • CO - 18
  • CZ - 9
  • DE - 11
  • DK - 12
  • EC - 20
  • ECMA - 14
  • ES - 2
  • FI - 39
  • FR - 3
  • GB - 8
  • GH - 17
  • GR - 15
  • IE - 32
  • IL - 40
  • IN - 6
  • IR - 25
  • IT - 47
  • JO - 27
  • JP - 26
  • KE - 29
  • KR - 28
  • MT - 30
  • MX - 10
  • NO - 41
  • NZ - 43
  • PE - 19
  • PH - 7
  • PL - 34
  • PT - 23
  • SG - 44
  • TH - 45
  • TN - 22
  • TR - 46
  • US - 4
  • UY - 48
  • VE - 16
  • ZA - 36
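
If you are scripting against the files, the index can be turned into a small lookup. This is only a sketch: the table below holds four of the entries listed above, and the assumption that every file follows the J1N8726-NN.doc pattern with no zero padding is a guess based on the single Canadian example, so check it against the actual names in the zip file.

```python
# Subset of the index above: country code -> document number.
DOCUMENT_NUMBER = {"CA": 38, "FR": 3, "GR": 15, "US": 4}


def document_for(country_code):
    """Return the assumed Word file name for a country code (naming pattern is an assumption)."""
    number = DOCUMENT_NUMBER[country_code.upper()]
    return f"J1N8726-{number}.doc"


print(document_for("ca"))
```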


computers, comments and context
Authored by: Anonymous on Wednesday, September 12 2007 @ 08:57 AM EDT
"It occurs to me that this is exactly the kind of task that a computer might be helpful in achieving,"

Computers don't understand context. Humans will need to read the comments and sort them.

If you want to live dangerously, you could enable rating/voting (e.g., a 1 to 10 scale, 1 being a negative comment, 10 being positive) for each comment and have a computer tally the results. This might give you a general consensus on whether a comment is positive or negative, and then a computer could sort the comments by the numerical value humans assigned to them.

Think group image tagging but for comments and/or Slashdot's comment moderation system.
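
The tallying part is the easy bit for the computer. A minimal sketch, assuming the human ratings have been collected as a list of 1 to 10 scores per comment; the comment identifiers and scores are made up for illustration.

```python
from statistics import mean


def rank_comments(ratings):
    """Return (comment_id, average rating) pairs, most negative comments first."""
    averages = {
        comment_id: mean(scores)
        for comment_id, scores in ratings.items()
        if scores  # comments nobody rated yet are left out
    }
    return sorted(averages.items(), key=lambda item: item[1])


# Hypothetical ratings on the 1 (negative) to 10 (positive) scale.
ratings = {
    "comment-1": [2, 3, 1],
    "comment-2": [9, 8],
    "comment-3": [5],
    "comment-4": [],
}
for comment_id, average in rank_comments(ratings):
    print(comment_id, average)
```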


First version available
Authored by: kyfung on Wednesday, September 12 2007 @ 06:00 PM EDT
Okay, I got the first version of a website ready for use. No question about it,
it has tons of problems. So, try it and let me know what is wrong with it.

This is what I have done so far:
1. Parsed the comments into XML.
2. Stored the comments into a database.
3. Provided three simple pages; one is the home page, one to query the
countries' comments, and one to search sections that I could handle so far.

Tons to do:
1. Clean up the comments one country at a time.
2. Provide a reference editing tool to input the right section references into
the database. This is going to take a lot of manual time.
3. Provide a search engine.
4. Provide editing tools to edit the text from the website. Of course, the
original text will always be kept, so that a comparison can be done, always.
5. A reporting tool to report according to whatever criteria you have in mind.
(Oh yeah, this is the goal of the whole thing)
6. Clean up the code.

Let me know what features you want to have.

Anyhow. I am done for today. I will continue tomorrow.

The URL is: http://www.khunyeefung.com/dis29500. If people actually use the
site, I will buy a domain name for it. I will save the $10 for now.

When the thing is in a better shape, the code will be released under GPL,
assuming somebody is interested in getting the code.
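
For readers curious what the store-and-query steps might look like, here is a small sketch using SQLite from Python. It is not the code behind the site above; the table layout, column names, function names, and sample records are all assumptions made for illustration. The `edited` column reflects the idea of keeping the original text next to any cleaned-up version.

```python
import sqlite3


def build_database(records):
    """Create an in-memory database and load (mb, section, original text) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments (mb TEXT, section TEXT, original TEXT, edited TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments (mb, section, original) VALUES (?, ?, ?)", records
    )
    conn.commit()
    return conn


def comments_for(conn, mb):
    """All comments filed by one member body, ordered by section."""
    cursor = conn.execute(
        "SELECT section, original FROM comments WHERE mb = ? ORDER BY section", (mb,)
    )
    return cursor.fetchall()


def section_report(conn):
    """Number of comments per section, busiest sections first."""
    cursor = conn.execute(
        "SELECT section, COUNT(*) FROM comments "
        "GROUP BY section ORDER BY COUNT(*) DESC, section"
    )
    return cursor.fetchall()


# Hypothetical records, not real ballot comments.
records = [
    ("AAA", "Part 4, Section 3.17.4", "Date handling is unclear."),
    ("BBB", "Part 4, Section 3.17.4", "Date handling needs a definition."),
    ("AAA", "Part 1, Section 2.3", "Conformance clause is vague."),
]
conn = build_database(records)
print(comments_for(conn, "AAA"))
print(section_report(conn))
```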


OOXML brittle for business reasons?
Authored by: Jose on Friday, September 21 2007 @ 08:43 PM EDT
Quoting from the 14th comment at
http://www.linuxtoday.com/infrastructure/2007091804226OPMSSW (dated Sept 22).

************
Consider a situation where you are following instructions to get to a location.
If you miss the part about turning left, everything else is thrown off, and you
might just end up looking for penguins on the wrong side of the Equator.

XML, no matter what it contains, can lead to problems of this nature (e.g., if you
miss a tag or an attribute); however, XML can be fairly robust in many cases
because there is some amount of redundancy. For example, a misspelled tag can
usually be guessed/matched exactly because of the begin/end pair sequences and
because XML usually uses long English words with redundancy. The more serious
problems occur if a lot of the data in the stream is brittle. It seems that
OOXML has this brittle data if it is based on actions occurring to data instead
of just presenting the final data in all its glory. This hypothesis should be
tested out more precisely by putting OOXML up against ODF and introducing
various sources of errors into different samples of different types of files.
Also, the fact that OOXML uses shorter element names (I think) weakens some of the
protection I mentioned earlier that a typical XML file might otherwise possess.

If OOXML is in fact potentially much more brittle, no amount of Microsoft
marketing will fix this issue. OOXML, if designed this way, may have sealed its
fate.

My guess is that the Microsoft devs probably made a business decision to follow
this format as a way to make it more difficult on competitors... How ironic now
that Microsoft is the weaker, more resource-challenged competitor compared to the
FOSS community. Also, this may make it easier for Microsoft to hide secret
manipulations within Excel since (if) poking into the file won't give you the
final state of the data. Any small change throughout the stream can completely
change the meaning of everything in a way only Microsoft would know about
initially and which may make for a very different end result only Excel can
reproduce (note the penguin example above). It makes it less likely that the
format will be reverse engineered and tricks discovered. So the key here is that
there can be strange (undocumented... even in OOXML) ways for data from distinct
and far removed places of the XML file to interact. There is a lot that XML
doesn't define or prohibit. The clincher though (since Microsoft can do this to
ODF [files it wrongly interprets/produces based on "misunderstandings"
of the spec]) is that the final state of the data is never even almost seen [I
am assuming this point [...]]. Microsoft's tricks won't just affect a word or
markup here or there, but may mean that a very different file is presented that
cannot be *reconstructed* at all without the key ingredients/instructions.

This brings up another analogy. Imagine looking at two sets of blueprints for a
building. These blueprints have blemishes on them. The first set is of the final
state of [the] building (internal and external). The other set is of the blocks,
steel, 2x4's, carpeting, and other building components that compose the building
plus the assembly instructions. If the blueprints have problems and if the
building is diverse enough, which of the two sets would you say is more
likely to lead meticulous architects and engineers to come up with an accurate
reconstruction? In fact, small blemishes on key portions of the assembly
instructions can seal the fate of any hope for a reconstruction relying on this
second set of blueprints.

Microsoft may have built their format to maximize the ability to throw
competitors off their trail but sacrificed robustness to get there. This would
mean that OOXML would have no place as an open format, as it would be technically
flawed, with its strengths in an area that would be made irrelevant if the
standard was truly intended to be and remain open.

I hope someone is following up on this sort of hypothesis using real tests to
get actual data and statistics. If this has been done already (I haven't read
the ISO comments), then I hope this sort of technical issue is communicated (in
layman's language) to stakeholders.
************



Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )