decoration decoration
Stories

GROKLAW
When you want to know more...
decoration
For layout only
Home
Archives
Site Map
Search
About Groklaw
Awards
Legal Research
Timelines
ApplevSamsung
ApplevSamsung p.2
ArchiveExplorer
Autozone
Bilski
Cases
Cast: Lawyers
Comes v. MS
Contracts/Documents
Courts
DRM
Gordon v MS
GPL
Grokdoc
HTML How To
IPI v RH
IV v. Google
Legal Docs
Lodsys
MS Litigations
MSvB&N
News Picks
Novell v. MS
Novell-MS Deal
ODF/OOXML
OOXML Appeals
OraclevGoogle
Patents
ProjectMonterey
Psystar
Quote Database
Red Hat v SCO
Salus Book
SCEA v Hotz
SCO Appeals
SCO Bankruptcy
SCO Financials
SCO Overview
SCO v IBM
SCO v Novell
SCO:Soup2Nuts
SCOsource
Sean Daly
Software Patents
Switch to Linux
Transcripts
Unix Books
Your contributions keep Groklaw going.
To donate to Groklaw 2.0:

Groklaw Gear

Click here to send an email to the editor of this weblog.


To read comments to this article, go here
Does MS Office SP2 With ODF Support Really Work? Test Results Point to No. - Updated 3Xs
Monday, May 04 2009 @ 12:14 AM EDT

I tried the updated Microsoft Office 2007 SP2, which supports ODF, or says it does. I created a document in Office 2007 SP2 and saved it as ODF. I got an ominous Microsoft warning that if I persisted, I might lose some formatting -- "Document [name] may contain features that are not compatible with this format. Do you want to continue to save in this format?" -- but it saved the document when I clicked Yes. I reasoned that OpenOffice, which I intended to use to test the result, does have the features I wanted. I had included one footnote, a photo, and a text block, all of which OpenOffice can do, but when I opened the saved document in OpenOffice, none of it looked right. You couldn't read the footnote at all, because it's cut horizontally in the middle of the text. You can see it's there, but you can't make out the words.

I thought most of the problems, and there were others, might be my fault though, because I've never used Office 2007 before, since I don't own it, and I found it very confusing. Because I don't own Office 2007, and I had limited access time to test on someone else's, I looked around to see if anyone else was reporting results in the new SP2. I asked Groklaw members if they had tried it out yet and how it worked for them. A Groklaw member, Dobbo, did a test working on a spreadsheet with a client, and his experience was also a failure.

It turns out Rob Weir also did some tests of spreadsheets, and it's not just me and Dobbo, and it's not just Word documents. If you are a government trying to decide if now's the time to adopt OOXML, because ODF is now allegedly supported, you probably want to read his Update on ODF Spreadsheet Interoperability. He compares all the options, which he tested a couple of months ago, OpenOffice, Google Docs, Clever Age, Sun's plugin, KSpread and Symphony, adds in the new Office 2007 SP2, and makes the comparison. I think it's fair to say that SP2 does worse than any of the others, or at best they fail equally in some categories. Clever Age actually did better on the tests than Office 2007 SP2. Why? What could possibly explain that?

Why would Microsoft be worse than the others? Here's a bit on that from Weir's article:

The new entry to the mix in this round is Office 2007 SP2, which has added integrated ODF support. Unfortunately this support did not fare well in my tests. The problem appears to be how it treats spreadsheet formulas in ODF documents. When reading an ODF document, Excel SP2 silently strips out formulas. What is left is the last value that cell had, when previously saved.

This can cause subtle and not so subtle errors and data loss. For example, in the test document I presented above, the current date is encoded using the TODAY() spreadsheet function. If the formulas are stripped, then this cell no longer updates and will return the wrong value. Similarly, if Maya tries to continue her ledger of expenses by copying the formula cells from column E down a row, this will cause incorrect answers, since there is no longer a formula there. In general, SP2 converts a spreadsheet into a "table of numbers" and any calculation dependencies are lost.

In the other direction, when writing out spreadsheets in ODF format, Excel 2007 SP2 does include spreadsheet formulas but places them into an Excel namespace. Not every ODF application checks the namespace of formulas, but the ones that do reject the SP2 document altogether. And the ones that do not check the namespace try and fail to load a formula that is syntactically different than what they expected and they essentially display a corrupted document.

What he found is that if you create a spreadsheet in Microsoft Office 2007 SP2, it won't render properly in Google Docs, OpenOffice, KSpread, Symphony, or with the Sun Plugin. If you create the document in OpenOffice and then try to read it in Microsoft Office 2007 SP2, you can't do that either. He found that spreadsheet interoperability is actually worse now than when he last tested:
I must admit that I'm disappointed by these results. This is not a step forward. This is a step backwards compared to where we were two months ago. Spreadsheet interoperability is not hard. This is not rocket science. Everyone knows what TODAY() means. Everyone knows what =A1+A2 means. To get this wrong requires more effort than getting it right. It is especially frustrating when we know that the underlying applications support the same fundamental formula language, or something very close to it, and are tripped up by lack of namespace coordination. Whether it is accidental or intentional I don't know or care. But I cannot fail to notice that the same application -- Microsoft Excel 2007 -- will process ODF spreadsheet documents without problems when loaded via the Sun or CleverAge plugins, but will miserably fail when using the "improved" integrated code in Office 2007 SP2.
If you or they say, "It's ODF's fault," why, then, is everyone else able to make it work better than Microsoft, which is what Weir found? They funded Clever Age, after all, didn't they? So why would Clever Age work better than Microsoft's support for ODF in Office? It makes no sense. I remember the tricks to make sure DR DOS and WordPerfect didn't work quite right. Is this something similar? Incompetence? What? In any case, Weir tells them how to fix it, if it's that. But here's what I learned: you can be conformant to a standard and *not* interoperable at the same time. Maybe someday an EU Commission investigation can figure out if it's deliberate or not. Meanwhile, we still can't interoperate with Microsoft Office 2007 SP2. So if you are going to send me a document, until Microsoft offers actual interoperability, please don't use that to create your document or I might not get your complete message. Drop down to saving as a .doc or the other older versions of Excel or whatever for my sake and for the sake of others who don't use Microsoft software.

Here's Dobbo's experience:

Microsoft Office 2007 SP2 OOo Compatability via ODF
Authored by: dobbo on Saturday, May 02 2009 @ 09:11 AM EDT

In a past Groklaw article PJ was asking about interoperability between Microsoft's Office suite and OpenOffice now that MS-Office supports ODF. This is also of interest to me as I am current working on a project that has to process data coming from an Excel spreadsheet, and an ODF dump from Excel might be advantageous to me.

As my client uses Excel I pointed him at the upgrade which he downloaded and installed. His current Excel spreadsheet is a masterpiece of complexity: hidden sheets and cells, lots of validation macros and so on and so forth. But I wondered just how interoperable with OOo it would be.

It wasn't!

For a start it doesn't look the same, or so I am told. I don't have Microsoft here, just Linux so I have to use OOo to view the spreadsheet. But my client reported that OOo running on his computer didn't render the sheets the same.

But the bigest problem (for me at least) was the fact the the macros didn't work. Here is a simple example:

Output from MS-Office: =COUNTIF(A23:A922,"S")+COUNTIF(A25:A924,"SS")
The problem is the Microsoft has used commas as argument separators and OOo uses semicolons.

Actually, as Weir shows, it's a much bigger problem, and not just for OpenOffice. And the patch must come from Microsoft. Clearly they could make spreadsheet interoperability work if they wanted to, since others already are.

Dear Microsoft, could you please do something about this? It's just code, which means it can be fixed. But your code is proprietary, so we can't fix it. Only you can. Like the old song says, could you please put on some speed? Others like Google Docs seem to be able to do ODF spreadsheets. Why can't you? No doubt there will be improvements, but when?

Meanwhile, folks, don't be fooled into thinking that life in the OOXML-ODF interoperability universe is now suddenly a breeze. It's not.

Here's the result of my first mangled attempt to open a Microsoft 2007 SP2 document, saved as .ODT and opened in OpenOffice, and yes, I'm blushing:

I feel just like that squirrel when trying to interoperate with Microsoft. And while I openly confess that I didn't know what I was doing in Office 2007, the footnote did work, up to a point, as far as my input was concerned. The cutoff problem isn't from me, after all. I also don't see any number, and it didn't show in the main text that there was a footnote, probably because some of the text disappeared, maybe under the photo, but all of that could be me not being an expert. But you know what? I'm pretty sure that I am no more stupid than any other normal user out there trying to make things work. It just doesn't seem to work.

[ Update: Here's a quote from Microsoft's press release a year ago in May, when it announced SP2:

Microsoft recognizes that customers care most about real-world interoperability in the marketplace, so the company is committed to continuing to engage the IT community to achieve that goal when it comes to document format standards. It will work with the Interoperability Executive Customer Council and other customers to identify the areas where document format interoperability matters most, and then collaborate with other vendors to achieve interoperability between their implementations of the formats that customers are using today.
Promises, promises. Real-world interoperability. In my lifetime, you think? - End update.]

[Update 2: Microsoft's Doug Mahue tells us the meaning of the warning message when you save as .odt like this:

Iíll get a message warning me that my document may contain features that arenít compatible with this format, because ODF canít represent 100% of the things we can do in Word.
Heh heh. And evidently vice versa. - End update 2.]

Update 3, Wednesday May 6: Unbelievably, Microsoft's Gray Knoulton has posted a response, Rethinking ODF Leadership, suggesting that Weir be replaced as co-Chair of the ODF Technical Committee because of his article:

I'm not saying Microsoft (or anyone) should be the chair instead, but I am saying that Rob is unfit as a leader given his inability to separate his personal venom from his role as a leader in driving the standard forward. It seems like a better approach to empower people on the ODF TC who have a long-term view of the need to enable interoperability, and to move those with more short-term vendor-oriented agendas to the side.
And so the Microsoft effort to control ODF continues. Unbelievable reaction. Why don't they just fix their software? Comments here, including Weir's response. - End update.]

The chart is far more vivid on Weir's blog than here, because he makes the entire cell red, not just the word 'Fail'. This will certainly give you the big picture, though, and I want a permanent record of this information in Groklaw's archives.

****************************

Update on ODF Spreadsheet Interoperability

~ Rob Weir, An Antic Disposition

A couple of months ago I did a blog post about interoperability of ODF spreadsheets, the theory and practice of it. In that post I looked at the then current ODF implementations, including:

  1. Microsoft Office 2003 with the Microsoft-sponsored CleverAge Add-in version 2.5
  2. Google Spreadsheets
  3. KOffice's KSpread 1.6.3
  4. Lotus Symphony 1.1
  5. OpenOffice 2.4
  6. Microsoft Office 2003 with Sun's ODF Plugin
I showed what worked, what didn't and made some suggestions on how this could be improved. Essentially, I found only two notable failures, when the CleverAge Add-in for Excel loaded KSpread and Symphony documents. The other scenarios I tested were OK:



Created In






CleverAge
Google
KSpread
Symphony
OpenOffice
Sun Plugin

Read In


CleverAgeOK
OK
Fail
Fail
OK
OK

GoogleOK
OK
OK
OK
OK
OK

KSpreadOK
OK
OK
OK
OK
OK


SymphonyOK
OK
OK
OK
OK
OK

OpenOfficeOK
OK
OK
OK
OK
OK

Sun PluginOK
OK
OK
OK
OK
OK



I lot has happened in the past two months. Since I did that original analysis, we've seen the following applications updated:

  • CleverAge has released version 3.0 of their Add-in for Excel.
  • OpenOffice 3.01 is now out and in wide use
  • Symphony 1.3 is in beta
  • The Sun Plugin is now at version 3.0
  • Microsoft Office 2007 SP2 has been released, with integrated ODF support
  • KOffice 2.0 RC 1 is now out
I haven't been able to get the release candidate of KOffice installed, so I'm still including KSpread 1.6.3 in my tests, but for the rest I have created new test files in each editing environment, saved them to ODF format and then loaded the resulting documents into each of the other editors. So 42 different test scenarios. First I'll tell what I tested, then give you the table of results, and finally some observations and recommendations.

The test scenario I used was a simple wedding planner for a Maya, who is getting married on August 15th. She wants to track how many days are left until the wedding, as well as track a simple ledger. Nothing complicated here. I created this spreadsheet from scratch in each of the editors, by performing the following steps:

  • Enter the title in A1 "May's Wedding Planner" and increased font size to 14 point.
  • Enter formula = today() in A3 and set US style date format if necessary
  • Enter the date of the wedding as a constant in cell A4, also setting date format if necessary.
  • Simple calculations on cells A6-A8, to calculate days, weeks and months until the wedding
  • A11 through E16 is a simple ledger of the kind that is done thousands of times a day by spreasheet users everywhere. Once you have the formula set up in column E (Balance= previous balance + credits - debits) then you can add more rows and simply copy down the formula to the new row.
The resulting spreadsheet looks something like this:




Feel free to download a zip of all of the spreadsheet files. The file names should be self-explanatory.

Here is what I found when I tested the various scenarios:



Created In







Google
KSpread
Symphony
OpenOffice
Sun Plugin
CleverAge
MS Office 2007 SP2

Read In


GoogleOK
OK
OK
OK
Fail
OK
Fail

KSpreadOK
OK
OK
Fail
Fail
OK
Fail

SymphonyOK
OK
OK
OK
OK
Fail
Fail


OpenOfficeOK
OK
OK
OK
OK
OK
Fail

Sun Plugin
OK
OK
OK
OK
OK
OK
Fail

CleverAge Plugin
OK
OK
OK
OK
Fail
OK
OK

MS Office 2007 SP2
Fail
Fail
Fail
Fail
Fail
Fail
OK


So what is happening here?

CleverAge appears to have heeded the advice from my earlier blog post and now correctly processes KSpread and Symphony spreadsheets. This is great news and they deserve credit for that work. But this is a small bit of good news in a table that now shows awful lot of red. Let's see if we can figure this out.

First, some combinations that worked previously, when I tested two months ago, are now not working.

  • Symphony 1.3 beta hangs when loading the spreadsheet created with the CleverAge 3.0 ODF Add-in. Symphony 1.1 also hangs when loading that same spreadsheet, but both versions of Symphony work fine when loading the CleverAge 2.5 spreadsheet . The CleverAge document appears to be valid, so my guess is this is a bug in the Symphony 1.3 beta. I'll pass this document on to the Symphony development team.
  • KSpread 1.6.3 fails to process OpenOffice 3.01 documents. KSpread had no problems with OO 2.4 documents. The problem appears to be that OpenOffice 3.01, by default, writes out documents according to the ODF 1.2 draft which puts formulas in the OpenFormula namespace. But KSpread is expecting them in the legacy namespace. So formulas are dropped.
  • In a similar way, Sun's new ODF Plugin writes out documents according to the ODF 1.2 draft. KOffice is unable to handle these files. This also causes problems for Google Spreadsheets and the Microsoft/CleverAge Plugin for Excel, which report errors "We were unable to upload this document" and "The converter failed to open this file".
The new entry to the mix in this round is Office 2007 SP2, which has added integrated ODF support. Unfortunately this support did not fare well in my tests. The problem appears to be how it treats spreadsheet formulas in ODF documents. When reading an ODF document, Excel SP2 silently strips out formulas. What is left is the last value that cell had, when previously saved.

This can cause subtle and not so subtle errors and data loss. For example, in the test document I presented above, the current date is encoded using the TODAY() spreadsheet function. If the formulas are stripped, then this cell no longer updates and will return the wrong value. Similarly, if Maya tries to continue her ledger of expenses by copying the formula cells from column E down a row, this will cause incorrect answers, since there is no longer a formula there. In general, SP2 converts a spreadsheet into a "table of numbers" and any calculation dependencies are lost.

In the other direction, when writing out spreadsheets in ODF format, Excel 2007 SP2 does include spreadsheet formulas but places them into an Excel namespace. Not every ODF application checks the namespace of formulas, but the ones that do reject the SP2 document altogether. And the ones that do not check the namespace try and fail to load a formula that is syntactically different than what they expected and they essentially display a corrupted document. For example, a SP2 document, loaded in MS Office using the Sun Plugin looks like this:




Similar corruption occurs when loading the Excel 2007 SP2 spreadsheet into KSpread, Symphony and OpenOffice. Google doesn't import the document at all.

I must admit that I'm disappointed by these results. This is not a step forward. This is a step backwards compared to where we were two months ago. Spreadsheet interoperability is not hard. This is not rocket science. Everyone knows what TODAY() means. Everyone knows what =A1+A2 means. To get this wrong requires more effort than getting it right. It is especially frustrating when we know that the underlying applications support the same fundamental formula language, or something very close to it, and are tripped up by lack of namespace coordination. Whether it is accidental or intentional I don't know or care. But I cannot fail to notice that the same application -- Microsoft Excel 2007 -- will process ODF spreadsheet documents without problems when loaded via the Sun or CleverAge plugins, but will miserably fail when using the "improved" integrated code in Office 2007 SP2. This ain't right.

There would be a lot less red on the above table if two simple changes were made:
  1. Sun should write out formulas in ODF 1.1 format, using the "oooc" namespace prefix that the other vendors are using. Remember, the other vendors are using that namespace specifically for compatibility with OO's ODF documents. This is the current convention. To unilaterally switch, without notice or coordination, to a new namespace, is not cool. When ODF 1.2 is an approved standard, then we all can go there in a coordinated fashion, to cause users minimal inconvenience. But the above table clearly shows the confusion that results if this move is not coordinated. I know OO 3.01 has an option to save in ODF 1.0/1.1 format. But this should setting should be the default, IMHO. I'm not sure if the Sun Plugin has a similar configuration option.
  2. In addition to writing out compatible formulas as per the above comments on the Sub Plugin, Microsoft should remove the code in SP2 that causes it to reject every other vendor's spreadsheet documents. It looks very bad to cause this much data loss and their intentions in this area could easily be misinterpreted. Give the user a warning if you need to, but let them have the choice.
Finally, let me try to anticipate some of the counter-arguments which might be raised to argue against interoperability, and debunk them.

First, we might hear that ODF 1.1 does not define spreadsheet formulas and therefore it is not necessary for one vendor to use the same formula language that other vendors use. This is certainly is true if your sole goal is to claim conformance. If your business model requires only conformance and no attempt at achieving interoperability, then I wish you well. But remember that conformance and interoperability are not mutually exclusive options. You can be conformant to the standard and also be interoperable, if you use the legacy formula namespace and syntax. So this isn't a question about conformance at all. It is a question about interoperability and whether you do what is necessary for that as well.

We might also hear concerns that supporting other vendors' ODF spreadsheet formulas cannot be done because this formula language is undocumented. Of course, the irony here is that the formula language used by OpenOffice (and by other vendors) is based on that used by Excel, which itself was not fully documented when OpenOffice implemented it. So an argument, by Microsoft, not to support that language because it is not documented is rather hypocritical. Also, the fact that the Microsoft/CleverAge add-in correctly reads and writes that legacy syntax shows not only that it can be done, but that Microsoft already has the code to do it. The inexplicable thing is why that code never made it into SP2.

We'll probably also hear that 100% compatibility with legacy documents is critical to Microsoft users and that it is dangerous to try to write Excel documents into compatible ODF formulas because there is no guarantees that OpenOffice will interpret them correctly and give the right answer. But we should note that fully-licensed Microsoft Office users have already been creating legacy documents in ODF format, using the Microsoft/CleverAge ODF Add-in. These paying users of Microsoft Office will now see their existing investment in ODF documents, created using Microsoft-sanctioned code, get corrupted when loaded in Excel 2007 SP2. That is the shocking thing here, the way in which users of the ODF Add-in are being sacrificed. Why are paying Microsoft customers who used ODF less important than Microsoft customers who used OOXML?

If you are cynical, you might observe that if SP2 allowed Microsoft/CleverAge ODF Add-in formulas to work correctly, then SP2 would need to allow all vendor's formulas to work, since the other vendors are using the same namespace. The only way for Microsoft to make their legacy ODF documents work and to exclude other vendors would be to specifically look in the document for the name of the application that created the document, and allow their ODF Add-in but reject OpenOffice, etc. IANAL, but I think you would agree that something like that would look very, very bad to competition authorities. So the only way out, if your goal is to hide from interoperability, is to sacrifice your Office customers who are using the Microosft/CleverAge ODF Add-in. It serves them right for not sticking to the party line in the first place. This'll teach 'em good.

Of course, I am not that cynical. I was taught to never assume malice where incompetence would be the simpler explanation. But the degree of incompetence needed to explain SP2's poor ODF support leads me to further uncharitable thoughts. So I must stop here.

As I mentioned before, this is a step backwards. But it is just one step on the journey. Let's look forward. This is just code. Code can be fixed. We know exactly what is needed to have good interoperability of spreadsheet formulas. In fact most of the code already exists for this. The only thing we need now is to actually go do it and not get too far ahead, or lag too far behind. This is more a question of timing and coordination than hard technical problems.

  View Printable Version


Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )