The so-called “open” XML-based office file format is now known as the “Ecma Office Open XML” standard (ECMA-376; be careful, it is a 47MB download). The new “standard” was approved in under a year, much quicker than any other standard passed by ECMA, possibly even without factoring in its huge size: the Microsoft office XML specification is over 6000 pages long. Check out this illuminating diagram (reproduced here without permission):
That in itself is really not that big of a problem. ECMA is not considered a major player in the standardization market and is known for being a pushover, and it’s much more likely that they are just Microsoft’s whipping boys than that other incentives played a role here (ECMA is also the body that approved the Microsoft .NET architecture components as a series of standards). The big problem is that this “standard” is designed to be completely unimplementable for people outside Microsoft!
The most obvious reason is its sheer size. Bob Sutor has this to say about it:
do this little thought experiment: imagine how thick a ream, or 500 sheets of paper is. Double that to get the thickness of a thousand pages, make that 4 times thicker to see how thick 4000 pages is. That’s how many pages were in the last draft of the Open XML spec. How many people will you need to implement that fully and correctly, much less read it?
(The 4000 pages figure above is from a non-final version of the “specifications”)
But that is far from being the real problem with Microsoft’s markup specifications. The truth is far, far worse: the “Open XML” format is nothing more than over 10 years’ worth of Microsoft Office format cruft dumped into XML, right down to the bug level (or, as we say in the industry, bug-for-bug compatible)!
As Rob Weir details in his article, the “Open XML” specs are a detailed description of every tiny bug and every little incorrect behavior that exists today in Microsoft’s office products, all of which is now mandated as a formal specification. That goes right down to specific cases where the “standard” mandates incorrect handling of data (recognized as incorrect by the text itself) in order to stay compatible with a broken Microsoft implementation. This is the exact opposite of what a standard should be. A standard should dictate correct behavior, and if some implementation has a bug, then that implementation has to fix it or work around it. It shouldn’t force new implementations to deliberately write in bugs in order to be compatible!
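To make “bug-for-bug” concrete, here is a minimal Python sketch of one widely discussed case of this kind: the spreadsheet date system inherited from Excel counts 29 February 1900 as a real day, even though 1900 was not a leap year. The function names are mine and purely illustrative, not taken from the spec text, and the serial numbering (1 = 1 January 1900) is the commonly documented Excel behavior that I am assuming here.

```python
from datetime import date, timedelta

EPOCH = date(1900, 1, 1)  # serial 1 in the assumed 1900 date system


def serial_to_ymd_correct(serial: int) -> tuple[int, int, int]:
    """What a clean calendar gives: serial 60 is 1 March 1900."""
    d = EPOCH + timedelta(days=serial - 1)
    return (d.year, d.month, d.day)


def serial_to_ymd_compat(serial: int) -> tuple[int, int, int]:
    """Bug-compatible mapping: serial 60 is the nonexistent 29 February 1900."""
    if serial < 1:
        raise ValueError("serials start at 1 in this sketch")
    if serial == 60:
        # datetime.date cannot represent this day, so return a plain tuple.
        return (1900, 2, 29)
    # Every serial after the phantom day is shifted by one.
    offset = serial - 1 if serial < 60 else serial - 2
    d = EPOCH + timedelta(days=offset)
    return (d.year, d.month, d.day)


if __name__ == "__main__":
    for s in (59, 60, 61):
        print(s, serial_to_ymd_correct(s), serial_to_ymd_compat(s))
```

An independent implementation that wants to read such files faithfully has to carry the second function around forever, because every serial after the phantom day maps to a date one day earlier than a clean calendar would give.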
As mentioned in so many places, any implementation of “Open XML” outside Microsoft will only serve to strengthen Microsoft’s chokehold on the office productivity suite market. This is not because it strengthens their position as the first implementation of the “standard”, but because this specification can never be fully implemented outside Microsoft. As a result, only Microsoft Office will be able to completely read and parse “Open XML” documents, while other implementations can at best hope to write a subset of the format. In essence, this file format will become a one-way method of moving documents into Microsoft Office and not out to other implementations.
Novell, also known for its recent collaboration deal with Microsoft, has pledged to implement “Open XML” for the OpenOffice.org suite. I really hope that the OpenOffice.org maintainers will never accept the Novell implementation into the main OpenOffice.org application, as that would relinquish whatever success they have had in usurping the incumbent office productivity suite.
More reading material here:
Rob Weir: A notable achievement
Bob Sutor: Is Open XML a one way specification for most people?
Bob Sutor: ECMA passage of Open XML was no surprise
Groklaw: Novell’s “Danaergeschenk”
Rob Weir: A Leap Back
Why OpenDocument Won (and Microsoft Office Open XML Didn’t)