Tuesday 25 September 2007

What’s the OpenDocument Format?

It used to be the case that whenever someone sent me a file from a PC it would be in a format that I didn’t have, and I would have to run it through some other application before I could use it. Paint Shop Pro was brilliant because you could use it to convert so many picture file types from one to another. I have a piece of music software, dBpowerAMP, that can convert mp3 files to wav (or most other sound file types) and back again. And I still have Microsoft Office and IBM’s Lotus SmartSuite so I can open most word processor files people send me.

So why should a paragraph moaning about the diversity of file types be followed immediately by one suggesting that we should adopt yet another “standard” file type? Well, it does sound a bit odd, but what I’d like to suggest is that we have a kind of lingua franca file type – one that could be produced by any word processor and opened by any other word processing software. So, not quite such a silly idea!


Back in May 2006, the International Organization for Standardization (ISO) approved the Open Document Format (ODF), an open standard, as an international data format standard. And, although not usually thought of as a leader in IT, the Belgian government has instructed its government departments to use ODF for all internal communications. Similarly, the National Archives of Australia has decided to use OpenDocument as its cross-platform, cross-application document format.


There is an Open Document Format Alliance, which is made up of a mixture of vendors and other organizations, and has around 140 members – probably more by the time you read this. The format itself was developed by the OASIS industry consortium and is based on the XML format originally created by OpenOffice.org. IBM, Sun Microsystems, and Novell are keen promoters of the OpenDocument Format.


You might ask what the thinking behind ODF is. Is it a way of getting back at Microsoft with its ubiquitous DOC format or Adobe’s PDF? No, there’s much more to it than that. The problem really first appeared when people tried to access older documents – not dusty scrolls tucked away at the back of ancient vaults – just documents that had been saved to disk using the word processor of choice some years ago and which couldn’t easily be read anymore. When you throw away old word processing software, what happens to the files that were created using it? And even if you kept the same product but have upgraded to the latest release, there’s always a chance that you used a feature that’s no longer supported – backward compatibility becomes a nightmare as more development work goes into a product. So, with ODF, you have a standard that works now, and, they predict, will work in a hundred years’ time.


The file extensions used for OpenDocument documents are ODT for text documents, ODS for spreadsheets, ODP for presentations, ODG for graphics (and there’s a proposed ODF for formulae).
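
As a quick illustration, the extensions above can be put into a small lookup table. This is only a sketch: the table and the function name odf_document_kind are made up for this example and are not part of any ODF library.

```python
from pathlib import Path

# Extensions listed in the paragraph above, mapped to the kind of document.
# The dictionary and function names are hypothetical, chosen for illustration.
ODF_EXTENSIONS = {
    ".odt": "text document",
    ".ods": "spreadsheet",
    ".odp": "presentation",
    ".odg": "graphics",
    ".odf": "formula",
}


def odf_document_kind(filename):
    """Return the OpenDocument kind for a filename, or None if it is not ODF."""
    # Compare in lower case so REPORT.ODT and report.odt are treated alike.
    return ODF_EXTENSIONS.get(Path(filename).suffix.lower())


if __name__ == "__main__":
    for name in ("Report.ODT", "budget.ods", "letter.doc"):
        print(name, "->", odf_document_kind(name))
```

Going by the extension alone is only a first guess, of course, since anyone can rename a file.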


Technically, an OpenDocument file can be either a simple XML file using <office:document> as the root element, or a ZIP archive file comprising any number of files and directories. The ZIP-based format is used in the main because it can contain binary content and, being compressed, is much smaller.
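
To make the two forms a little more concrete, here is a minimal Python sketch that tells a ZIP-based package apart from a single XML file and lists what is inside the package. It uses only the standard zipfile module. The function names are invented for this example, and the member names mimetype and content.xml are an assumption based on the usual ODF package layout rather than anything stated above; the sample package is a stand-in, not a document a word processor would open.

```python
import io
import zipfile


def describe_odf(data):
    """Classify raw bytes as a ZIP-based ODF package or a single XML file."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as package:
            names = package.namelist()
            # Assumption: packages carry a small "mimetype" member naming the type.
            mimetype = None
            if "mimetype" in names:
                mimetype = package.read("mimetype").decode("ascii")
        return {"container": "zip", "mimetype": mimetype, "members": names}
    # Anything that is not a ZIP archive is treated here as the single-XML form.
    return {"container": "xml", "mimetype": None, "members": []}


def make_sample_package():
    """Build a tiny stand-in package in memory (not a complete document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        # Placeholder body; a real content.xml is far more elaborate.
        package.writestr(
            "content.xml",
            "<placeholder/>",
            compress_type=zipfile.ZIP_DEFLATED,
        )
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_odf(make_sample_package()))
    print(describe_odf(b"<?xml version='1.0'?><doc/>"))
```

To try it on a real file, read the file in binary mode and pass the bytes to describe_odf.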


Older versions of Microsoft Office don’t support the standard, but apparently new versions will. If you’d like software now that supports ODF, then there’s OpenOffice (http://www.openoffice.org/) and KOffice (http://www.koffice.org/). Go to http://odf-converter.sourceforge.net/ for an early version of Microsoft’s Open XML translator for Word.


So if you’re looking for file types that will work across platforms and across time, then ODF is what you need. If you’ve got millions of DOC and XLS files archived, then you either hope Microsoft’s Office product remains backward compatible forever and you never migrate from Windows or you have a lot of conversion work ahead of you! And with big names like IBM and Sun behind it and people like Google joining in, you know this standard isn’t going to suddenly disappear.
