

Subject: normalizing / cleaning up ODF files for use in revision control

Hello all,

As I mentioned in the last call, I'm working on code to clean up ODF
documents. The goal is to normalize documents so that documents with the same
content have mostly the same bytes.

ODF documents can have meaningless differences. Here is a small selection of
things that may vary across documents without affecting the actual content:

 - names of automatic styles and fonts
 - order of styles and fonts
 - unneeded automatic styles (styles that change nothing with respect to the 
parent style)
 - whitespace
 - chosen units for lengths (e.g. in, pt, cm)
 - XML namespace prefixes
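
To illustrate one item from the list, here is a minimal sketch of how length
units could be normalized to a single unit. It is only an illustration and not
the code referred to in this message; the function name normalize_length, the
choice of millimetres as the target unit, and the rounding to three decimals
are all assumptions.

```python
import re
from fractions import Fraction

# Exact size of each absolute unit in millimetres (1in = 25.4mm = 72pt = 6pc).
MM_PER_UNIT = {
    "mm": Fraction(1),
    "cm": Fraction(10),
    "in": Fraction(254, 10),
    "pt": Fraction(254, 720),
    "pc": Fraction(254, 60),
}

# A plain length: an optionally signed decimal number followed by a known unit.
LENGTH_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(mm|cm|in|pt|pc)$")


def normalize_length(value: str, digits: int = 3) -> str:
    """Rewrite a length such as '1in' or '72pt' in millimetres.

    Hypothetical helper: values that are not plain absolute lengths
    (percentages, 'auto', and so on) are returned unchanged.
    """
    match = LENGTH_RE.match(value.strip())
    if match is None:
        return value
    number, unit = match.groups()
    mm = Fraction(number) * MM_PER_UNIT[unit]
    # Fixed number of decimals, then drop trailing zeros and a dangling dot.
    text = f"{float(mm):.{digits}f}".rstrip("0").rstrip(".")
    return f"{text}mm"
```

With this sketch, "1in", "2.54cm" and "72pt" all map to the same string, so two
documents that differ only in the chosen unit end up with identical attribute
values.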

The code that removes many of these differences is hosted here:

You can browse the diffs between versions of the specification here:

As the code in odfhistory improves, I will update the example in 
odfspecifcationhistory. The Relax NG files are not included yet.

If you want to contribute improvements to this code, for example to fix a
particular issue, you are welcome to do so. The code works with both zipped
and flat ODF documents.
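
For readers unfamiliar with the two forms, the following is a rough sketch of
how a tool might load the XML from either a zipped package or a flat document.
It is not taken from the code mentioned above; the function name
load_odf_parts and the "(flat)" dictionary key are assumptions made for the
example.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML streams commonly found in a zipped ODF package.
PACKAGE_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def load_odf_parts(path):
    """Return a dict mapping a part name to its parsed XML root element.

    Hypothetical helper: a zipped package yields one entry per XML stream
    that is present; a flat document yields a single "(flat)" entry.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as package:
            names = set(package.namelist())
            return {
                name: ET.fromstring(package.read(name))
                for name in PACKAGE_PARTS
                if name in names
            }
    # Not a zip archive: treat the file as a single flat XML document.
    return {"(flat)": ET.parse(path).getroot()}
```

Any normalization step can then run over the returned elements without caring
which of the two forms the document was stored in.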

Best regards,
