Subject: normalizing / cleaning up ODF files for use in revision control
Hello all,

As I mentioned in the last call, I'm working on code to clean up ODF documents. The goal is to normalize documents so that documents with the same content will have mostly the same bytes.

ODF documents can have meaningless differences. Here is a small selection of things that may vary across documents without affecting the actual content:

- names of automatic styles and fonts
- order of styles and fonts
- unneeded automatic styles (styles that change nothing with respect to the parent style)
- whitespace
- chosen units for lengths (e.g. in, pt, cm)
- XML namespace prefixes

The code that removes many of these differences is hosted here:
https://gitlab.com/odfplugfest/odfhistory/

You can browse the diffs between versions of the specification here:
https://gitlab.com/odfplugfest/odfspecifcationhistory/

As the code in odfhistory improves, I will update the example in odfspecifcationhistory. The Relax NG files are not included yet.

If you want to contribute improvements to this code to fix a particular issue, you are welcome. The code works with zipped and with flat ODF documents.

Best regards,
Jos
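
To illustrate one item from the list above (chosen units for lengths), here is a minimal sketch of how equal lengths written in different units could be mapped to one spelling. It is not taken from odfhistory. The function name `normalize_length`, the choice of millimetres as the canonical unit, and the rounding to four decimal places are all assumptions made for this example.

```python
import re
from fractions import Fraction

# Millimetres per unit. These ratios are exact by definition:
# 1in = 25.4mm, 1pt = 1/72 in, 1pc = 12pt.
MM_PER_UNIT = {
    "mm": Fraction(1),
    "cm": Fraction(10),
    "in": Fraction(127, 5),
    "pt": Fraction(127, 360),
    "pc": Fraction(127, 30),
}

# A signed decimal number immediately followed by a known unit.
_LENGTH = re.compile(r"^(-?(?:\d+(?:\.\d+)?|\.\d+))(mm|cm|in|pt|pc)$")


def normalize_length(value: str, digits: int = 4) -> str:
    """Rewrite a length such as '1in' or '72pt' in millimetres.

    Values that do not look like a supported length (for example
    percentages) are returned unchanged. Hypothetical helper, for
    illustration only.
    """
    match = _LENGTH.match(value.strip())
    if match is None:
        return value
    number, unit = match.groups()
    # Exact rational arithmetic avoids float noise before rounding.
    mm = Fraction(number) * MM_PER_UNIT[unit]
    text = f"{float(mm):.{digits}f}".rstrip("0").rstrip(".")
    return f"{text}mm"


for sample in ("1in", "72pt", "2.54cm", "10%"):
    print(sample, "->", normalize_length(sample))
```

With this sketch, `1in`, `72pt` and `2.54cm` all come out as the same string, so a byte-level diff no longer reports a change when only the unit differs. A real normalizer would also have to decide which attributes hold lengths and how much rounding is acceptable.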