Subject: RE: [office-comment] shorter XML representations for the values ?
There are efforts for more-compact XML and even binary encodings of XML. (Just as ASN.1 has an XML expression, apparently the reverse is desired as well?) However, recall that for any substantial document, the XML in ODF packages is compressed. While that is not a panacea, it does squeeze out much of the redundancy in the text.

I, for one, think that matching the XML Schema datatypes is a great idea and I am a bit startled that it was not applied in this case. The Relax NG schema for ODF already appeals to those datatypes so I wonder how this was missed. So the change for Boolean would be achievable simply by altering the schema pattern

    <define name="boolean">
      <choice>
        <value>true</value>
        <value>false</value>
      </choice>
    </define>

to be

    <define name="boolean">
      <data type="boolean"/>
    </define>

The problem is with down-level implementations. That is, wanting ODF 1.2 consumers to be able to accept ODF 1.3 documents where there is no difference in the features being used. This applies pretty much to ODF 1.1 consumers too, many of which remain in use to this day. (I have stopped being surprised that there are a large number of people still running OpenOffice.org 3.4.1, apparently an oldie but goodie, and wondering if it is safe for them to upgrade to one of the 4.x series on the computer they have been using all that time.) So the trade-off between compressibility and uncompressed compactness has to be looked at from an interoperability perspective with regard to legacy consumers on legacy computers [;<).

By the way, if you want to make a bigger hit on uncompressed compactness, change the prefixes for namespace bindings from such things as "office", "text", "table", "draw", "presentation", "manifest" and such to "o", "t", "tb", "d", "p" and "m" or whatever in the XML that is produced. That can be done by a producer without requiring any change to the specification at all, and consumers are expected to be fine with it already.
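The prefix idea lends itself to a quick measurement. What follows is only a rough sketch in Python, under stated assumptions: the sample document is a made-up fragment (500 boolean cells in one table), not a real ODF file; the helper names `build_sample`, `sizes` and `expanded_names` are hypothetical; and `zlib` stands in for the deflate compression used inside the ODF package. The only thing taken from the specification is the three namespace URIs.

```python
import xml.etree.ElementTree as ET
import zlib

# Namespace URIs as defined by ODF; everything else here is illustrative.
NAMESPACES = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Two prefix choices bound to the same namespaces.
LONG = {"office": "office", "table": "table", "text": "text"}
SHORT = {"office": "o", "table": "tb", "text": "t"}


def build_sample(prefixes, rows=500):
    """Build a synthetic content.xml-like document using the given prefixes."""
    o, tb, t = prefixes["office"], prefixes["table"], prefixes["text"]
    decls = " ".join(
        f'xmlns:{prefixes[key]}="{uri}"' for key, uri in NAMESPACES.items()
    )
    cells = []
    for i in range(rows):
        flag = "true" if i % 2 == 0 else "false"
        cells.append(
            f'<{tb}:table-row><{tb}:table-cell {o}:value-type="boolean" '
            f'{o}:boolean-value="{flag}"><{t}:p>{flag.upper()}</{t}:p>'
            f'</{tb}:table-cell></{tb}:table-row>'
        )
    body = "".join(cells)
    doc = (
        f'<{o}:document-content {decls}><{o}:body><{o}:spreadsheet>'
        f'<{tb}:table {tb}:name="Sheet1">{body}</{tb}:table>'
        f'</{o}:spreadsheet></{o}:body></{o}:document-content>'
    )
    return doc.encode("utf-8")


def sizes(data):
    """Return (raw size, deflated size) in bytes."""
    return len(data), len(zlib.compress(data, 9))


def expanded_names(data):
    """List elements and attributes by {namespace-uri}local-name.

    A namespace-aware parser sees the same names whatever prefix is used.
    """
    return [
        (el.tag, sorted(el.attrib.items())) for el in ET.fromstring(data).iter()
    ]


if __name__ == "__main__":
    long_doc, short_doc = build_sample(LONG), build_sample(SHORT)
    # Both documents are the same to a namespace-aware consumer.
    assert expanded_names(long_doc) == expanded_names(short_doc)
    for label, doc in (("long", long_doc), ("short", short_doc)):
        raw, packed = sizes(doc)
        print(f"{label:5} prefixes: {raw} bytes raw, {packed} bytes deflated")
```

The raw sizes will differ noticeably; how much of that difference survives deflate is the open question, and a synthetic, highly repetitive sample like this one probably flatters the compressor. Running the same comparison on content.xml from real documents would be more telling.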
As an experiment, see what difference shortening the prefixes does or does not make on the compressed size of the XML too.

Thanks for the thoughtful suggestion. I also comment below on the energy issue and the use of standards at this level.

-- replying below to --
From: Jérôme Bouat [mailto:jerome.bouat@wanadoo.fr]
Sent: Saturday, January 17, 2015 06:52
To: office-comment@lists.oasis-open.org
Subject: Re: [office-comment] shorter XML representations for the values ?

Hello,

> One solution for the boolean issue would be to harmonize
> our office:value-type attribute with XML Schema datatypes,
> at least for the common overlap in types.
> XML Schema's boolean type allows lexical forms to be one of:
> true, false, 1, 0. That would allow a more compact form.

Do you know if the next specification will take this efficient boolean value representation into account ?

[ ... ]

I don't think that standardisation is the end of innovation. If you have enough time to make the "off-the-shelf" tools compliant with the new specification, then a shorter representation of values would be a benefit for everyone.

I don't think binary encoding would be a solution for long-term storage. As you said, XML provides benefits like validation, etc. I think a shorter value representation is a good trade-off between the use of the generic XML language/tools and the need for efficiency.

If you think about the increasing cost of energy, then a more compact XML would be a benefit in data centers, desktop computers, etc.

<orcmid>
When I worked for Bob Bemer (aka the father of ASCII) back in the day, he was known for this interesting observation: "Standards are arbitrary solutions to recurring problems." The idea is not to then introduce new recurring problems. So innovation via hammering on a voluntary standard is not always a productive notion. These standards do *not* compel compliance.
Purchasing requirements can, but I don't imagine we'll ever see preference for 0 and 1 over true and false being a "rider" in a procurement requirement for ODF-compliant software. There are far bigger issues that don't get dealt with at procurement already.

I have no idea how to address energy trade-offs by tweaking XML representations, but I think one would prefer to go after the big-win low-hanging fruit first. We may be micro-optimizing when macro-optimization has more to yield.

I think we need more quantitative analysis. We've assumed, for example, that compression and decompression save enough in input-output and storage to be important as a practical matter. But we can't neglect the cost attributable to breaking changes and the processing cost of compression/decompression (although I think modern processors provide assistance in this area).

It would be interesting if there were some sort of energy-footprint, carbon trade-off standard that could be used for procurement. It would have to provide some quantitative metrics for determining software and file-format compliance. I'm not confident that such a thing is feasible and I wonder about the wasted energy in attempting to specify such a thing [;<).

I'm thinking that improvements in energy consumption and loss in processors, storage, and power management provide much greater advantage, at a faster pace, than what we are looking at in terms of XML. But as I say, there are ways to reduce the XML without breaking any standard-compliant down-level consumers.
</orcmid>

Regards.