Subject: RE: [office-comment] ODF security hazard? (ODF all versions)

Funny, I walked into a conversation on the ODF TC list where it was
suggested that office-document processors implement custom XML processors
for performance reasons (and this was justification for using the schema to
limit where xml:id could appear rather than simply allowing it for whatever
reason folks wanted fragment IDs on elements in ODF documents).

Well, one interesting case is the interaction among charset values, the
character encoding used in URIs and in the "full-path" strings in
manifest.xml, and the encoding of the actual Zip content-item names (which
is not specified anywhere).  This is apparently going to be a problem in
Asia if any non-US-ASCII, non-Unicode code points are actually used in
"full-path" strings and the corresponding Zip item names.  That's an
interaction that just came to my attention.  There appear to be more.
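
To make the naming interaction concrete, here is a minimal sketch (not taken from any ODF implementation) that compares the "full-path" values in META-INF/manifest.xml against the Zip item names and reports, for each non-ASCII item name, whether the Zip entry sets the UTF-8 name flag. The function name check_package_names and the shape of its return value are assumptions made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
UTF8_FLAG = 0x800  # Zip general-purpose bit 11: item name is UTF-8


def check_package_names(data: bytes):
    """Hypothetical helper: return (missing, non_ascii) for a package.

    missing   -- manifest full-path values with no matching Zip item
    non_ascii -- (item name, utf8_flag_set) for each non-ASCII item name
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = {info.filename: info for info in zf.infolist()}
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))

    paths = [entry.get(f"{{{MANIFEST_NS}}}full-path")
             for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry")]
    # "/" stands for the package itself; a trailing "/" marks a directory.
    files = [p for p in paths if p and not p.endswith("/")]

    missing = [p for p in files if p not in infos]
    non_ascii = [(name, bool(info.flag_bits & UTF8_FLAG))
                 for name, info in infos.items() if not name.isascii()]
    return missing, non_ascii
```

A name that lacks the UTF-8 flag is decoded by the Zip library using a legacy code page, so a mismatch reported in missing may be an encoding disagreement rather than an absent item.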

For example, I also wonder if it is permissible to have MIME type parameters
in manifest:media-type values, and whether the interactions of those with
the interpretation of XML documents are handled by implementations.
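
As a rough illustration of what noticing such parameters would involve, the sketch below splits a media-type string into its bare type and its parameters using the header parser in Python's standard library. Whether ODF permits parameters in manifest:media-type at all is the open question; split_media_type is a hypothetical helper, not part of any ODF toolkit.

```python
from email.message import Message


def split_media_type(value: str):
    """Hypothetical helper: return (bare media type, parameter dict)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() yields (name, value) pairs; the first pair is the
    # media type itself, so the real parameters start at index 1.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params
```

For a value such as "text/xml; charset=ISO-8859-1", a processor would then have to decide whether the charset parameter or the XML declaration inside the item takes precedence.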

There are many cases where XML is constrained in an application.  SOAP
forbids DTDs as I recall.  I'd be shocked if it permitted processing
instructions.  I believe that OOXML restricts its internal XML documents to
UTF-8 and UTF-16 (and haven't checked about the BOM).  The encouraged (and
prevalent) use of media-type value "text/xml" also has some implications for
the XML, depending on whether one assumes this counts as a higher-level
protocol with regard to the XML document.
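
A profile of this kind can be checked mechanically.  The following is a minimal sketch, assuming a hypothetical profile that cares about DOCTYPE declarations, processing instructions, and a UTF-8 byte-order mark; it uses the expat bindings in Python's standard library, and the name xml_features is an assumption for illustration only.

```python
import xml.parsers.expat


def xml_features(data: bytes):
    """Hypothetical helper: report XML 1.0 features a profile might restrict."""
    found = {
        "doctype": False,
        "processing_instructions": [],
        "utf8_bom": data.startswith(b"\xef\xbb\xbf"),
    }
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        found["doctype"] = True

    def on_pi(target, content):
        # The XML declaration is not reported here; only true PIs are.
        found["processing_instructions"].append(target)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(data, True)
    return found
```

A consumer could use such a report either to reject a document outright or, in a more permissive design, to log the feature and carry on.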

Nitty little details like that come to mind.  In my experience, it is good
to make normative declarations of the permitted variations.  Because an ODF
package is a kind of multi-part entity, what happens when there are
differences in the options applicable to the individual XML documents is
also rather interesting.  Also, because Zip package items are a rather
unique medium, there needs to be some account for how that figures in the
use of XML in the cases I mention and others that I am sure will arise.

And, as I said, these may not be pressing matters in practice simply because
ODF document producers operate inside of a particular well-behaved set of
variations.  If there were a normative set, it would relieve processors from
having to attend to more than those, and producers would be constrained from
taking a flyer.

However, I think Postel's law is a great argument in favor of predictable,
permissive foreign-element handling.

 - Dennis

-----Original Message-----
From: robert_weir@us.ibm.com [mailto:robert_weir@us.ibm.com] 
Sent: Saturday, February 21, 2009 13:21
To: office-comment@lists.oasis-open.org
Subject: RE: [office-comment] ODF security hazard? (ODF all versions)

"Dennis E. Hamilton" <dennis.hamilton@acm.org> wrote on 02/21/2009 
03:39:42 PM:

> The bigger issue seems to be the failure of ODF to profile its normative
> dependence on XML 1.0 at all.  So, at the moment, all XML 1.0 
> cases have to be tolerated (e.g., charsets, parameters, prolog omission,
> MIME Type interaction, entity definitions, DTD occurrences, processing
> instructions, and for all I know, byte-order marks).  I assume that the
> various normative statements on XML processors apply as well, though I 
> don't know about any MAYs and SHOULDs (and whose definition of such 
> language is being used, in contrast to the ISO usage in those ODF
> specifications starting with IS 26300:2006).

No doubt XML has its quirks, but these are well known and have been around 
for a decade.  XML parsers handle them.  In practice XML 1.0 is one of the 
most widely deployed and interoperable standards in use today.  No ODF 
implementer sits down and writes an XML parser from scratch.  They use an 
off-the-shelf one, typically one that comes with their programming 
environment or on their deployment platform. 

So I don't see the need to profile XML.  It is odd that you would see this 
as a failure.  Did you have a few examples in mind where this has been 
done with OASIS or W3C standards, where they forbid the use of core XML 
features such as processing instructions?  Was this done in constrained 
environments, like mobile devices, where they needed a smaller subset of 
XML?

Certainly Postel's robustness principle is good advice for implementors: 
"Be conservative in what you do; be liberal in what you accept from 
others."  Conservative output should shun the edge features of XML and 
stick to the core.  But an ODF processor should be prepared to handle 
anything that a conformant XML document can throw at it.  In fact, any 
reasonable ODF processor should be robust enough to handle non-conformant 
ODF documents as well, including potentially invalid instances.  This is 
more reasonable than failing or giving a user a cryptic error message. 


This publicly archived list offers a means to provide input to the
OASIS Open Document Format for Office Applications (OpenDocument) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: office-comment-subscribe@lists.oasis-open.org
Unsubscribe: office-comment-unsubscribe@lists.oasis-open.org
List help: office-comment-help@lists.oasis-open.org
List archive: http://lists.oasis-open.org/archives/office-comment/
Feedback License: http://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
Committee: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
