opendocument-users message

Subject: RE: [opendocument-users] RE: Foreign elements and attributes
From: jose lorenzo <hozelda@yahoo.com>
To: Alex Brown <alexb@griffinbrown.co.uk>
Date: Sat, 7 Mar 2009 14:47:35 -0800 (PST)

--- On Wed, 3/4/09, Alex Brown <alexb@griffinbrown.co.uk> wrote:

> From: Alex Brown <alexb@griffinbrown.co.uk>
> Subject: RE: [opendocument-users] RE: Foreign elements and attributes
> To: "Jesper Lund Stocholm" <4a4553504552@gmail.com>
> Cc: "ODF Users List" <opendocument-users@lists.oasis-open.org>
> Date: Wednesday, March 4, 2009, 12:36 PM
> Jesper, all,
...
> The fact is that an ODF document that conforms to this new restricted
> conformance class is, practically, just as lousy a basis for conformance
> as existing ODF documents.
> 
> There are, I think, three main reasons for this (in ascending order of
> seriousness):
> 
> (1) There are always techniques for abusing an XML format; processing
> instructions, comments, base64 content in metadata, etc. People are
> endlessly creative in such matters.

I agree.

> (2) ODF *still* defines no scripting language, so any high-value
> document which uses scripting isn't going to be interoperable (also, one

A reduced conformance class might be one without scripting, or with a heavily reduced set of scripting rules [something I haven't thought about, but it should be possible if we add enough restrictions to the language; however, it may not be desirable to add scripting to a strict profile, or scripting might best be done through XML]. The goal would be to come up with a spec where -- on paper -- interoperability would be virtually guaranteed.

> (3) The "pure" conformance class allows implementations to ignore
> features: a conforming consumer "need not interpret the

I'm just tuning in, but has it been suggested to have subsets of the strict conformance class? This way a calculator can very adequately deal with only a special subset of ODF and a writer app likewise with a different subset.

I think the most useful scenario is to have multiple strict conformance classes (from subsets of ODF): very well defined mandatory rules for subsets instead of leaving things in the air allowing apps to ignore tags and attributes.
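
To make the idea concrete, here is a rough sketch of what checking a document against one such subset could look like. The element names and the "text-only" profile are made up for illustration; they are not taken from the ODF schema, and a real profile would need to cover attributes and content models too.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: the element names a "text-only" subset would allow.
# These names are illustrative and do not come from the ODF schema.
TEXT_ONLY_PROFILE = {"doc", "p"}

def elements_outside_profile(xml_text, allowed):
    """Return the sorted element names that the profile does not permit."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter()} - set(allowed))

def conforms(xml_text, allowed):
    """A document conforms when it uses no element outside the profile."""
    return not elements_outside_profile(xml_text, allowed)

inside = "<doc><p>plain text</p></doc>"
outside = "<doc><p>plain <span>styled</span> text</p></doc>"

print(conforms(inside, TEXT_ONLY_PROFILE))
print(elements_outside_profile(outside, TEXT_ONLY_PROFILE))
```

The point is only that a mandatory, closed list gives a yes/no answer, where "consumers may ignore what they don't understand" does not.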

You would want an app that spoke only "math" or "svg" to create a sample of this specialized content, then have a full-featured ODF app modify it but using only "math" or "svg" or whatever. Round-tripping is improved if instead of leaving too many unknowns, we have specialized conformance levels. We can think of the subset profiles as hints to apps that can handle it and more.

An example of the problem we run into if we don't have well-defined subset (strict) conformance classes, but instead have a single large (strict) conformance class and allow tags to be ignored: A specialized app creates a document using only a subset. A full-featured app then arbitrarily adds elements that won't be understood alongside something that will. Ex: it adds a font tag or even a plain span tag around some newly added text, and the other app that regularly interacts with this document doesn't understand the font or span tag but would accept the text without these tags. That limited app might discard the text, or perhaps ignore the unknown tag but keep the text, or maybe drop the parent and all its children wherever the unknown tag was found as a child, or even crash, etc. In short, I doubt ODF is defined such that there is a bijection from the set of distinct semantic possibilities to the set of ODF document trees (in canonical form or not); thus, we maximize the odds of interoperability among a wider range of applications by having several tightly-defined subset (strict) conformance classes.
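
A small sketch of how far apart those outcomes are, assuming a limited consumer that only understands two made-up elements ("doc" and "p"). The policy names are mine, not anything from the spec, and the crash case is left out.

```python
import xml.etree.ElementTree as ET

KNOWN = {"doc", "p"}  # hypothetical: the only elements the limited app understands

def kept_text(el, policy):
    """Return the text a limited consumer would keep from `el`."""
    unknown_children = [c for c in el if c.tag not in KNOWN]
    if policy == "drop-parent" and unknown_children:
        return ""  # the whole parent goes, known content included
    parts = [el.text or ""]
    for child in el:
        if child.tag in KNOWN or policy == "unwrap":
            parts.append(kept_text(child, policy))
        # under "drop-element" the unknown child's content is skipped
        parts.append(child.tail or "")
    return "".join(parts)

doc = ET.fromstring(
    "<doc><p>old text, <span>new text</span>, more old text</p></doc>"
)
for policy in ("unwrap", "drop-element", "drop-parent"):
    print(policy, repr(kept_text(doc, policy)))
```

All three behaviors are "ignoring" the span in some sense, yet one keeps everything, one loses the new text, and one loses the whole paragraph.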

The list of these official subset conformance classes can also grow over time. We start off with the most important subclasses and go forward, accepting input and submissions from the public. Some smaller subsets might be submitted in short order and come in very handy to certain industries. [Note, in this paragraph and perhaps others, "we" technically doesn't include me, but we get the point.]

Also, we may want to start things off by writing up a single strict profile but allow it to come in two flavors (based on some attribute value): (A) one where you can ignore subcomponents and (B) the other where you can't. This way, all reduced-feature apps can immediately start playing with version B while remaining "strictly conformant" in the B sense. [It should also become obvious soon enough if this system of ignoring tags/attributes is at all likely to lead to interoperability of any kind.]

We can also plan for "multi-part" documents later on that will be able to contain sections that adhere to specific conformance levels, yet instead of having the outside shell simply wrap these parts and treat them as opaque children (as CDF might do), the subparts would all be subsets of strict ODF which already defines the interactions of these subparts. These subsections would be identifiable through attributes at their top level (or create a wrapper element as an alternative; however, I think for aesthetic reasons this might be a case where you want to "adorn" with an attribute that could be added to almost any element that might serve as the top level element for the subprofiled subsections). Smart applications will know how to "flatten" out these documents to some degree to deal with the user's new interop goals.
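
As a sketch of the attribute approach, assuming a made-up marker attribute called "profile" (nothing like it is defined by ODF as far as I know), an app could locate the subprofiled sections like this:

```python
import xml.etree.ElementTree as ET

PROFILE_ATTR = "profile"  # hypothetical attribute name, not defined by ODF

def profiled_sections(el):
    """Return (profile, tag) for each outermost element carrying the marker."""
    if PROFILE_ATTR in el.attrib:
        # Do not descend further: the section is handled as one unit.
        return [(el.get(PROFILE_ATTR), el.tag)]
    found = []
    for child in el:
        found.extend(profiled_sections(child))
    return found

doc = ET.fromstring(
    "<doc>"
    '<formula profile="math"><x/></formula>'
    '<body><drawing profile="svg"/><p>text</p></body>'
    "</doc>"
)
sections = profiled_sections(doc)
print(sections)
```

Because the marker is an attribute, it can sit on whatever element happens to be the top of the section, and no extra wrapper element is needed.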

> So I think users need to understand, very clearly, that an ODF
> document/app of *either* conformance class has an EXTREMELY WEAK CLAIM
> TO INTEROPERABILITY. The "pure ODF conformance" sticker would be at best
> valueless and at worst positively misleading.
> 
> So what I'd like to see is some real effort from the TC going into
> resolving this problem ...

I don't know how the strict class was defined, but I do think you can specify the allowed behavior of all parts of an XML conformance class fairly precisely using English. It won't be perfect, of course, and I don't know enough to say whether ODF as currently defined is precise (consistent, etc.) enough. Conformance tests, maintenance/errata, many official code examples, etc., would go a long way toward helping everyone get on the same page.

Having well-defined rules means cheaters can eventually get caught and put on public trial. In particular, a vendor with a product that fails to produce conformant documents in the field will lose credibility.

Failing conformance can be due to bugs and not ill will, but at least you have a yardstick. Too many bugs are bad for the consumer. The bottom line for the consumer wanting interoperability is to find out which apps fail the least and behave most closely according to the sticker on the box.

Since conformance and interoperability are important attributes for many consumers, we should attempt to provide something, and not give up until it's shown fairly conclusively that the prescription we come up with won't fix all ailments [well, we can give up whenever we want, actually]. Perhaps we can in fact get very close to meeting these valuable goals.

As a side bonus, attempting this feat (of interop/high-level conformance) in earnest will help ferret out some types of problems with the ODF spec. Not trying means "interop" is likely to be much worse than we would initially hope.