
Subject: Re: [office] Conformance Clauses Proposal
From: Michael Brauer - Sun Germany - ham02 - Hamburg <Michael.Brauer@Sun.COM>
To: robert_weir@us.ibm.com
Date: Fri, 21 Nov 2008 13:54:43 +0100
Hi Rob,

thank you very much for your feedback. Unfortunately, it took me a
little longer to respond than I had hoped.

On 11/17/08 14:14, robert_weir@us.ibm.com wrote:
> I apologize for not reviewing this earlier.  But since I woke up this 
> morning at 4am, due to jetlag, I find myself free of meetings for a few 
> hours, so I will make some comments now.  Overall I like the progress over 
> ODF 1.0/1.1.  This is an important part of the standard, and will get a 
> lot of attention during the review in OASIS, as well as in JTC1 when we 
> submit ODF 1.2 for PAS approval.  So I'm hoping we can consider my late 
> comments and perhaps make another iteration over this section.
> 
> I reviewed the "sixth iteration" dated 11/11/2008.
> 
> I think it would be good if we started the conformance section with a 
> clear statement of what we will be defining.  Something like, "The 
> OpenDocument Format standard defines conformance for documents, generators 
> and processors, with two conformance classes called loose and strict." 
> We'll need something like that for the Scope statement as well.  Probably 
> want to synch up on the language.

Yes, this sounds like a good idea.

> 
> I'm a little confused by the term "processor", since it is defined, 
> somewhat circularly, as "a program that can parse and process OpenDocument 
> document".  What is "to process"?  What is intended, beyond parsing? 
> 
> To my thinking, we have three kinds of applications:
> 
> 1) generators (or writers or producers)
> 2) parsers (or readers or consumers)

This is what is meant by the term "processor".
What about using the term interpreter rather than parser?
SVG uses this term:

http://www.w3.org/TR/SVG11/conform.html#ConformingSVGInterpreters

The conformance clauses for SVG interpreters state:

"A Conforming SVG Interpreter must parse any SVG document correctly. It
is not required to interpret the semantics of all features correctly."

So, this is actually close to what is called an "ODF processor" in the
current proposal.


> 3) processors (that both read and write)
> 
> I think we may be failing to acknowledge that 3rd kind of program, the 
> ones that both read and write.  As we know, this brings in additional 
> questions related to round-tripping, and that this is a thorny issue.  But 
> it is a legitimate concern, and we should see if there is some language 
> the TC can agree to for it, especially considering that most ODF 
> applications fit into that category, and in practice interoperability 
> would be enhanced by any firmer guidance ODF 1.2 can provide in this area.

I agree. But the requirements for applications that read and write
documents depend heavily on the purpose of the application itself. We
also have to consider that between a read and a write some or all of the
information contained in the document changes, but we don't know which
information changes or why. That makes it very difficult to find
requirements that are testable.

I could actually imagine that this kind of conformance could be much
better addressed by the OIC TC, where we may bind the requirements to
profiles.

> 
> Here are some specific comments:
> 
> Document processing -- 
> 
> "Documents that loosely conform to the OpenDocument specification..."
> 
> We introduce here the term "loosely conform" but we don't clearly indicate 
> what our conformance classes are.  For example, do we want "loose 
> conformance" and "strict conformance"?  or plain "conformance"?  Is there 
> a better word than "loose"?  It carries a negative connotation in my ears. 

You are right that we use the term "loosely conform" before we define
it. This could be solved by moving the document processing section
after the conformance clauses.

The two proposed conformance classes are "conformance" and "loose
conformance", but I'm open to other suggestions for naming them. I
chose the term "conformance" to emphasize that this is the type of
conformance that applications should aim to achieve. We could also
call the classes "strict conformance" and "loose conformance". I'm not
in favor of calling them "strict conformance" and "conformance", because
that may sound as if "conformance" is what applications should
achieve, while "strict conformance" merely adds some optional
requirements that may or may not be met. That's not what was intended
by having two conformance classes.
> 
> Also, do we need loose conformance at all?  Why not just have a single 
> conformance class and anything beyond that is not conformant?  Then, if 
> there occurs a commonly-used set of extensions to ODF, in a foreign 
> namespace, then we can either included that in ODF-Next, or (more simply) 
> define a profile for the extensions.

I have thought about this myself for a long time. One reason why I
added the "loose" conformance class was that we allowed foreign elements
in ODF 1.0 and 1.1. The other reason was that vendors may need to
store some information in a file immediately, for instance because a
customer requested it, and cannot wait for the next ODF version. The
assumption here, of course, would be that they propose the extension to
the ODF TC, so that it would be used only temporarily. In that respect,
there may indeed be better solutions than a separate conformance class.
I will think about this.

What seems to be essential here is that we define rules for how foreign
elements and attributes are processed. This means we may remove the
loose conformance class, but we must not remove the processing rules for
foreign elements and attributes.

In any case, I think we should remove the "loosely conformant
OpenDocument generator" again. An application that calls itself
conformant shall be able to store conforming files.


> 
> "Foreign elements and attributes shall not be part of a namespace that is 
> defined within this specification."
> 
> "part of a namespace" is rather loose.  The term used by the Namespaces in 
> XML Recommendation is "associate".  So I suggest, "Foreign elements and 
> attributes shall not be associated with a namespace that is defined within 
> this specification."

Yes, that's a good suggestion.
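
As a rough illustration of that wording, a check for foreign elements
could look at the namespace an element is associated with. This is only
a sketch in Python: the ODF_NAMESPACES set is a small, incomplete
sample, the http://example.com/ns/ext namespace is made up, and
is_foreign is a hypothetical helper name, not something taken from the
proposal.

```python
import xml.etree.ElementTree as ET

# Incomplete sample of namespaces defined by the specification
# (assumption: a real check would list all of them).
ODF_NAMESPACES = {
    "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def namespace_of(name):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    if name.startswith("{"):
        return name[1:].split("}", 1)[0]
    return ""


def is_foreign(name):
    """Hypothetical helper: foreign means not associated with an ODF namespace."""
    return namespace_of(name) not in ODF_NAMESPACES


def foreign_elements(root):
    """Collect the qualified tags of all foreign elements, root included."""
    return [el.tag for el in root.iter() if is_foreign(el.tag)]


# The "ext" namespace is invented for this example.
SAMPLE = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" '
    'xmlns:ext="http://example.com/ns/ext">'
    "Hello <ext:note>vendor data</ext:note></text:p>"
)

found = foreign_elements(ET.fromstring(SAMPLE))
```

Note that in this sketch an element without any namespace is also
treated as foreign. Attributes could be checked the same way by passing
the keys of el.attrib to is_foreign.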

> 
> "If a foreign element has a  or  ancestor element and is a child elements 
> of an element which may include..."   Typo.  Should be "child element" 
> (singular).  Or do we really mean "descendant element"?

It is a typo. And "child element" is correct. The case this addresses is:

<text:p>text
   <text:frame>
     <text:image ...>
     <text:new-replacement>My new replacement</text:new-replacement>
   </text:frame>
</text:p>

In this case, "My new replacement" should not be displayed because it
belongs to a text frame (that is displayed outside the text flow) rather
than to the text of the paragraph.

With the current wording, it would not be displayed, because
<text:frame> does not allow text content. If we said "descendant",
then the text would be displayed because <text:p> allows text content.
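
The difference between the two wordings can be sketched in Python. This
is a simplified illustration, not the normative rule: the ALLOWS_TEXT
set (only <text:p> and <text:h>) is an assumption, the function names
are hypothetical, and the example has been made well-formed by closing
<text:image> and declaring the namespace.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Assumption for this sketch: only these elements allow text content.
ALLOWS_TEXT = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}

# The example from the mail, made well-formed. text:new-replacement
# stands in for the foreign element, as in the original example.
SAMPLE = f"""
<text:p xmlns:text="{TEXT_NS}">text
  <text:frame>
    <text:image/>
    <text:new-replacement>My new replacement</text:new-replacement>
  </text:frame>
</text:p>
""".strip()


def shown_with_child_rule(element, parents):
    """'Child' wording: only the direct parent decides."""
    parent = parents.get(element)
    return parent is not None and parent.tag in ALLOWS_TEXT


def shown_with_descendant_rule(element, parents):
    """'Descendant' wording: any ancestor that allows text is enough."""
    node = parents.get(element)
    while node is not None:
        if node.tag in ALLOWS_TEXT:
            return True
        node = parents.get(node)
    return False


root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
replacement = root.find(f".//{{{TEXT_NS}}}new-replacement")

child_rule = shown_with_child_rule(replacement, parents)
descendant_rule = shown_with_descendant_rule(replacement, parents)
```

With the "child" rule the text is suppressed, because the parent is
<text:frame>. With the "descendant" rule it would be shown, because
<text:p> is an ancestor.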

> 
> "For foreign elements that occur at other locations, conforming processors 
> should not process the element's content, but may only preserve its 
> content"
> 
> We have not defined "process".   Does it include "preserve its content"?

No, it means that the content should not be displayed, analyzed, or
otherwise handled by the application. I think the term "interpret" is
better here.

> 
> Also, why not make preservation of content of foreign elements be a 
> "should"? Is there any reason why we would not make that recommendation?

I assume you mean the case where the content of a foreign element should
be processed (or interpreted)? In that case, an application would
interpret the text content anyway, and there would not be any difference
from text that is contained in a paragraph but outside a foreign element.

> 
> In any case this "Document processing" section seems misplaced.  I wonder 
> if it fits better if put in the Generator or Processor conformance 
> section, since it is defining conformance.

Well, maybe. But this would make the conformance clauses even more
difficult to read. I will update the proposal based on your other 
suggestions, and we may see how it reads then.

> 
> So overall, I find this document processing area thorny and troublesome. I 
> would not be disappointed if it were entirely removed from the standard, 
> or moved to an informative annex on "How to extend ODF".  There is little 
> value to implementors or users in having a conformance class for documents 
> that are extended in nearly-arbitrary ways.  Nothing written here really 
> allows such extended documents to interoperate.  On the one hand, I don't 
> believe that there is anything intrinsically evil with extensions, but I 
> don't think that we need to favor them with the label "loose conformance". 

We have to distinguish between the processing rules and the "loose
conformance" class. The processing rules are something we should have.
The "loose conformance" class may be something we don't need.

>  
> 
> Do we know what implementations today use foreign elements and attributes 
> according to this section?  Is the usage widespread?
> 
> "Conforming OpenDocument Documents" -- this is defined currently as a 
> condition, if/then.  But I think it should be stated as a requirement: "A 
> conforming OpenDocument document shall adhere to the specification 
> described in this document...".  Similarly, "An OpenDocument document 
> which is also an OpenDocument Package shall...". 
> 
> Generally, we should use "conform" rather than "adhere" whenever possible.

I have used the SVG conformance definition as the basis here, but we
may reword this.

> 
> Do we want to require a specific XML character encoding? 

XML requires every XML processor to support UTF-8 and UTF-16. I would
not go beyond this.
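
To illustrate, here is a small Python sketch that reads the same tiny
document in both encodings with the standard library parser. The
document content and the encode_document helper are made up for the
example, and nothing here is specific to ODF.

```python
import xml.etree.ElementTree as ET

TEXT = "Grüße aus Hamburg"


def encode_document(text, encoding):
    """Serialize a tiny document with an XML declaration in the given encoding."""
    xml = f'<?xml version="1.0" encoding="{encoding}"?><p>{text}</p>'
    return xml.encode(encoding)


# Python's "utf-16" codec writes a byte order mark, which XML expects
# for UTF-16 encoded entities.
decoded = {
    enc: ET.fromstring(encode_document(TEXT, enc)).text
    for enc in ("UTF-8", "UTF-16")
}
```

Both entries of decoded should hold the original text, so a consumer
that relies on a general XML parser gets both encodings without any
additional requirement from our side.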

> 
> 
> 2.1.3 -- "If the XML root is.... then it  shall be valid with respect to 
> the strict schema defined by this specification."  What is "it"?  "XML 
> root" doesn't make sense?   "Sub document" maybe?  A similar question in 
> 2.1.4.

A link to the schemas is indeed missing.

"XML root" should actually read "XML root element".

> 
> "Conforming OpenDocument Generators" has the first conformance requirement 
> stated as "It shall not  create any non-conforming OpenDocument document 
> of any kind."  But this, stated as a negative, is untestable.  How can an 
> implementation prove that it is incapable of producing a non-conforming 
> ODF document?  What, for example, if the power goes out when in the middle 
> of saving a document?  Would that render the entire application 
> non-conforming?

That is a good point. Again, I adopted that from the SVG conformance
clauses. The problem is that if we don't have this, then an application
that creates one conforming ODF document and otherwise only documents
that contain extensions may call itself conformant, too.

> 
> I'd state this requirement more simply (and more testable) as "It shall 
> produce documents which conform to this standard".
> 
> 
> We might factor out the common requirements between normal and loose 
> conformance rather than repeating material.  For example, we are currently 
> stating the documentation requirement twice.

I noticed this when adding the documentation requirement, too, but
repeated the text because I would otherwise have had to change the
structure of the clauses.

> 
> My preference would be to make the document requirement be a "shall" 
> rather than a "should".  I think we're giving implementors a lot of rope 
> to play with by allowing a loose conformance class, a significant ability 
> to extend the standard in incompatible and proprietary way.  As stated 
> before, I'd be happy if this ability were removed altogether.  But if we 
> do allow it the label of "conforming" than I think we should require 
> documentation of the extensions, not merely recommend it.

If we keep the loose conformance, then I have no objection to turning
this into a shall.

> 
> "Conforming OpenDocument Processors" -- "process" is not defined.

We should use "interpreter" here, too.

> "It  be able to parse and process OpenDocument documents of one or more of 
> the defined document types (defined by their MIME types) any of which are 
> represented in packages."   I don't think we need to mention MIME types 
> here.  We can just say "able to parse and process OpenDocument packages of 
> one or more of the document types defined by this standard".  Better even 
> if we can give a section reference.

Yes, this sounds reasonable.

> Why is the single XML version a "may" rather than at least a "should"?  It 
> is odd to have the single file version be present (and not deprecated) if 
> we do not feel it warrants more than a "may" for support.

If we make this a should, then applications should actually implement
both variants. That takes some effort, and I think for most
applications this is not reasonable. On the other hand, there are
situations where providing a single XML file is reasonable, for instance
if it is to serve as the basis for an XSLT transformation.
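
To give an idea of what supporting both variants means for a reader,
here is a minimal Python sketch that tells the two forms apart. The
document_form helper and its return values are hypothetical, the check
for content.xml is a simplification, and the two in-memory samples are
not complete ODF documents.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def document_form(data):
    """Classify raw bytes as 'package', 'single-xml' or 'unknown'."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            # Simplification: a real reader would also check the manifest.
            if "content.xml" in package.namelist():
                return "package"
        return "unknown"
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "unknown"
    if root.tag == f"{{{OFFICE_NS}}}document":
        return "single-xml"
    return "unknown"


# Two tiny in-memory samples; neither is a complete, valid ODF document.
single = f'<office:document xmlns:office="{OFFICE_NS}"/>'.encode("utf-8")

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
    package.writestr(
        "content.xml",
        f'<office:document-content xmlns:office="{OFFICE_NS}"/>',
    )
packaged = buffer.getvalue()

forms = (document_form(single), document_form(packaged))
```

Telling the forms apart is the easy part. The effort I mean is in the
two separate loading and saving paths that follow from it.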

Best regards

Michael
> 
> Regards,
> 
> -Rob


-- 
Michael Brauer, Technical Architect Software Engineering
StarOffice/OpenOffice.org
Sun Microsystems GmbH             Nagelsweg 55
D-20097 Hamburg, Germany          michael.brauer@sun.com
http://sun.com/staroffice         +49 40 23646 500
http://blogs.sun.com/GullFOSS
