OASIS Mailing List Archives

Subject: RE: [office] Conformance Clause proposal, Version 8


"Dennis E. Hamilton" <dennis.hamilton@acm.org> wrote on 02/05/2009 
08:22:21 PM:

> 
> Your particular examples here all strike me as having to do with the current
> problematic statements about "preservation" of an unsupported/unknown
> foreign element-attribute-value.  My presumption is that anything that
> involves not knowing what to do with the foreigner under an operation means
> the extension has to go away (e.g., begin and end tags disappear and the
> content that is left is treated the same way by recursion).  Copy and paste
> and not knowing if an attribute value is of type ID (or IDREF) is a great
> example.  There is also the lifting out of an xmlns scope to be dealt with
> under movement and under removal of unknown surrounds.
> 
> I'm all for not preserving stuff that is unsupported/unknown, and also not
> preserving it, even when supported, as part of the operation of a
> strictly-compliant producer.
> 

So if a consumer is just going to throw this information away, because it 
cannot understand the markup, nor even safely preserve it when the 
document is edited, then what purpose does it serve?  Certainly it can be 
useful if there is a private agreement between two parties on how that 
information should be processed.  But if it is private, then I don't think 
that has anything to do with a standard.  And if the parties want to make 
the interchange conventions be well-known, then they should be detailed in 
a standard.  I don't see any place for having "stuff" in ODF which the 
general ODF consumer can do nothing but toss out.


> With regard to ID type values, one would hope that xml:id is always used
> when reference by fragment identifier is required and that other
> interdependencies are handled by other than ID and IDREF type values.
> (Although not material here, I wonder if the practice of using local names
> id, ID, and maybe even ref came out of wanting to be able to know about the
> ID and IDREF and IDREFS types without being in possession of the schema or
> having access to even a DTD?)
> 

My mother used to say, "Wish in one hand and spit in the other and see 
which one has more".  Hoping that a producer will use xml:id is 
insufficient.  Hope has no place in a standard.  If conformance does not 
specifically require it, then an ODF consumer will necessarily assume that 
anything can and will happen in foreign markup, and the only thing they 
can then safely do at that time is strip it out.
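
To make the hazard concrete, here is a minimal Python sketch of what a schema-unaware consumer can and cannot see. It is an illustration under assumptions, not a description of any real ODF implementation: the namespace urn:example:foo, the xml:id value p1, and the helper names are all hypothetical. After a naive copy, the consumer can detect the collision on xml:id, because that attribute is recognizable without a schema. It has no way to tell whether the two identical my-attr values are also a collision.

```python
import copy
import xml.etree.ElementTree as ET

# xml:id is recognizable without any schema; ElementTree exposes it under
# the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical document: foo:bar and my-attr stand in for unknown foreign markup.
DOC = """<root xmlns:foo="urn:example:foo">
  <foo:bar my-attr="3221" xml:id="p1">Hello World</foo:bar>
</root>"""


def naive_duplicate(parent, elem):
    """Copy elem verbatim and append the copy, as a schema-unaware editor might."""
    clone = copy.deepcopy(elem)
    parent.append(clone)
    return clone


def xml_id_collisions(root):
    """Return the xml:id values that occur more than once in the tree."""
    seen, dupes = set(), set()
    for el in root.iter():
        value = el.get(XML_ID)
        if value is None:
            continue
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


root = ET.fromstring(DOC)
naive_duplicate(root, root[0])

# The consumer can detect this collision without a schema.
xml_id_dupes = xml_id_collisions(root)

# Both copies now carry my-attr="3221". Nothing in the parsed tree says
# whether my-attr is an ID (a collision) or a plain value (harmless).
my_attr_values = [el.get("my-attr") for el in root.iter() if el.get("my-attr")]

print(xml_id_dupes)
print(my_attr_values)
```

If xml:id were the only ID mechanism permitted in foreign markup, a check like xml_id_collisions would be enough to regenerate identifiers on copy. Without that requirement the consumer is guessing.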

> It is also not possible to know that there are other unknown dependencies
> between material that is touched in a foreign-erasing way and material in
> other parts of the XML document that are not being touched.  That the
> dependency will be broken strikes me as something that those who introduce
> foreign e-a-v had better be prepared for.  (My example about layout hints in
> my note to Bob is fraught with that.)

Of course, my issue is not so much with the producer of such content. They 
obviously know what the constraints are.  The issue is with any other 
consumer of the markup.

> 
> I think this is a great example of the care that is required for dealing
> with foreigners (we need an internationally-inoffensive term for this) that
> are not supported/understood, and implementers of foreigners need to be
> prepared to encounter the consequence of such action in documents they
> receive (back).
> 

Again my point is that there is NOTHING a consumer of ODF can do to handle 
the general case of foreign markup.  If we allow it in conformant 
documents, then ALL ODF consumers will either remove ALL foreign markup 
that they do not understand, or they will introduce corruption or data 
loss into such documents.  We can't just sit back and hope that ODF 
producers who extend ODF will get it "right" especially if we do not 
define in the standard what the requirements are for extending ODF safely. 
 

> I will agree that some arbitrary extensions will be stupid.  Anyone who
> introduces foreigners that are intended to be found in interchange needs a
> good understanding of the likely treatment and the consequent mangling that
> may be found in documents that have recognizable foreigners.
> 

Again, the point is not that "some" extensions will be bad, but that the 
only logical thing for ALL ODF consumers to do is to reject ALL documents 
with such extensions.  You need to "fail safe" in this case.  If you are 
not able to distinguish safe extensions from unsafe ones, then you will 
reject them all, or at least strip out all of the foreign markup.
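
As an illustration of the two fail-safe options, here is a minimal Python sketch using xml.dom.minidom. It rests on several assumptions: the set of "known" namespaces is cut down to two ODF namespaces for brevity, urn:example:foo is a hypothetical extension namespace, and the helper names are mine. has_foreign is the test a consumer would use to reject a document outright. unwrap_foreign is the stripping approach, where the foreign begin and end tags disappear and the remaining content is processed the same way by recursion. The sketch handles foreign elements only, not foreign attributes on known elements, and it assumes all namespace declarations sit on the root element so that removing a wrapper does not remove a declaration its content depends on.

```python
from xml.dom import minidom

# Assumption: a real consumer would list every namespace it supports.
KNOWN_NAMESPACES = {
    "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Hypothetical document wrapping an ODF paragraph in a foreign element.
DOC = (
    '<office:text'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' xmlns:foo="urn:example:foo">'
    '<foo:bar my-attr="3221">'
    '<text:p text:style-name="default">Hello World</text:p>'
    '</foo:bar>'
    '</office:text>'
)


def has_foreign(document, known):
    """True if any element in the document lies outside the known namespaces."""
    return any(el.namespaceURI not in known
               for el in document.getElementsByTagName("*"))


def unwrap_foreign(node, known):
    """Remove foreign elements below node, hoisting their content in place."""
    for child in list(node.childNodes):
        if child.nodeType != child.ELEMENT_NODE:
            continue
        unwrap_foreign(child, known)  # clean the descendants first
        if child.namespaceURI not in known:
            # Move the already-cleaned content up, then drop the wrapper.
            for grandchild in list(child.childNodes):
                node.insertBefore(grandchild, child)
            node.removeChild(child)


dom = minidom.parseString(DOC)
foreign_before = has_foreign(dom, KNOWN_NAMESPACES)
unwrap_foreign(dom.documentElement, KNOWN_NAMESPACES)
foreign_after = has_foreign(dom, KNOWN_NAMESPACES)
result = dom.documentElement.toxml()
print(result)
```

Either way, whatever the foo:bar wrapper and its my-attr value meant to the producer is lost, which is the data loss described above.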

-Rob

> I think your points about the hazards are all good ones and I think the
> limitations on "preservation" need to be established better.  I don't have a
> picture of how to handle that in the specification.  Maybe a technical note
> on limits of preservation for foreign elements-attributes-values would be
> better, especially since the limitations already apply to ODF 1.0/IS
> 26300/ODF 1.1 too.
> 
>  - Dennis
> 
> 
> -----Original Message-----
> From: robert_weir@us.ibm.com [mailto:robert_weir@us.ibm.com] 
> http://lists.oasis-open.org/archives/office/200902/msg00070.html
> Sent: Thursday, February 05, 2009 16:22
> To: office@lists.oasis-open.org
> Subject: RE: [office] Conformance Clause proposal, Version 8
> 
> Dennis,  my critique is a bit more subtle than that.  The point is that 
> extensions cannot be generically processed and that makes many common 
> editing and document manipulations unsafe.
> 
> Consider some basic ODF:
> 
> <text:p text:style-name="default">Hello World</text:p>
> 
> Now suppose I am an ODF processor, say an editor, and I want to split this
> into two.  Maybe the user wants to apply a different style to each word.
> Or I want to insert another word into this paragraph.  Doing all this is
> straightforward.
> 
> Now suppose instead I have this:
> 
> <foo:bar my-attr="3221"><text:p text:style-name="default">Hello World</text:p></foo:bar>
> 
> 
> What do I do, in an editor, if the user wants to remove one of those two
> words?  Or copy/paste both of them to another part of the document?  Or
> split these two words into two different documents, or combine two
> documents with similar extensions?  What should I do?
> 
> If the user does a copy/paste, do I end up with this?
> 
> <foo:bar my-attr="3221"><text:p text:style-name="default">Hello</text:p></foo:bar>
> .
> .
> .
> <foo:bar my-attr="3221"><text:p text:style-name="default">World</text:p></foo:bar>
> 
> That sounds reasonable, and is analogous to copying styled text.  But what
> if my-attr ended up being declared in the extension schema as an XML ID?
> I'd now have duplicate IDs.  Whoops.  Without knowing the extension, how
> do I know when to simply copy something versus when I need to generate a
> unique ID?  And that is the simple case.  More general cases will have
> multiple classes of reference semantics, possibly spanning multiple XML
> documents.
> 
> Without knowing the schema of the extensions, and the constraints
> expressed and implied, I am at a loss.  Are there reference semantics?  Is
> there an implied ordering?  Cardinality constraints?  I simply don't know.
> You can't do generic editing of an unknown schema in a word processor.
> It just doesn't work.
> 
> So what happens in practice is one of two things:
> 
> A) The document extensions are understood by only the application that 
> writes them, and other applications are forced to react by doing one of 
> two things:
> 
> 1) Automatically stripping out all extension markup just to be safe.  This
> obviously benefits no one, especially not the vendor trying to extend ODF,
> and certainly not the user who potentially suffers data loss.
> 
> 2) Screwing up the document by trying to preserve extensions in a naive
> way, but actually messing up the integrity of the document by not
> understanding fully the constraints of the extension's schema.  This also
> is a disservice to the user.
> 
> or B) The document extensions are well-documented by the application
> extending the format, and other vendors implement the additional
> constraints required.  But in this case, if vendors agree on the semantics
> of an extension, then shouldn't this just be made part of the standard?
> 
> That is why arbitrary extensions are evil.  They create in general an
> interoperability problem that defies solution.  That is why no ODF
> implementation today uses this feature.  And this is why I've recommended
> that it be removed from the standard.
> 
> -Rob
> 
> "Dennis E. Hamilton" <dennis.hamilton@acm.org> wrote on 02/05/2009 
> 06:10:08 PM:
> http://lists.oasis-open.org/archives/office/200902/msg00065.html
> [ ... ]
> 


