
Subject: RE: [office] Conformance Clause proposal, Version 8


Rob,

Your particular examples here all strike me as having to do with the current
problematic statements about "preservation" of an unsupported/unknown
foreign element-attribute-value.  My presumption is that anything that
involves not knowing what to do with the foreigner under an operation means
the extension has to go away (e.g., begin and end tags disappear and the
content that is left is treated the same way by recursion).  Copy and paste
and not knowing if an attribute value is of type ID (or IDREF) is a great
example.   There is also the lifting out of an xmlns scope to be dealt with
under movement and under removal of unknown surrounds.
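
As a rough illustration of that presumption, the following is a minimal
sketch, not anything the specification prescribes. It assumes Python's
xml.etree.ElementTree, a hypothetical KNOWN_NAMESPACES set standing in for
"what this processor supports", and a root element that is itself known.
Foreign begin and end tags are dropped, foreign attributes on known elements
are dropped, and the remaining content is handled the same way by recursion.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
# Assumption: the set of namespaces this hypothetical processor understands.
KNOWN_NAMESPACES = {TEXT_NS}


def namespace_of(name):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def _append_text(parent, text):
    """Attach character data at the current end of parent's content."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _fill(target, source, known):
    """Copy source's content into target, unwrapping foreign elements."""
    _append_text(target, source.text)
    for child in source:
        if namespace_of(child.tag) in known:
            target.append(unwrap_foreign(child, known))
        else:
            # Foreign element: its tags disappear, its content is kept
            # and processed the same way.
            _fill(target, child, known)
        _append_text(target, child.tail)


def unwrap_foreign(elem, known=KNOWN_NAMESPACES):
    """Return a copy of a known element with foreign markup removed."""
    result = ET.Element(elem.tag)
    for name, value in elem.attrib.items():
        ns = namespace_of(name)
        if ns == "" or ns in known:  # keep unqualified and known attributes
            result.set(name, value)
    _fill(result, elem, known)
    return result
```

Because ElementTree resolves prefixes to namespace URIs while parsing, this
sketch sidesteps the xmlns-scope question; a processor working on raw markup
would also have to carry the needed declarations along when lifting content
out of a removed element.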

I'm all for not preserving stuff that is unsupported/unknown, and also not
preserving it, even when supported, as part of the operation of a
strictly-compliant producer.

With regard to ID type values, one would hope that xml:id is always used
when reference by fragment identifier is required and that other
interdependencies are handled by other than ID and IDREF type values.
(Although not material here, I wonder if the practice of using local names
id, ID, and maybe even ref came out of wanting to be able to know about the
ID and IDREF and IDREFS types without being in possession of the schema or
having access to even a DTD?)

It is also not possible to know that there are other unknown dependencies
between material that is touched in a foreign-erasing way and material in
other parts of the XML document that are not being touched.  That the
dependency will be broken strikes me as something that those who introduce
foreign e-a-v had better be prepared for.  (My example about layout hints in
my note to Bob is fraught with that.)

I think this is a great example of the care that is required for dealing
with foreigners (we need an internationally-inoffensive term for this) that
are not supported/understood, and implementers of foreigners need to be
prepared to encounter the consequence of such action in documents they
receive (back).

I will agree that some arbitrary extensions will be stupid.  Anyone who
introduces foreigners that are intended to be found in interchange needs a
good understanding of the likely treatment and the consequent mangling that
may be found in documents that have recognizable foreigners.

I think your points about the hazards are all good ones and I think the
limitations on "preservation" need to be established better.  I don't have a
picture of how to handle that in the specification.  Maybe a technical note
on limits of preservation for foreign elements-attributes-values would be
better, especially since the limitations already apply to ODF 1.0/IS
26300/ODF 1.1 too.

 - Dennis


-----Original Message-----
From: robert_weir@us.ibm.com [mailto:robert_weir@us.ibm.com] 
http://lists.oasis-open.org/archives/office/200902/msg00070.html
Sent: Thursday, February 05, 2009 16:22
To: office@lists.oasis-open.org
Subject: RE: [office] Conformance Clause proposal, Version 8

Dennis,  my critique is a bit more subtle than that.  The point is that 
extensions cannot be generically processed and that makes many common 
editing and document manipulations unsafe.

Consider some basic ODF:

<text:p text:style-name="default">Hello World</text:p>

Now suppose I am an ODF processor, say an editor, and I want to split this 
into two.  Maybe the user wants to apply a different style to each word. 
Or I want to insert another word into this paragraph.  Doing all this is 
straightforward. 

Now suppose instead I have this:

<foo:bar my-attr="3221"><text:p text:style-name="default">Hello 
World</text:p></foo:bar>


What do I do, in an editor, if the user wants to remove one of those two 
words?  Or copy/paste both of them to another part of the document? Or 
split these two words into two different documents, or combine two 
documents with similar extensions?  What should I do? 

If the user does a copy/paste, do I end up with this?

<foo:bar my-attr="3221"><text:p 
text:style-name="default">Hello</text:p></foo:bar>
.
.
.
<foo:bar my-attr="3221"><text:p 
text:style-name="default">World</text:p></foo:bar>

That sounds reasonable, and is analogous to copying styled text.  But what 
if my-attr ended up being declared in the extension schema as an XML ID? 
I'd now have duplicate IDs.  Whoops.  Without knowing the extension, how 
do I know when to simply copy something versus when I need to generate a 
unique ID?  And that is the simple case.  More general cases will have 
multiple classes of reference semantics, possibly spanning multiple XML 
documents.
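
The duplicate-ID hazard can be made concrete with a small sketch. It assumes 
Python's xml.etree.ElementTree, a hypothetical namespace URI 
(urn:example:foo) for the foo prefix, and a hypothetical helper, 
duplicate_ids, that is told which attributes are ID-typed.  That last input 
is exactly the knowledge a generic processor does not have.

```python
import copy
import xml.etree.ElementTree as ET
from collections import Counter

# The foo namespace URI is made up for illustration.
SOURCE = (
    '<text:section'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' xmlns:foo="urn:example:foo">'
    '<foo:bar my-attr="3221">'
    '<text:p text:style-name="default">Hello World</text:p>'
    '</foo:bar>'
    '</text:section>'
)


def naive_copy_paste(root):
    """Append a verbatim deep copy of the first child, as a naive editor might."""
    root.append(copy.deepcopy(root[0]))
    return root


def duplicate_ids(root, id_attributes):
    """Return values that occur more than once among the given ID-typed attributes."""
    counts = Counter(
        elem.get(name)
        for elem in root.iter()
        for name in id_attributes
        if elem.get(name) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


root = naive_copy_paste(ET.fromstring(SOURCE))

# With schema knowledge (my-attr is an ID), the collision is detectable.
with_schema = duplicate_ids(root, {"my-attr"})

# Without it, the processor has nothing to check and sees no problem.
without_schema = duplicate_ids(root, set())
```

Here with_schema reports the collision on "3221", while without_schema is 
empty: the copy looks perfectly valid to a processor that does not know the 
extension schema.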

Without knowing the schema of the extensions, and the constraints 
expressed and implied, I am at a loss.  Are there reference semantics?  Is 
there an implied ordering?  Cardinality constraints?  I simply don't know. 
  You can't do generic editing of an unknown schema in a word processor. 
It just doesn't work.

So what happens in practice is one of two things:

A) The document extensions are understood by only the application that 
writes them, and other applications are forced to react by doing one of 
two things:

1) Automatically stripping out all extension markup just to be safe.  This 
obviously benefits no one, especially not the vendor trying to extend ODF, 
and certainly not the user who potentially suffers data loss.

2) Screwing up the document by trying to preserve extensions in a naive 
way, but actually messing up the integrity of the document by not 
understanding fully the constraints of the extensions schema.  This also 
is a disservice to the user.

or B) The document extensions are well-documented by the application 
extending the format, and other vendors implement the additional 
constraints required.  But in this case, if vendors agree on the semantics 
of an extension, then shouldn't this just be made part of the standard?

That is why arbitrary extensions are evil.  They create in general an 
interoperability problem that defies solution. That is why no ODF 
implementation today uses this feature.  And this is why I've recommended 
that it be removed from the standard.

-Rob

"Dennis E. Hamilton" <dennis.hamilton@acm.org> wrote on 02/05/2009 
06:10:08 PM:
http://lists.oasis-open.org/archives/office/200902/msg00065.html
[ ... ]


