
Subject: Re: [office] The Rule of Least Power
From: robert_weir@us.ibm.com
To: office@lists.oasis-open.org
Date: Thu, 12 Feb 2009 11:39:12 -0500
Thomas Zander <T.Zander@nokia.com> wrote on 02/12/2009 09:24:36 AM:

> 
> On Thursday 12. February 2009 13:53:02 ext robert_weir@us.ibm.com wrote:
> > > That mail just posts some anonymous element; that's not a use case.
> > > I can't even argue that the "foo:bar" is or is not a loss if an
> > > implementation ignores it, since you don't give a realistic use case
> > > to argue from.
> >
> > But that's the important point.  From the perspective of a general ODF
> > consumer, say a word processor, these private extensions are opaque,
> > without any discernible semantics, just like foo:bar.
> 
> Yes, that's the point of adding data that is not in the ODF spec. It's
> private data required by the implementation to not lose information.
> 
> Can you give me any use case where any type of extension would be
> useful to any implementation that is not the one that wrote it?
> Or, in other words: what advantage does it give us to move this private
> data to pre-defined extension positions?
> 

Certainly if we define extensions in a way that consuming applications can 
do nothing but throw out the extension data, then the use of extensions 
will be limited to rendering hints and other low-value, throw-away uses.

But what we lose in this case is the ability to have extension data that 
can be round-tripped, i.e., is preserved in a document even when the 
document is edited and re-saved.  If the extension data has any value to 
the author at all, I expect that he would express at least some 
dissatisfaction if such extension data is automatically lost if the 
document is edited in another application. 
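
To make the round-trip concern concrete, here is a minimal Python sketch of 
a consumer that loads a paragraph, applies an edit, and saves it again, 
either discarding or preserving foreign attributes.  The namespace URIs, the 
foo:bar attribute value, and the function names are assumptions made up for 
illustration; they are not taken from ODF or from any real implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, for illustration only.
TEXT_NS = "urn:example:text"
FOO_NS = "urn:example:foo"

DOC = (
    f'<text:p xmlns:text="{TEXT_NS}" xmlns:foo="{FOO_NS}" '
    f'foo:bar="42">Hello</text:p>'
)

def is_foreign(name):
    """True for a namespaced name outside the consumer's own vocabulary."""
    return name.startswith("{") and not name.startswith("{" + TEXT_NS + "}")

def edit_and_save(xml_text, preserve_foreign):
    """Load a paragraph, change its text, and serialize it again."""
    root = ET.fromstring(xml_text)
    root.text = "Hello, world"  # stands in for the user's edit
    if not preserve_foreign:
        # A consumer that can "do nothing but throw out" extension data.
        for name in list(root.attrib):
            if is_foreign(name):
                del root.attrib[name]
    return ET.tostring(root, encoding="unicode")

lossy = edit_and_save(DOC, preserve_foreign=False)  # foo:bar is gone
kept = edit_and_save(DOC, preserve_foreign=True)    # foo:bar survives the edit
```

The sketch only handles attributes on a single element, which is enough to 
show the difference the author of the extension data would notice.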


> Do note that we already *have* pre-defined extension positions,
> specifically metadata. So this is not about adding some metadata that I'd
> like to survive a simple load/save by a random ODF implementation. That's
> a solved problem.
> 
> > > I'd rather call a real use case where things go really wrong 'proof' ;)
> >
> > The entire point of my criticism is that the consumer of an extended
> > document just sees arbitrary XML.  It has no knowledge of use cases, of
> > what that extension is for.  It just sees foo:bar.  It is a black box.
> 
> Yes, and there is nothing you can do to open that black box. If KOffice
> or Qt's ODFWriter decides it needs to store some data to make saving and
> loading to its native format not lose info, and that information doesn't
> fit in any ODF tag, then you get a black box.

In some sense even a "black box" has a degree of interoperability, since I 
know how to copy and move a box.  It is as interoperable as any box.  But 
what we have with general foreign markup extensions is more like a "black 
haze".  The consumer doesn't know how to interpret it, which is fine, but 
it also won't know how to copy it, move it, etc.

> That's a fact of life. Arguing that's a bad idea is equivalent to saying
> we need to document each and every possible piece of data in the ODF
> specification. And I think we don't want that. Do we?
> 

I think of it this way.  We're not putting binary blob extensions into the 
document, arbitrary bits in arbitrary places.  We use XML, and XML defines 
the basic level of syntax and structure for the extensions.  If we require 
a schema definition for the extension data, then additional constraints 
related to data type, ranges, cardinality, etc. are defined.  But in an 
editing environment, we have additional concerns that are not covered by 
RNG or XML Schema, things like whether the data is volatile or not, 
whether it can be copied, whether data can be changed without invalidating 
the extension data, etc.
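
A small Python sketch of the gap between schema-level constraints and 
editing concerns.  The foo:word-count attribute, its namespace, and both 
check functions are hypothetical, invented here for illustration.  The 
first check stands in for a data type and range constraint of the kind a 
schema can express; the second is a consistency condition that no schema 
language states.

```python
import xml.etree.ElementTree as ET

# Hypothetical extension: a cached word count stored on a paragraph.
FOO_NS = "urn:example:foo"
WORD_COUNT = "{" + FOO_NS + "}word-count"

def passes_schema_style_check(element):
    """Type/range check: the value must be a non-negative integer."""
    return element.get(WORD_COUNT, "").isdigit()

def is_consistent(element):
    """Editing concern: does the cached count still match the content?"""
    return int(element.get(WORD_COUNT)) == len((element.text or "").split())

para = ET.fromstring(
    f'<p xmlns:foo="{FOO_NS}" foo:word-count="2">Hello world</p>'
)
before = (passes_schema_style_check(para), is_consistent(para))

# Another application edits the text but preserves the attribute untouched.
para.text = "Hello wide world"
after = (passes_schema_style_check(para), is_consistent(para))
```

After the edit the attribute is still valid by type and range, yet it no 
longer describes the content.  Whether a consumer may keep such a value is 
exactly the kind of information the extension would have to declare.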

A possible solution:

1) If we allow general, arbitrary extensions, then we need a vocabulary 
for allowing the extensions to declare how the data should be processed, 
specifically whether it should be preserved under copies, edits, etc. 
Maybe the default is that it is never preserved until declared otherwise.

2) If we find that there are a small number of patterns of extension, then 
we could make a conformance statement specifically for them.  For example, 
I bet a large number of extensions will fit into the pattern of simple 
namespaced attribute/value pairs which decorate content elements and 
travel with the underlying content when copied or moved, and which may 
safely be preserved when the underlying content is edited.
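
The two options above can be sketched together in Python.  This is only one 
possible shape, under stated assumptions: the policy namespace, the 
"preserve" attribute with its space-separated list of operations, and every 
function name below are hypothetical and appear in no ODF draft.  The 
default follows option 1: foreign attributes are dropped unless the 
extension declares otherwise.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical names; nothing here is defined by ODF.
POLICY_NS = "urn:example:ext-policy"
PRESERVE_ATTR = "{" + POLICY_NS + "}preserve"  # e.g. "copy" or "copy edit"
KNOWN_NS = {"urn:example:text", POLICY_NS}

def namespace_of(name):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""

def surviving_attributes(element, operation):
    """Attributes to keep on `element` after `operation` ('copy' or 'edit')."""
    declared = set(element.get(PRESERVE_ATTR, "").split())
    kept = {}
    for name, value in element.attrib.items():
        ns = namespace_of(name)
        foreign = ns != "" and ns not in KNOWN_NS
        # Known attributes always survive; foreign ones only if declared.
        if not foreign or operation in declared:
            kept[name] = value
    return kept

def copy_element(element):
    """Copy an element, dropping foreign attributes not declared copy-safe."""
    clone = copy.deepcopy(element)
    clone.attrib.clear()
    clone.attrib.update(surviving_attributes(element, "copy"))
    return clone

NAMESPACES = (
    'xmlns:text="urn:example:text" xmlns:foo="urn:example:foo" '
    'xmlns:pol="urn:example:ext-policy"'
)
DECLARED = ET.fromstring(
    f'<text:p {NAMESPACES} foo:bar="42" pol:preserve="copy">Hi</text:p>'
)
UNDECLARED = ET.fromstring(f'<text:p {NAMESPACES} foo:bar="42">Hi</text:p>')

copied = copy_element(DECLARED)                      # foo:bar travels along
after_edit = surviving_attributes(DECLARED, "edit")  # foo:bar is dropped
```

The sketch applies one declaration to all foreign attributes on an element 
and ignores child elements; a real vocabulary would need to settle both 
points.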

In the end, I want to avoid extension soup.  I acknowledge that there are 
good and useful extensions, but in practice they will not be all that 
useful to users if all applications just throw the extension data out. 
I'd like to see if we can standardize some basic "rules of the road" so 
they can be preserved when edited.

-Rob