Re: [office] Preservation question

On 7/3/07, Patrick Durusau <patrick@durusau.net> wrote:

Marbux,

marbux wrote:
>
>
> So thus far, we have "SHALL preserve unless destroyed through
> user-initiated action." Any other use cases that require exceptions?
>
And how do you define "user-initiated action?" Does changing the content
of the element count?

If the user makes the choice to do so, yes. If the program makes the choice to do so, no.

What if I delete the content of a paragraph and
insert an image, all without disturbing the xml:id? Does saving the file
count as "user-initiated action?"

No, assuming the user has not deleted the relevant element in the process, which is what I think you're getting at.
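
To make that case concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is only an illustration under stated assumptions: the element names (body, p, image), the href value, and the helper functions are hypothetical stand-ins, not real ODF markup or any implementation's code. It shows a user-initiated content change followed by a save, with the xml:id surviving because the element itself was never deleted.

```python
import xml.etree.ElementTree as ET

# Clark-notation name that ElementTree uses for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, simplified document; real ODF uses namespaced elements.
DOC = '<body><p xml:id="para1">Original text</p></body>'


def replace_content_with_image(paragraph, href):
    """Simulate the user deleting a paragraph's content and inserting an image.

    Only the content changes. The paragraph element, and therefore its
    attributes (including xml:id), is left in place.
    """
    paragraph.text = None
    for child in list(paragraph):
        paragraph.remove(child)
    ET.SubElement(paragraph, "image", {"href": href})


def save_and_reload(root):
    """Stand-in for saving the file and opening it again."""
    return ET.fromstring(ET.tostring(root, encoding="unicode"))


root = ET.fromstring(DOC)
replace_content_with_image(root.find("p"), "Pictures/figure.png")
reloaded = save_and_reload(root)

# Collect every xml:id that survived the edit and the save.
ids = [el.get(XML_ID) for el in reloaded.iter() if el.get(XML_ID)]
print(ids)
```

Under this reading, the identifier would disappear only if the user removed the p element itself, which is the one user-initiated action the "unless" clause is meant to cover.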

Remembering that some implementations
don't use ODF as a processing format but only for interchange.

Or were you planning on requiring the use of ODF as a processing format?
Now that would be a radical change in the current standard.

No.

All the current standard requires is that an application use the
relationships and behaviors as defined by the standard for a document
that it encounters and when it writes out a document that it use those
relationships and behaviors to construct its files. What happens in
between is not specified.
>

Thank you for conceding that the standard does define application behavior. Now what application behavior do we need to require to achieve round-trip interoperability, e.g., between Google Docs and OOo?

>     The problem with specifying preservation in general is that it
>     means you
>     have to define what that means for processing models, which isn't
>     something that ODF has ever done. It defines a document format,
>     not how
>     you process it.
>

But you just said we do define application behavior. Am I wrong that application behavior is accomplished by its processor?

>
> I think that's an overstatement. E.g., the lists amendment seems to
> give fairly explicit processing instructions. But even were I wrong
> about that, the fact that something hasn't been done before is not a
> particularly strong argument for not doing it when needed. If it were,
> we'd still be living in trees and swinging on vines. :-)
>
>     Noting that defining an environment in which an editor exists is
>     fundamentally different from that of a regular XML processor. A
>     schema
>     or DTD can define xml:ids and it is straightforward to give the rules
>     for xml parser because it is never going to change or edit the text.
>     That is not the case with an editor. The reason why editors exist
>     is to
>     change the text and to do so, they have to have a certain amount of
>     flexibility.
>
>
> Agreed. But the question at hand is not whether they should have
> flexibility but how much?
>
>
>     If you want to offer some constructive suggestions on how to deal with
>     this issue I am sure everyone would be interested. However, jumping up
>     and down and saying that xml:ids must be preserved isn't helpful. It
>     really doesn't matter how much you want that to be the case if you
>     can't
>     offer any reasonable way for a standard to require it.
>
>
> I've been offering suggestions. E.g., why won't "SHALL ... unless"
> work? Or the approach of hanging <preserve> attributes on those that
> require preservation for interoperability purposes?
>
Simply saying "shall" is not a suggestion. As I tried to illustrate
above, the question is more complex than you seem to realize.
>

Of course it is complex. But is complexity a sufficient reason for not fulfilling a market requirement?

>     I would certainly prefer the preservation option but at this point I
>     can't see any useful way to make that a requirement without
>     specifying
>     how ODF processors must work. And so far as I can tell, that is
>     not the
>     purpose of the ODF standard.
>
>
> So no way to require round trip interoperability, eh?
>
There is round trip interoperability for matters specified by ODF. That
is paragraphs, styles, etc.

That is if you construct an ODF document as specified by the ODF
standard, it will be read by another ODF application of similar
capabilities.

Only by 1:1 feature match and perfect mapping, eh? Then I presume you would support declared interoperability subsets and compatibility modes in implementing applications. Or are you simply indifferent to the inability of less featureful apps to round-trip documents with the more featureful apps?

What you seem to want is interoperability using arbitrary content
between two or more applications. So far as I know, that has never been
a goal of ODF, nor do I think it is possible.
>

Slightly different. I want *round-trip* interoperability and am far less concerned with whether documents render the same at both ends. I want the less featureful apps to be able to round trip documents with the more featureful apps and I want ODF applications to be able to participate in Microsoft-bound business processes with high fidelity conversions. See e.g., <http://en.wikipedia.org/wiki/Business_process_interoperability>. I also want a relatively painless, non-lossy migration path from MS Office and other nonconformant systems to ODF applications.

So, you would force all ODF based applications to preserve unknown and
unknowable content in the name of interoperability?

Precisely.

If you send your file to my ODF application which doesn't understand
your content, how is that interoperability? My application can't read
it, can't change it, etc. How is that interoperability?
>

Why can't your application read it or change it? It may not be able to render all formatting, but the critical factor is that when your revised document is sent back to my application, the metadata and processing instructions my app needs are still there.

For example, WordPerfect uses something very like the foreign elements and attributes, called <unknown> tags, to maintain backward **and forward** compatibility between all WP versions from 6.x to the current version, WP 12. The WPD format has become much more featureful over that period. But when a WPD generated in WP 12 is opened in WordPerfect 6, all unrecognized markup is enclosed in <unknown> tags and any content enclosed by the unrecognized markup is rendered as text. After the document is edited in WordPerfect 6 and reopened in, e.g., WP 9, the processor examines the markup enclosed by <unknown> tags and removes the <unknown> tags surrounding all markup that it recognizes, so that the document can be rendered as closely as possible to the formatting originally inserted by WP 12. And if the document wends its way back to WP 12, all original formatting will be intact, other than that which was removed by edits in the other apps.

And I have a lot of personal experience with the WordPerfect <unknown> tags, batting legal documents back and forth with people who ran different versions of WordPerfect. I never encountered a problem with them.

That is what the foreign elements and attributes are all about: round-tripping documents even if one app doesn't recognize the markup inserted by another app. It isn't rocket science, as they say. It's a well understood method for achieving round-trip interoperability among apps with varying feature sets.

Applications should not be allowed to destroy markup they don't recognize. It destroys the ability to round-trip documents.
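
A rough sketch of the difference, again in Python with xml.etree.ElementTree. Everything named here is an assumption made for illustration: the urn:example:vendor namespace, the ext:note element, the KNOWN_TAGS feature set, and both editing functions are hypothetical, and this is not how WordPerfect or any ODF application is actually implemented. One editor changes the text it understands and leaves foreign markup alone; the other flattens anything it does not recognize.

```python
import xml.etree.ElementTree as ET

# Keep the hypothetical vendor prefix readable when serializing.
ET.register_namespace("ext", "urn:example:vendor")

# Hypothetical feature set of the less featureful application.
KNOWN_TAGS = {"doc", "p"}

SOURCE = (
    '<doc><p>Draft <ext:note xmlns:ext="urn:example:vendor" author="suzy">'
    "clause</ext:note> text</p></doc>"
)


def edit_preserving(xml_text, old, new):
    """Edit text in recognized elements; leave unrecognized markup in place."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el.tag in KNOWN_TAGS and el.text:
            el.text = el.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")


def strip_unknown(parent):
    """Replace each unrecognized child with its plain text, in place."""
    prev = None
    for child in list(parent):
        if child.tag in KNOWN_TAGS:
            strip_unknown(child)
            prev = child
            continue
        kept = "".join(child.itertext()) + (child.tail or "")
        if prev is None:
            parent.text = (parent.text or "") + kept
        else:
            prev.tail = (prev.tail or "") + kept
        parent.remove(child)


def edit_destructive(xml_text, old, new):
    """Same edit, but unrecognized markup is discarded before saving."""
    root = ET.fromstring(xml_text)
    strip_unknown(root)
    for el in root.iter():
        if el.text:
            el.text = el.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")


def count_foreign(xml_text):
    """Count elements outside the simple application's feature set."""
    return sum(1 for el in ET.fromstring(xml_text).iter() if el.tag not in KNOWN_TAGS)


preserved = edit_preserving(SOURCE, "Draft", "Final")
destroyed = edit_destructive(SOURCE, "Draft", "Final")
print(count_foreign(preserved), count_foreign(destroyed))
```

Both results read the same to the user of the simpler application, but only the first still carries the markup and author attribute that the originating application would need when the document comes back.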

RFC 2119 does not appear in the current version of ODF so is really
irrelevant to this discussion.
>

That doesn't mean that its concepts could not be resurrected.

> Does any of that help?
>
>
> I think it clarifies that you believe there is no way to *require*
> application interoperability in the ODF specification without
> specifying how ODF processors must work, that you oppose specifying
> how ODF processors must work for interoperability purposes, and that
> you oppose the standard including any mandatory tools designed for
> *any* conversion strategy. Is that a fair summing up of your position?
>
Let me say it back to you:

It is not possible to have interoperability based upon the premise that
all ODF applications must accept unknown content. To use unknown
content, ODF would have to specify how to handle unknown content, since
unknown content that simply lies there really isn't useful to the
application.

That it isn't useful to one application does not mean that it is not useful to the next application in the processing chain.

And no, standards never include mandatory tools, and
particularly not ones that have failed in the marketplace. FYI, the notion
of a tool designed for *any* conversion strategy is simply a fantasy.
There are pattern matching languages that in theory can convert from any
arbitrary format to another but those aren't tools in the traditional
sense of the word.

See the WordPerfect example described above.

> In the months I have participated on this TC, I have seen three
> methods of achieving round-trip interoperability discussed:
>
> * 1:1 feature match between applications with accurate mapping between
> them.
> * Declared interoperability subsets with implementing compatibility
> modes in applications (a variant of the above method).
> * Preservation of meta information required for interoperability by
> applications whether they support features or not.
>
> By default, we have only the first method, which means in practical
> terms that we have in reality a de facto standard for the most
> featureful application disguised as a de jure standard. Long Live King
> Michael. ODF is little different from OOXML in that regard.
>
Wait! You forgot the fourth option offered by the Spoiler Party:

* Preserve foreign elements and attributes for the use of third-party
mediator applications to map between formats.

Nope. That's just a variant of the third method I described above.

Which is the reason for wanting the "shall" on preserving foreign
elements that are not used by any defined application. To support the
Spoiler Party mediator.

> So here is a use case for you to solve:
>
> Sally Secretary uses Brand X ODF editor. She receives a file from Suzy
> Supervisor who uses Brand Y ODF editor. Sally wants to make some edits
> and send it to lawyer Sleazy Sam, who uses Brand Z ODF editor. Sam
> wants to make some edits and send the document to Suzy for printing
> out and signing. All three don't want to have to worry about whether
> data will be lost during the three-way round trip.
>
> What is *your* solution for Sally, Suzy, and Sam?

Use ODF. Assuming that the applications implement the same features.

How realistic is that solution? What it means in practical terms is that everyone has to use StarOffice/OOo until two more apps achieve feature parity with them. I wouldn't call that a standard designed for interoperability, as ECIS/Sun/IBM claim. "Open" and "interoperable" are not synonyms.

Noting that ODF does not require a particular set of features be
supported in order to allow for a variety of ODF based applications. But
that has nothing to do with preservation of foreign elements and
attributes or even xml:ids.

There's your blind spot.

I may have an application that doesn't support drawing, formulas, etc.,
but only adding metadata to a document. That can still be an ODF
conforming application. But I really should not be using it if someone
sends me a document to read and print.

I don't see the relevance.

The key phrase in the EU document that you overlooked was "must be
possible without technical barriers." That does not mean that all
applications have to implement the same set of features but merely that
there is no "technical barrier" to my having an application with those
abilities.

I did not overlook that phrase at all. If you allow destruction of unrecognized meta information and foreign elements and attributes, you create the technical barrier.

I think your blind spot, Patrick, is that you do not realize that round-trip interoperability without feature parity is feasible. You might consider checking in with Phil Boutros and asking him to explain what the foreign elements and attributes are for, and also asking for his opinion on applications that destroy them. Then you might get a glimmer of what is really at stake with the xml:id attribute preservation issue.
