
Subject: Re: [office] Formal Request: ODF 1.2 Document Processing Model Proposal
From: Patrick Durusau <patrick@durusau.net>
To: dennis.hamilton@acm.org
Date: Thu, 11 Dec 2008 15:56:02 -0500
Dennis,

Dennis E. Hamilton wrote:
> Patrick,
>
> There are already tacit indications of processing assumptions wherever the specification mentions user interaction or suggests behaviors (e.g., default properties recorded for use when a new table row is introduced).  The problem is that these are not grounded in anything.  (Maybe we should remove them.  That is valuable to discuss.)
>
>   
Ah, well, yes, they are grounded, just not in the explicit text of ODF 
1.2. ;-)

Sorry!

One term we could use would be "text model," since I suspect most of 
the semantics we define (at least in the presentation areas) are based 
on an implicit model of texts.

For example, take footnotes and endnotes. While it is true that we 
define styles that govern the placement of such notes, we are 
really relying upon an implicit notion of what we mean by "footnote" and 
"endnote."

Even though I can point to various attributes and styles that total up 
to a pretty fair definition of either one, such as a separator line, 
numbering that matches a location in the main text, location (bottom of 
page, end of section, end of document), etc., that is only because I 
know where to look and can piece together such a definition if pressed 
by someone who wants one.

Having said all that, I have been at the "text" (as distinguished from 
"document" modeling in the more limited markup sense) modeling game for 
a long time and while I can see real value in reaching common 
understandings of some terms and perhaps even defining a handful of 
them, I really don't think the pursuit of a solid model for texts is a 
useful enterprise. The more we press for something definite in one part, 
the more likely something will poke out on the other side, whether we 
see it or not.
> Processing Model might be the wrong term.  I was looking for a single noun phrase.  Offhand, I don't think I would be averse to Document Model and Semantics, but I am wary of confusion with other uses of Document Model in our field. That would be something to hash out.  
>
>   
My suggestion is "text model."
> However, it is clear that certain decorations that are provided in a document structure are specifically intended to guide particular kinds of processing behavior -- that is in their semantics and sometimes it is their only semantics.  This raises conformance issues and also implications of what kinds of processing are being presumed.  I think some sort of explicit treatment of that is called for.  I don't mean to presume that ODF should impose a processing model (or a DOM), but that classes of processing scenario might need to be recognized to identify what the particular markup is generally directed toward.  Some nomenclature normalization around this may also be essential for conformance definitions that follow the current OASIS model.
>
>   
Well, but I see conformance, particularly with semantics, as being 
incremental at best. There are some bright line areas, such as font 
size, which has a commonly accepted definition. There are a lot of gray 
areas as well.

Take our use of fo:widows for example. Sure, we can all count lines; that 
looks like a bright line test, doesn't it? But recall that we don't 
define a page geometry or algorithms for word spacing, line breaking, etc. 
My bright line just got a lot fuzzier.
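
A rough sketch of why the line count is not a bright line. It assumes a toy layout that is not part of ODF: greedy wrapping via Python's textwrap, a fixed number of lines per page, and a paragraph that starts at the top of a page. The function names and the sample text are hypothetical. The same paragraph and the same widows constraint pass or fail depending only on the line width the implementation happens to choose.

```python
import textwrap

# Hypothetical sample paragraph: 40 four-letter words.
TEXT = "word " * 40


def lines_on_last_page(text, width, lines_per_page):
    """Count the lines of a paragraph that land on its final page.

    Assumes greedy wrapping and a paragraph starting at the top of a page.
    """
    total = len(textwrap.wrap(text, width=width))
    if total == 0:
        return 0
    remainder = total % lines_per_page
    return remainder if remainder else lines_per_page


def satisfies_widows(text, width, lines_per_page, widows=2):
    """Check a widows constraint under the toy layout assumed above."""
    total = len(textwrap.wrap(text, width=width))
    if total <= lines_per_page:
        # The paragraph fits on one page, so nothing is left behind.
        return True
    return lines_on_last_page(text, width, lines_per_page) >= widows


if __name__ == "__main__":
    for width in (24, 19):
        print(width, satisfies_widows(TEXT, width, lines_per_page=7))
```

Two implementations that both "count lines" correctly can still disagree here, because neither the line width nor the breaking algorithm is defined for them.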

Realize that I don't disagree with making semantics more explicit. The 
only thing that has occupied more time than simply restructuring the 
text (no small task) has been to keep asking when the semantics were 
unclear to me. And to keep asking until I could understand what was 
being said. The Sun team and others have been very patient with my 
questions.

I suppose my caution is to start such a quest with the understanding 
that we really don't have a firm grasp on the semantics of texts and 
that the more we talk about what understandings we do have, the more 
precise they will become. But it is always an iterative process and not 
one that ever finishes. I would like to think that ODF 1.2 is going to 
be another step towards greater clarity in some semantics but realize 
that it will probably have made others worse. Perhaps greater clarity is 
too bold a claim; I would be happy with a different clarity! ;-)

> I am not sure what I am doing with regard to your (1-2) which is why it is vague.  That is intentional in the current sketch.  It may be too much to incorporate such a model in full-fledged form in ODF 1.2, but I do believe that (good term, thanks) a heuristic should be adopted for our being consistent in the specification.  I was thinking of that as a workable minimum.
>
> I also think that a nomenclature section of the ISO variety is called for either way.  And we need to work with careful definitions and use the terms consistently (e.g., using "XML document" when we are referring to any root-element subdocument of an ODF document structure, as distinct from the ODF document [structure] as a whole, with its variety of other parts and their packagings).
>   
Oooooh! Bite your tongue! ;-)

I really dislike nomenclature clauses, which I note are optional in the 
ISO Directives.

The reason is we have to *repeat* the definitions that already occur 
elsewhere in the standard and those definitions in the nomenclature 
clause *appear without context,* which can make defining some of them 
quite difficult.

Violates the first rule: Never repeat a definition (because it appears 
once in the nomenclature clause and where we use it).

Violates the second rule: Never define a definition differently. That is 
just fraught with peril for mistakes.

Perhaps an unfair example:

I want to define "separator."

Hmmm, how about: "A separator is a character that is displayed in lieu 
of a line number"

Relying on: "A separator is text that is displayed instead of a line 
number for lines where no number is displayed." 
<text:linenumbering-separator>

Yeah, but then the style folks say: "But <style:column-sep> says a 
separator is a line between columns."

The context in which I use the term is critical to its definition.
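
A minimal sketch of the same point, assuming definitions are looked up by the pair (context, term) rather than by the term alone. The table layout and function names are hypothetical; the two definitions are the ones quoted above.

```python
# Definitions keyed by (context, term); the context is the element
# in which the term is used.
DEFINITIONS = {
    ("text:linenumbering-separator", "separator"):
        "text that is displayed instead of a line number "
        "for lines where no number is displayed",
    ("style:column-sep", "separator"):
        "a line between columns",
}


def define(term, context):
    """Look up a term as it is used in one specific context."""
    return DEFINITIONS[(context, term)]


def contexts_for(term):
    """List every context that gives the term its own definition."""
    return sorted(ctx for (ctx, t) in DEFINITIONS if t == term)


if __name__ == "__main__":
    for ctx in contexts_for("separator"):
        print(ctx, "->", define("separator", ctx))
```

A nomenclature clause would have to collapse both entries under the single key "separator", which is exactly where the context is lost.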

Granted you want to define far more general terms and that is why I said 
the example is unfair, but where do we draw the line?

My preference, subject to the wishes of the TC, is to define terms as we 
encounter them and then to consistently use them. I think that is the 
very strong point that you are making. I am hopeful that with everyone's 
assistance we will have a high level of consistency in that regard.
> Finally, I agree there are lots of ways to "process" ODF document structures (I am becoming fond of that term), and I don't propose ruling any of them out.  However, there are clearly presumed scenarios around what it all means when the ODF document representation is turned into a perceivable document for human use in office document applications and that the interpretation is in conformance with the semantics for ODF documents.  The challenge is figuring out how any normative language about that occurs in the specification, if at all, and then what it means to say that some (class of processing) is implemented in a conformant way.  
>   
Sure, and that is the very hard part.
> I don't see how we can get by with no semantics at all for the markup (and I don't think you are suggesting that).  The ODF appeals to other standards by reference would seem to bring with them semantics from those specifications in any case.  Or maybe not. With the substitution of OASIS namespaces, I am not entirely clear what is incorporated and what is not. 
>
>   
No, I am not suggesting that markup has no semantics, just that we need 
to avoid thinking we have been overly precise with those semantics.

And true, when we import elements from other standards, we need to be as 
precise as possible about what we think the semantics are that we are 
importing.
> I suppose that the OIC TC can go farther in this direction than the ODF specification might, but if there is no meaningful conformance in the ODF spec, it doesn't give anyone much to go on when it comes to assessing conformance of products, proposing ways to improve assurance of interchange and interoperable use, etc. 
>
>   
I think we need to look really closely at the work Michael has been 
doing on the conformance clause. I must confess that I have been 
concentrating on other matters but from what I have read he has taken us 
forward on the issues of conformance. As the standard evolves I think 
our understanding of it and what it means to conform to it will evolve 
as well.
> Thanks for your response.  I value this conversation with you.
>
>   
And I with you! It is a welcome break!

Hope you are having a great day!

Patrick
>  - Dennis
>
> -----Original Message-----
> From: Patrick Durusau [mailto:patrick@durusau.net] 
> http://lists.oasis-open.org/archives/office/200812/msg00090.html
> Sent: Thursday, December 11, 2008 00:52
> To: dennis.hamilton@acm.org
> Cc: ODF TC List
> Subject: Re: [office] Formal Request: ODF 1.2 Document Processing Model Proposal
>
> Dennis,
>
> My puppy woke me up (it is raining in Covington) so I decided to catch 
> the early email. ;-)
>
> This is an interesting proposal but I could not decide if you were 
> proposing:
>
> 1) A model and additional text about that model to be added to ODF 1.2 or
>
> 2) A model that would be used as a heuristic in evaluating the 
> completeness/incompleteness of ODF more generally?
>
> Moreover, I am not entirely sure that we need to go towards processing 
> models, although they are a critical step in the chain of events that 
> lead to a "document" in the sense of something that we view and share 
> with others.
>
> The reason why I make that last statement is that I prefer to think of 
> ODF as a format that stores information that obviously has an implied 
> model of a document, <text:h>, <text:p>, etc. but that does not require 
> a particular processing model for the information so recorded. That is 
> to say that I can certainly process an ODF document instance with XML 
> tools, or I can use tools that are completely innocent of XML in terms 
> of their processing of the document (a table based model, for example), 
> so long as when they serialize a result to be saved, they do so in the 
> correct ODF XML structure. And, of course, they honor the semantics that 
> have been defined for some content in the ODF document.
>
> Having said all that, obviously those sorts of distinctions are easy to 
> say in the abstract and hard to enforce in concrete cases.
>
> Looking forward to hearing more about your proposal.
>
> Hope you are having a great day!
>
> Patrick
>
> Dennis E. Hamilton wrote:
> http://lists.oasis-open.org/archives/office/200812/msg00088.html
>   
>> I formally request consideration of the proposal "ODF 1.2 Document Processing Model" for ODF 1.2
>>
>> The new proposal document with an incomplete sketch is on the wiki at 
>> http://wiki.oasis-open.org/office/ODF_1.2_Document_Processing_Model
>>
>>     
> [ ... ]
>
>
>   

-- 
Patrick Durusau
patrick@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)