Subject: Document Design and metadata
Savante,

I have been thinking about your position that there should be no indirection of metadata, that is, no metadata on metadata in a document, and I think I can articulate the difference in our positions. See what you think about the following:

Eliminating indirection of metadata, so that any metadata is applied directly to the content of the document (and by implication not to other metadata), is a document design issue. That is to say, from the standpoint of designing the document format, whatever metadata is to be applied should be applied by appropriate mechanisms to the content of the document.

While attractive and perhaps even good design, ;-), that works if and only if my only concern is designing the "best" document format. Unfortunately, we know that our document design is going to have to accommodate converted documents that followed less than the "best" design, and even when they did follow it, users will have abused those designs.

So to me the issue is whether the no-indirection position can accommodate the conversion of other document designs and/or user abuse of those designs.

For example, in our last call we discussed the need to have metadata on metadata that could change the processing of an ODF document. Take sections, which I understand cannot be embedded in other sections in OOXML (disclaimer, I have not verified that statement for myself). In order to maintain a roundtrip, I have to communicate to the ODF application that it cannot embed sections within sections. One way to do that would be to add metadata to the section style representing that prohibition.

Actually, as I typed that example it occurred to me that such metadata would be better attached to the document itself, to prevent the creation of other sections that would embed in sections. (Whether that can be generalized for all other round-tripping questions I don't know.)

Hmmm, well, take a user abuse scenario.
When I use the "italic" style, it is solely to mark foreign words in a text. Other people use "italic" for emphasis on the introduction of a new term in the document. Downstream, that is, outside of the author's control, how are we to apply that directly as metadata to the content? (Assuming such a mechanism is in fact available.)

If I can add metadata to styles, I could mark my "italic" style as being applied only to foreign words, which would allow any spell checker to simply skip those words. The same question comes up with Rob's example of embedded quotations, where we want the spell checker to skip those quotations.

I suppose you could say that we can extend styles to carry with them information about invocation of the spell checker, so that there is no indirection, but that is only one example and I am not sure that we would be able to account for all the cases.

BTW, does it matter who applied the style? Say, for example, that I apply a style that "hides" part of the document. Is who applied it metadata about the style?

Summary: note that I don't object to the notion that avoiding indirection is a good design principle, but my concern is that there are lots of documents that have not followed good design and/or have been abused by users.

If, for example, any markup carried with it a "source" attribute, which indicates the origin of the markup, say OOXML, then an ODF application that was to be used for round-tripping could enforce the appropriate behavior to ensure the document could be round-tripped. (And no, I don't think we can cherry-pick the appropriate elements for such an attribute. It is simply not possible to know all the possible combinations. Applications could simply choose to ignore the attribute.)

Hopefully this captures some of the issues underlying the question of indirection of metadata.

Hope you are at the start of a great week!
Patrick

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!
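
P.S. To make the "italic" example a bit more concrete, here is a rough Python sketch of metadata attached to a style rather than to the content. Everything in it is an assumption made up for illustration: the Style and Run classes, the "usage" and "spellcheck" keys, and the words_to_check function are hypothetical and do not correspond to anything in ODF or in any existing application.

```python
from dataclasses import dataclass, field


@dataclass
class Style:
    name: str
    # Hypothetical metadata attached to the style itself, not to the content.
    metadata: dict = field(default_factory=dict)


@dataclass
class Run:
    text: str
    style: str  # name of the style applied to this run of text


def words_to_check(runs, styles):
    """Collect the words a spell checker should examine.

    Runs whose style carries spellcheck == "skip" are passed over.
    """
    words = []
    for run in runs:
        style = styles.get(run.style)
        if style is not None and style.metadata.get("spellcheck") == "skip":
            continue
        words.extend(run.text.split())
    return words


# My "italic" style, marked as used only for foreign words.
styles = {
    "default": Style("default"),
    "italic": Style("italic", {"usage": "foreign-words", "spellcheck": "skip"}),
}
runs = [
    Run("The", "default"),
    Run("Zeitgeist", "italic"),
    Run("of the committee", "default"),
]
print(words_to_check(runs, styles))
```

Someone who uses "italic" for emphasis would leave that metadata off, and the same run would then be spell checked as usual. The content itself is untouched in both cases; only the statement about the style differs.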