Subject: Re: [dita] History Question: Why does <data> not include <cite>?
On 9/4/09 3:41 PM, "Erik Hennum" <ehennum@us.ibm.com> wrote:

> The distinction would be academic without the fallback to general
> processing. If the disclaimer specializes a footnote on the body, it will
> get formatted by default. By contrast, if it specializes <data> in the
> prolog, it will get ignored silently by default (because there's no known
> processing for a property with no semantics).
>
> Anyway, that's my understanding of the TC's original consensus around the
> <data> element, which of course may have evolved (toward the fittest).

I think that what I have presented is a problem that has several aspects that happen to impinge on both how the DITA architects have to date modeled or thought about metadata in general and what it means for something to be metadata.

Erik's comment about the disclaimer being a footnote on the body is really to the point: the disclaimer is, conceptually, a footnote on the *topic* as a whole, rather than being a footnote on the body or any particular part of the body. I have seen this requirement expressed in other contexts by using footnotes in the *title*, where the title may reflect some cited source or it is simply the only available representation of the topic as a whole in the original source format (where there is no useful notion of metadata as we are talking about it here).

That is, what I have in this case is really footnote-type content, that is, an explanation of a related source. As a footnote, it really needs to allow any content. [And maybe the real solution is to allow <fn> within <metadata>, just as we allow <index-item>.]

Putting a footnote in the title would not be a good solution, because it's not expressing the true relationship between the topic and the disclaimer, and putting it in the body would not be a good solution, for the same reason. And in my specific case, the default presentation in both of those cases is not the desired result.
In fact, there is *no* default presentation provided by the current DITA spec or existing processors that can give me what I want because the presentation rules for disclaimer in this case are that it follow all content of the topic (including nested topics), not just the body.

But as it happens, I have to implement my own processing anyway because I'm generating InCopy articles from the DITA XML and they have to reflect a specific set of editorial and organizational rules. I already have custom processing to synthesize both an "Author Bio" and "byline" from the author metadata in the article, so doing something similar for disclaimer (and by extension, HTML) is not a big deal in this case.

Another aspect is the details of how the disclaimer is captured as metadata: Erik says, correctly, that metadata *tends* to be discrete data elements, and in this case I *could* capture the variant parts of the disclaimer as discrete values. However, I didn't in this case because there is no other business justification for doing so for the disclaimer (as compared to the author information, where it's clearly worth capturing more atomically) and requiring it would add unnecessary complication and another point of failure to a system that already depends on correct use of hard-to-validate word processing styles.

Or said another way, if I had stepped up to designing the metadata markup to capture the salient bits of the disclaimer I would not have had the issue I ran into (no <cite> as a child of <data>). But I am contending that the DITA standard, at its most general, shouldn't *require* me to go to that level of effort as a cost of entry.

That is, at its most general, it is inappropriate for the DITA spec to make a policy decision about what is and isn't appropriate for identification as metadata, even when an unconstrained use of that design might leave room for authors to color outside the lines.
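
To make the presentation rule above concrete (the disclaimer follows all content of the topic, nested topics included), here is a minimal Python sketch of that kind of custom processing. It is an illustration under stated assumptions, not the actual InCopy generation code: the `name="disclaimer"` value, the sample document, and the functions `render_topic` and `render_article` are all hypothetical, and the sample deliberately puts <cite> inside <data>, which is the content this message argues for and which the current content model does not allow. A real DITA processor would also match on @class values rather than element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The disclaimer lives in the root topic's prolog metadata.
# <cite> inside <data> is NOT valid under the current content model; it is shown
# here only because it is the markup this discussion is about.
SAMPLE = """\
<topic id="article">
  <title>Sample Article</title>
  <prolog>
    <metadata>
      <data name="disclaimer">Adapted from <cite>An Earlier Work</cite>.</data>
    </metadata>
  </prolog>
  <body><p>Root body text.</p></body>
  <topic id="nested">
    <title>Nested Topic</title>
    <body><p>Nested body text.</p></body>
  </topic>
</topic>
"""


def text_of(element):
    """Flatten mixed content (text plus inline elements) to a plain string."""
    return "".join(element.itertext()).strip()


def render_topic(topic):
    """Return output lines for a topic: title, body paragraphs, nested topics."""
    lines = []
    title = topic.find("title")
    if title is not None:
        lines.append(text_of(title))
    for para in topic.findall("body/p"):
        lines.append(text_of(para))
    for child in topic.findall("topic"):
        lines.extend(render_topic(child))
    return lines


def render_article(root):
    """Render the whole topic tree, then append the root topic's disclaimer."""
    lines = render_topic(root)
    # The disclaimer is emitted last, after the body AND all nested topics.
    for data in root.findall("prolog/metadata/data[@name='disclaimer']"):
        lines.append("Disclaimer: " + text_of(data))
    return lines


if __name__ == "__main__":
    for line in render_article(ET.fromstring(SAMPLE)):
        print(line)
```

The point of the sketch is only the ordering: because the disclaimer is read from the prolog rather than from a footnote in the title or body, the processing is free to place it after the nested topics, which neither default footnote placement would do.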
I think some of this tension is a side effect of DITA not yet having the 1.2 constraint mechanism, which largely allows us to eat our cake and have it, by allowing the most general design to be completely unconstrained but making it both possible and easy for specializations to apply constraints as they see fit.

At least in the context of what I'm doing here, which is an application of the topic types I've designed for the DITA For Publishers project, I would certainly not object to the concept, task, and reference topic types continuing to impose the current constraints on <data> and even going further to document that certain element types should not be used as descendants of <data> even though they are allowed (and those rules could be formalized as Schematron rules or XSD 1.1 assertions).

But I would want <data> within unconstrained <topic> to at least allow all phrases, if not all body content (although I could certainly buy the argument that if you're at the point of putting complex body elements in a <data> context, you should really have a separate topic related by <data-about> or a relationship table).

Also, if you see metadata as mostly about *retrieval*, rather than about processing, it's hard to see any harm in having more markup, rather than less, in metadata values, since they serve to optimize searching.

Cheers,

E.

----
Eliot Kimber | Senior Solutions Architect | Really Strategies, Inc.
email: ekimber@reallysi.com <mailto:ekimber@reallysi.com>
office: 610.631.6770 | cell: 512.554.9368
2570 Boulevard of the Generals | Suite 213 | Audubon, PA 19403
www.reallysi.com <http://www.reallysi.com> | http://blog.reallysi.com <http://blog.reallysi.com> | www.rsuitecms.com <http://www.rsuitecms.com>