Subject: Re: [dita] Foreign Generalization: Should be moved to a non-normative appendix

From: Eliot Kimber <ekimber@reallysi.com>
To: Michael Priestley <mpriestl@ca.ibm.com>
Date: Wed, 02 Dec 2009 15:19:07 -0600

On 12/2/09 2:16 PM, "Michael Priestley" <mpriestl@ca.ibm.com> wrote:

> Eliot Kimber <ekimber@reallysi.com> wrote on 12/02/2009 03:04:03 PM:
>> There are three ways content can be protected: move it to a side file,
>> escape all markup characters, or encapsulate it in a CDATA marked section.
>> The second two are obvious and would be to any implementor. The first is not
>> obvious and therefore we definitely need to say that it is *allowed* by the
>> standard.
> 
> If we allow three ways in the standard, then all three ways are required
> to be supported by respecialization processes. And I do think we would
> need to indicate how a process should interpret the content. For example -
> if a respecializer encounters inline content that is neither CDATA, nor
> sidefiled, assume markup has been escaped, and unescape it? That doesn't
> seem a terribly safe assumption to me, but if we want to open that door we
> will need to explore all the places it leads.

Hmm. I see your point with regard to respecialization--once escaped, you
would not be justified in unescaping content prior to parsing the document
since it may have always been escaped.
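
A minimal sketch of that ambiguity, using Python's standard library. The sample strings and variable names are hypothetical and only illustrate the point: once a generalizer escapes markup, the serialized result looks the same as text that was authored in escaped form, so a respecializer has nothing to base an unescape decision on.

```python
from xml.sax.saxutils import escape, unescape

# Case 1 (hypothetical): the author wrote real child markup, and a
# generalizer protected it by escaping the markup characters.
authored_markup = "<m>x</m>"
generalized_from_markup = escape(authored_markup)

# Case 2 (hypothetical): the author wrote literal text that was already
# escaped in the source, so the generalizer left it untouched.
authored_escaped_text = "&lt;m&gt;x&lt;/m&gt;"
generalized_from_text = authored_escaped_text

# Both cases serialize identically after generalization.
ambiguous = generalized_from_markup == generalized_from_text

# Unescaping restores case 1 but would wrongly turn case 2 into markup.
restored = unescape(generalized_from_markup)
```

Here `ambiguous` is true, and `restored` matches the markup of case 1 only; applying the same unescape to case 2 would change what that author wrote.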

Also, I was mistaken in my assertion that <foreign> can contain untagged
text content--it can't (I had forgotten that ANY content models are still
element only). So even if you have a data format that is not itself XML,
you'd still have to define a new element type to hold it.

So that does mean that simple syntactic escaping would *not* work. Hmph.

Which means that Michael is correct that moving the foreign content entirely
out of the result document is the only way to ensure validity of the
transformed result.

So I think that leaves us with saying that "when you are transforming to a
DITA target where DTD validation is required and you cannot ensure that the
foreign content will be valid in the target DTD, the only way to ensure
validity is to move the foreign content to a side file." (As well as talking
about the <object> @data "DITA-foreign" convention.)
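
A rough sketch of the side-file approach, under several assumptions that are not settled in this thread: elements are matched by the literal tag name `foreign` (a real processor would presumably match on the DITA class attribute), the placeholder is an `<object>` with `data` pointing at the side file and `type="DITA-foreign"`, and the side file holds a reserialized copy of the child elements rather than a literal copy of the unparsed content. The function name and file-naming scheme are hypothetical.

```python
import xml.etree.ElementTree as ET


def move_foreign_to_sidefiles(topic_xml, name_prefix="foreign"):
    """Replace each <foreign> element with an <object> placeholder.

    Returns the transformed document as a string plus a dict mapping
    side-file names to their (reserialized) content. Nothing is written
    to disk; the caller decides where the side files go.
    """
    root = ET.fromstring(topic_xml)
    sidefiles = {}
    counter = 0
    # Snapshot the tree first so replacing children is safe.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "foreign":  # assumption: unspecialized name
                continue
            counter += 1
            filename = f"{name_prefix}-{counter}.xml"
            # Reserialize the children; this is NOT a literal copy of
            # the original unparsed content.
            sidefiles[filename] = "".join(
                ET.tostring(grandchild, encoding="unicode")
                for grandchild in child
            )
            placeholder = ET.Element(
                "object", {"data": filename, "type": "DITA-foreign"}
            )
            placeholder.tail = child.tail
            parent.remove(child)
            parent.insert(index, placeholder)
    return ET.tostring(root, encoding="unicode"), sidefiles
```

Because the foreign markup never appears in the result document, the result can be validated against the target DTD without that DTD declaring any of the foreign element types.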

Note that this is *not* a "must" in the conformance sense, it is a
requirement imposed by the constraints of ANY content models. Thus it is a
"must" in the "there is only one way things can work" sense.

I'm thinking that maybe we should have in the DITA Processing section a
general "DITA-to-DITA transformation" topic under which we can then discuss
the general case (transformation that doesn't necessarily involve
generalization), generalization, and the handling of foreign elements.

The more I look at the Generalization topic, the more I think it doesn't
belong in the Specialization section.

Note also that the current description of the copy-to-a-sidefile process is
underspecified in that it doesn't say what the XML requirements on the side
file are, in particular, should the side file be an XML document with a
DOCTYPE or just a literal copy of the unparsed content of the original
<foreign> element? If it's not a literal copy, what rules, if any, should be
imposed on how the copied XML is serialized if there is no DTD? What are the
rules for reconstituting a side file when respecializing?

Cheers,

Eliot

-- 
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com

Follow-Ups:
- RE: [dita] Foreign Generalization: Should be moved to a non-normative appendix
  - From: "Grosso, Paul" <pgrosso@ptc.com>

References:
- Re: [dita] Foreign Generalization: Should be moved to a non-normative appendix
  - From: Michael Priestley <mpriestl@ca.ibm.com>