OASIS Mailing List Archives

dita message


Subject: Re: [dita] Foreign Generalization: Should be moved to a non-normative appendix


I had missed an important detail that I think is at the root of Michael's
concern, namely that in order for foreign-to-object transforms to be
reversible, you must set the @data attribute on the object to something that
signals the object is the result of such a transform. The current spec
specifies that it should be set to the value "DITA-foreign".

This is in the same category as @class value preservation and is definitely
needed to enable respecialization when this technique is used to protect
foreign content. The spec definitely needs to say that.
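
A minimal sketch of that round trip, in Python with only the standard library. The function names, the side-file name, and the sample markup are hypothetical. One assumption to flag loudly: the sketch puts the side-file reference in @data and the "DITA-foreign" marker in @type, because a single attribute cannot carry both. Check the spec text for which attribute actually holds the marker before relying on this.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Marker value taken from the message; the attribute that carries it
# (@type here) is an assumption of this sketch.
MARKER = "DITA-foreign"


def foreign_to_object(foreign: ET.Element, side_file: Path) -> None:
    """Move the single non-DITA child of <foreign> into side_file and
    leave a marked <object> in its place."""
    (payload,) = list(foreign)  # sketch: exactly one child, else ValueError
    payload.tail = None
    side_file.write_text(ET.tostring(payload, encoding="unicode"),
                         encoding="utf-8")
    foreign.remove(payload)
    ET.SubElement(foreign, "object",
                  {"data": side_file.name, "type": MARKER})


def object_to_foreign(foreign: ET.Element, base_dir: Path) -> None:
    """Reverse the transform, but only for <object> elements that carry
    the marker; ordinary <object> elements are left alone."""
    for obj in foreign.findall("object"):
        if obj.get("type") != MARKER:
            continue
        restored = ET.parse(str(base_dir / obj.get("data"))).getroot()
        index = list(foreign).index(obj)
        foreign.remove(obj)
        foreign.insert(index, restored)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    foreign = ET.fromstring("<foreign><math><mi>x</mi></math></foreign>")
    original = ET.tostring(foreign, encoding="unicode")
    foreign_to_object(foreign, base / "foreign-1.xml")
    protected = ET.tostring(foreign, encoding="unicode")
    object_to_foreign(foreign, base)
    restored = ET.tostring(foreign, encoding="unicode")
```

Without the marker, the reverse step could not tell a generated <object> from one the author wrote, which is why the marker is needed for respecialization.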

It was not my intent to lose that bit of detail in the process of making it
clear that the foreign-to-object technique is one of three possible ways to
protect foreign content in the result of DITA-to-DITA transforms.

There are three ways content can be protected: move it to a side file,
escape all markup characters, or encapsulate it in a CDATA marked section.
The latter two would be obvious to any implementor. The first is not
obvious, and therefore we definitely need to say that it is *allowed* by the
standard.
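
The escaping and CDATA options can be sketched as follows, in Python with only the standard library. The function names and the sample markup are hypothetical, and this is an illustration of the idea rather than anything the spec prescribes. In both cases a DTD validator sees only character data inside <foreign>, and parsing the protected element hands back the original markup as a string.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical non-DITA content to protect.
FOREIGN_MARKUP = "<math><mi>x</mi><mo>&amp;</mo></math>"


def protect_by_escaping(markup: str) -> str:
    """Escape &, < and > so the markup becomes plain character data."""
    return "<foreign>" + escape(markup) + "</foreign>"


def protect_by_cdata(markup: str) -> str:
    """Wrap the markup in a CDATA marked section."""
    if "]]>" in markup:
        # A single CDATA section cannot contain its own terminator.
        raise ValueError("markup contains ']]>'")
    return "<foreign><![CDATA[" + markup + "]]></foreign>"


def recover(protected: str) -> str:
    """Parse the protected <foreign> and return the original markup text."""
    return ET.fromstring(protected).text


escaped = protect_by_escaping(FOREIGN_MARKUP)
wrapped = protect_by_cdata(FOREIGN_MARKUP)
```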

I think we should refocus the current Foreign generalization topic on
"protecting foreign content" when transforming to DITA documents that use
DTDs and then, in the context of the "foreign-to-object" technique, mention
that if you are doing generalization as part of the transform and
respecialization is a requirement, you should set the @data attribute as
indicated in the current spec. Also, this topic should be in the DITA
Processing section, not in the specialization section, since it's not really
about specialization or generalization specifically.

But that is the only part of this whole issue that specifically involves
generalization. Otherwise this is a more general DITA-to-DITA transform
issue. 

I'm not sure I understand MP's concern about the scenario "being supported".
This case *must* be handled by any transformation processor that produces
validatable result documents. It doesn't matter whether we say anything
about it or not. So there's no way that the scenario *can't* be
supported--even if the spec was silent there would still be ways to solve
the problem.

With respect to how one may validly include non-DITA markup declarations,
see my comments below.

Cheers,

E.


On 12/2/09 1:09 PM, "Michael Priestley" <mpriestl@ca.ibm.com> wrote:

> Eliot Kimber <ekimber@reallysi.com> wrote on 12/01/2009 10:57:48 PM:

[...]

>> I think it must always be the case that for specializations of foreign the
>> module that defines the specialization must also include the DTD
>> declarations for any allowed non-DITA content.
>> 
>> Therefore, if you are respecializing to that module, you can integrate the
>> module into the effective DITA doctype and therefore will have the required
>> declarations to restore any protected markup as elements of the
>> respecialized document instance. I don't see how that can depend on the
>> details of how the generalization was performed.
> 
> If generalizers are not required to split out foreign markup from the
> document when creating a validatable generalization, then they may use
> some other technique, for example namespaces, which could work for them if
> they're using schemas but will break if the recipient is using DTDs.

The issue only exists for DTD-based validation: the XSDs already specify
"skip" for the content of <foreign>, so you *never* have to protect it when
the output target uses XSDs for validation.

For DTDs, namespaces don't help. There are only three ways to protect the
content and it is sufficient to enumerate them.

[...] 
 
>> Here's a question: if <foreign> is not specialized but you have included, in
>> your shell, declarations for non-DITA elements, the problem still exists
>> even though foreign itself isn't specialized (and therefore doesn't need to
>> be generalized).
>> 
>> Therefore, the issue isn't really an issue of generalization, but an issue
>> of *any transformation* from one DITA document type to a different DITA
>> document type: there is *always* the potential that non-DITA markup in
>> <foreign> cannot be validated in the transformation target.
> 
> What you describe is a non-standard use of <foreign>. If someone does
> this, then they've no longer got valid DITA content according to the spec.

I'm not sure the current spec actually disallows this case. From the 2nd
review draft under "Specializing foreign or unknown content":

"There are three methods of incorporating foreign content into DITA.

- A domain specialization of the <foreign> or <unknown> element. This is the
usual implementation.
- A structural specialization using the <foreign> or <unknown> element. This
affords more control over the content.
- Do nothing: simply embed the foreign content within <foreign> or
<unknown>. Because of the ANY content model of these elements, this method
offers the least amount of control over the content and hinders
interoperability."

Note item 3: that is exactly the case I am referring to below.

However, I agree that it probably *should not* be allowed. That is,
unspecialized uses of <foreign> should only allow DITA elements (e.g.,
<desc>, <object>, etc.). This would require the creation of modules and
remove the problem of knowing where to get declarations for non-DITA
elements.

Either the third bullet above is correct and therefore it's implicitly
allowed to include non-DITA declarations into a shell (because there's no
other way for you to do it) or including non-DITA declarations into a shell
is not allowed and therefore the third bullet is nonsense because you can
never have a valid case of no <foreign> specializations and valid non-DITA
elements. 

MP clearly thought the latter was the case and I am agreeing that it
*should* be the case but also asserting that the current spec as written
doesn't say that.

Cheers,

E.
 
> The intent of <foreign> specializations is to provide a hook and name for
> the module, which can then provide a specific content model. The foreign
> markup can then be included and assembled into a doctype like any other
> domain module. The specialized <foreign> element provides a signal to
> processors about where that foreign markup has been included (through the
> specialized element's class attribute).
> 
> If you include random markup in your DITA document without a specialized
> element, then there is no way for a processor to tell whether a specific
> document contains markup from that domain. That means it cannot be
> reliably generalized to any valid ancestor.
> 
>> 
>> Thus, the focus on generalization with respect to <foreign> is a red
>> herring, or rather, an instance of a more general problem.
>> 
>> The solution could be made clearer by having a separate "foreign markup
>> domain" module that provides the declarations for non-DITA elements and is
>> required to be declared in @domains. That would provide both a DITA-defined
>> place to declare foreign elements when <foreign> is not specialized and
>> enable determination if the target doctype supports the same set of modules
>> [you still wouldn't be able to tell what module a given non-DITA element
>> type belonged to unless we provided a way to define the mapping somewhere].
> 
> What you're suggesting is already part of the spec, and is already
> required. This is how <foreign> is meant to be used. And you would be able
> to tell what module a non-DITA element type belonged to, by inspecting the
> class attribute of its specialized container.
> 
>> 
>> But without that, you can't know by DITA-defined means that a target
>> document supports any given non-DITA elements, so your default behavior has
>> to be to protect such markup.
> 
> Without that you don't have valid DITA, according to the spec. So you are
> protecting a broad range of behavior that is already disallowed. We don't
> allow people to randomly add markup to DITA and still call it DITA. There
> are procedures, to ensure interoperability and interchangeability.
> Specializing <foreign> is one of those procedures.
> 
>> 
>> We should definitely make the isomorphic relationship between content and
>> <object> within <foreign> clearer in the spec--given an explanation of how
>> the two are functionally equivalent, it becomes clearer that doing that
>> transform is one way to solve the unvalidatable foreign content problem.
>> 
>> Note also that in XSLT 2 it is trivial to parse the content of a CDATA
>> marked section. That was not the case with XSLT 1. Since the Toolkit now
>> supports XSLT 2 we could, for example, implement the processing of
>> CDATA-encapsulated <foreign> content in the 1.5 Toolkit.
>> 
>> In fact, now that I think about it, you could even have a complete XML
>> document with its own DOCTYPE decl in a CDATA marked section, e.g.:
>> 
>> <foreign>
>>   <![CDATA[
>> <?xml version="1.0"?>
>> <!DOCTYPE mathml PUBLIC "whatever" "mathml.dtd">
>> <mathml>
>>   ...
>> </mathml>
>> ]]>
>> </foreign>
>> 
>> The current standard doesn't disallow that (it couldn't) but it also doesn't
>> indicate that processors should handle that case as though the content were
>> a separate document entity. But that behavior is definitely implicit in the
>> "it can be replaced with an <object> element" statement.
>> 
>> That is, if the content of <foreign> can be replaced by an <object> element
>> that references the content, it follows that an object element can be
>> replaced by a foreign element that contains the content referenced by the
>> <object> element.
> 
> I'm not going to worry about this for 1.2. I just want to protect the
> behavior we defined as valid for 1.1.
> 
>> 
>> Cheers,
>> 
>> Eliot 
>> 
>> -- 
>> Eliot Kimber
>> Senior Solutions Architect
>> "Bringing Strategy, Content, and Technology Together"
>> Main: 610.631.6770
>> www.reallysi.com
>> www.rsuitecms.com
>> 

-- 
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com


