Subject: Re: [dita] Foreign Generalization: Should be moved to a non-normative appendix
- From: Michael Priestley <mpriestl@ca.ibm.com>
- To: Eliot Kimber <ekimber@reallysi.com>
- Date: Wed, 2 Dec 2009 15:16:57 -0500
Eliot Kimber <ekimber@reallysi.com> wrote on 12/02/2009 03:04:03 PM:
> There are three ways content can be protected: move it to a side file,
> escape all markup characters, or encapsulate it in a CDATA marked section.
> The second two are obvious and would be to any implementor. The first is not
> obvious and therefore we definitely need to say that it is *allowed* by the
> standard.
If we allow three ways in the standard, then all three
ways are required to be supported by respecialization processes. And I
do think we would need to indicate how a process should interpret the content.
For example: if a respecializer encounters inline content that is neither in a
CDATA section nor in a side file, should it assume the markup has been escaped, and unescape it?
That doesn't seem a terribly safe assumption to me, but if we want to open
that door we will need to explore all the places it leads.
I wanted to require support for only one of those.
If we want to broaden the options, I think it creates more cost for implementers,
not less.
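To make the ambiguity concrete, here is a minimal Python sketch using only the standard library. The `<mathml>` payload and the `protect_by_*` helper names are illustrative assumptions, not anything the spec defines. It suggests that once a document has been parsed, escaped content and CDATA-protected content arrive as the same text node, so a respecializer working on the parsed tree would need some extra signal to tell protected markup apart from ordinary text that happens to contain angle brackets.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical non-DITA payload, used only for illustration.
FOREIGN_MARKUP = "<mathml><mi>x</mi></mathml>"


def protect_by_escaping(markup: str) -> str:
    """Escape &, < and > so the markup becomes plain character data."""
    return f"<foreign>{escape(markup)}</foreign>"


def protect_by_cdata(markup: str) -> str:
    """Wrap the markup in a CDATA marked section.

    Assumes the markup does not itself contain the sequence "]]>".
    """
    return f"<foreign><![CDATA[{markup}]]></foreign>"


escaped_element = ET.fromstring(protect_by_escaping(FOREIGN_MARKUP))
cdata_element = ET.fromstring(protect_by_cdata(FOREIGN_MARKUP))

# After parsing, both techniques yield a <foreign> element with no child
# elements and identical text content.
escaped_text = escaped_element.text
cdata_text = cdata_element.text
```

Under these assumptions the two protected forms differ only in the serialized file, which is part of why a rule for interpreting inline content would have to be spelled out.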
> I think if we refocused the current Foreign generalization topic on
> "protecting foreign content" when transforming to DITA documents that use
> DTDs and then in the context of the "foreign-to-object" technique mention
> that if you are doing generalization as part of the transform and
> respecialization is a requirement you should set the @data attribute as
> indicated in the current spec. Also, this topic should be in the DITA
> Processing section, not in the specialization section, since it's not really
> about specialization or generalization specifically.
The only case in which there is foreign content in
DITA is in the context of a foreign specialization. The only case in which
that content requires protection is during transformation to another DITA
format. The only case of DITA-to-DITA transformation that we prescribe
behavior for is generalization and respecialization.
That's why I continue to say this is a generalization
problem, and belongs where it is, with the current requirements.
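For reference, the side-file approach under discussion could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name `foreign_to_object`, the element name `mathfrag`, the module name `math-d`, and the file name `foreign-0001.xml` are all hypothetical, the sketch assumes the foreign element holds a single non-DITA child element, and it covers only the protection step (renaming the element to its ancestor is left out).

```python
from typing import Tuple
import xml.etree.ElementTree as ET


def foreign_to_object(foreign: ET.Element, side_file_name: str) -> Tuple[ET.Element, str]:
    """Return a protected copy of a foreign element plus its side-file content.

    The non-DITA children are serialized for storage in a side file, and the
    copy contains a single <object> whose @data points at that file.
    """
    # Serialize the non-DITA children; this is what would be written to disk.
    side_content = "".join(
        ET.tostring(child, encoding="unicode") for child in foreign
    )
    # Build a copy that keeps the original name and attributes (including
    # @class) but replaces the content with an <object> reference.
    protected = ET.Element(foreign.tag, dict(foreign.attrib))
    ET.SubElement(protected, "object", {"data": side_file_name})
    return protected, side_content


# Hypothetical specialized foreign element with a made-up class value.
source = ET.fromstring(
    '<mathfrag class="+ topic/foreign math-d/mathfrag ">'
    "<math><mi>x</mi></math>"
    "</mathfrag>"
)
protected, side_content = foreign_to_object(source, "foreign-0001.xml")
```

Because the result contains only DITA elements, it can be validated against a DTD that knows nothing about the foreign vocabulary, and the @data value gives a respecializer one unambiguous place to find the original markup.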
>
> But that is the only part of this whole issue that specifically involves
> generalization. Otherwise this is a more general DITA-to-DITA transform
> issue.
>
> I'm not sure I understand MP's concern about the scenario "being supported".
> This case *must* be handled by any transformation processor that produces
> validatable result documents. It doesn't matter whether we say anything
> about it or not. So there's no way that the scenario *can't* be
> supported--even if the spec was silent there would still be ways to solve
> the problem.
Not automatically. A human could inspect the problem,
design a workaround, and write some code. That's not how generalization
is supposed to work. The goal is interoperable systems by default, not
by extra effort.
>
> With respect to how one may validly include non-DITA markup declarations,
> see my comments below.
Gotcha. It does at least say that it "hinders
interoperability", which is a mild understatement. At the very least,
I'd suggest that we make that an option which we strongly recommend against.
I'm not sure we can remove it without being backwards incompatible with
1.1, but we can make the existing warning stronger.
>
> Cheers,
>
> E.
>
>
> On 12/2/09 1:09 PM, "Michael Priestley" <mpriestl@ca.ibm.com> wrote:
>
> > Eliot Kimber <ekimber@reallysi.com> wrote on 12/01/2009 10:57:48 PM:
>
> [...]
>
> >> I think it must always be the case that for specializations of foreign the
> >> module that defines the specialization must also include the DTD
> >> declarations for any allowed non-DITA content.
> >>
> >> Therefore, if you are respecializing to that module, you can integrate the
> >> module into the effective DITA doctype and therefore will have the required
> >> declarations to restore any protected markup as elements of the
> >> respecialized document instance. I don't see how that can depend on the
> >> details of how the generalization was performed.
> >
> > If generalizers are not required to split out foreign markup from the
> > document when creating a validatable generalization, then they may use
> > some other technique, for example namespaces, which could work for them if
> > they're using schemas but will break if the recipient is using DTDs.
>
> The issue only exists for DTD-based validation: the XSDs already specify
> "skip" for the content of <foreign>, so you *never* have to protect it when
> the output target uses XSDs for validation.
>
> For DTDs, namespaces don't help. There are only three ways to protect the
> content and it is sufficient to enumerate them.
>
> [...]
>
> >> Here's a question: if <foreign> is not specialized but you have included, in
> >> your shell, declarations for non-DITA elements, the problem still exists
> >> even though foreign itself isn't specialized (and therefore doesn't need to
> >> be generalized).
> >>
> >> Therefore, the issue isn't really an issue of generalization, but an issue
> >> of *any transformation* from one DITA document type to a different DITA
> >> document type: there is *always* the potential that non-DITA markup in
> >> <foreign> cannot be validated in the transformation target.
> >
> > What you describe is a non-standard use of <foreign>. If someone does
> > this, then they've no longer got valid DITA content according to the spec.
>
> I'm not sure the current spec actually disallows this case. From the 2nd
> review draft under "Specializing foreign or unknown content":
>
> "There are three methods of incorporating foreign content into DITA.
>
> - A domain specialization of the <foreign> or <unknown> element. This is the
> usual implementation.
> - A structural specialization using the <foreign> or <unknown> element. This
> affords more control over the content.
> - Do nothing: simply embed the foreign content within <foreign> or
> <unknown>. Because of the ANY content model of these elements, this method
> offers the least amount of control over the content and hinders
> interoperability."
>
> Note item 3: that is exactly the case I am referring to below.
>
> However, I agree that it probably *should not* be allowed. That is,
> unspecialized uses of <foreign> should only allow DITA elements (e.g.,
> <desc>, <object>, etc.). This would require the creation of modules and
> remove the problem of knowing where to get declarations for non-DITA
> elements.
>
> Either the third bullet above is correct and therefore it's implicitly
> allowed to include non-DITA declarations into a shell (because there's no
> other way for you to do it) or including non-DITA declarations into a shell
> is not allowed and therefore the third bullet is nonsense because you can
> never have a valid case of no <foreign> specializations and valid non-DITA
> elements.
>
> MP clearly thought the latter was the case and I am agreeing that it
> *should* be the case but also asserting that the current spec as written
> doesn't say that.
>
> Cheers,
>
> E.
>
> > The intent of <foreign> specializations is to provide a hook and name for
> > the module, which can then provide a specific content model. The foreign
> > markup can then be included and assembled into a doctype like any other
> > domain module. The specialized <foreign> element provides a signal to
> > processors about where that foreign markup has been included (through the
> > specialized element's class attribute).
> >
> > If you include random markup in your DITA document without a specialized
> > element, then there is no way for a processor to tell whether a specific
> > document contains markup from that domain. That means it cannot be
> > reliably generalized to any valid ancestor.
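
As a rough illustration of the class-attribute signal described above, the following Python sketch shows how a processor might decide whether an element is a specialized <foreign> and, if so, which module it comes from. The function name `foreign_module` and the module name `mathml-d` are hypothetical; the only thing assumed from DITA itself is the usual class syntax of a leading "+" or "-" followed by space-separated module/type pairs.

```python
from typing import Optional
import xml.etree.ElementTree as ET


def foreign_module(element: ET.Element) -> Optional[str]:
    """Return the module name of a specialized <foreign>, or None.

    None is returned both for elements that do not descend from
    topic/foreign and for an unspecialized <foreign>, since neither
    names a module that declares non-DITA content.
    """
    tokens = element.get("class", "").split()
    # tokens[0] is "+" (domain) or "-" (structural); the rest are
    # module/type pairs ordered from ancestor to most specialized.
    pairs = tokens[1:]
    if "topic/foreign" not in pairs:
        return None
    most_specialized = pairs[-1]
    if most_specialized == "topic/foreign":
        return None
    return most_specialized.split("/")[0]


specialized = ET.fromstring('<mathml class="+ topic/foreign mathml-d/mathml "/>')
unspecialized = ET.fromstring('<foreign class="- topic/foreign "/>')
paragraph = ET.fromstring('<p class="- topic/p "/>')

specialized_module = foreign_module(specialized)
unspecialized_module = foreign_module(unspecialized)
paragraph_module = foreign_module(paragraph)
```

With an unspecialized <foreign> the function has nothing to report, which is the situation in which a processor cannot tell what non-DITA markup a document contains.
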
> >
> >>
> >> Thus, the focus on generalization with respect to <foreign> is a red
> >> herring, or rather, an instance of a more general problem.
> >>
> >> The solution could be made clearer by having a separate "foreign markup
> >> domain" module that provides the declarations for non-DITA elements and is
> >> required to be declared in @domains. That would provide both a DITA-defined
> >> place to declare foreign elements when <foreign> is not specialized and
> >> enable determination if the target doctype supports the same set of modules
> >> [you still wouldn't be able to tell what module a given non-DITA element
> >> type belonged to unless we provided a way to define the mapping somewhere].
> >
> > What you're suggesting is already part of the spec, and is already
> > required. This is how <foreign> is meant to be used. And you would be able
> > to tell what module a non-DITA element type belonged to, by inspecting the
> > class attribute of its specialized container.
> >
> >>
> >> But without that, you can't know by DITA-defined means that a target
> >> document supports any given non-DITA elements, so your default behavior has
> >> to be to protect such markup.
> >
> > Without that you don't have valid DITA, according to the spec. So you are
> > protecting a broad range of behavior that is already disallowed. We don't
> > allow people to randomly add markup to DITA and still call it DITA. There
> > are procedures, to ensure interoperability and interchangeability.
> > Specializing <foreign> is one of those procedures.
> >
> >>
> >> We should definitely make the isomorphic relationship between content and
> >> <object> within <foreign> clearer in the spec--given an explanation of how
> >> the two are functionally equivalent, it becomes clearer that doing that
> >> transform is one way to solve the unvalidatable foreign content problem.
> >>
> >> Note also that in XSLT 2 it is trivial to parse the content of a CDATA
> >> marked section. That was not the case with XSLT 1. Since the Toolkit now
> >> supports XSLT 2 we could, for example, implement the processing of
> >> CDATA-encapsulated <foreign> content in the 1.5 Toolkit.
> >>
> >> In fact, now that I think about it, you could even have a complete XML
> >> document with its own DOCTYPE decl in a CDATA marked section, e.g.:
> >>
> >> <foreign>
> >> <![CDATA[
> >> <?xml version="1.0"?>
> >> <!DOCTYPE mathml PUBLIC "whatever" "mathml.dtd">
> >> <mathml>
> >> ...
> >> </mathml>
> >> ]]>
> >> </foreign>
> >>
> >> The current standard doesn't disallow that (it couldn't) but it also doesn't
> >> indicate that processors should handle that case as though the content were
> >> a separate document entity. But that behavior is definitely implicit in the
> >> "it can be replaced with an <object> element" statement.
> >>
> >> That is, if the content of <foreign> can be replaced by an <object> element
> >> that references the content, it follows that an object element can be
> >> replaced by a foreign element that contains the content referenced by the
> >> <object> element.
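
The CDATA-encapsulated case above can be sketched outside XSLT as well. The following Python fragment is only an illustration using the standard library: the function name `parse_embedded_document` is hypothetical, and the `<mi>x</mi>` body is a stand-in for whatever the embedded document would really contain. It treats the text of <foreign> as a separate document entity and parses it on its own.

```python
import xml.etree.ElementTree as ET

# A <foreign> element whose CDATA section holds a complete XML document.
# The <mi>x</mi> content is an illustrative stand-in.
WRAPPED = """<foreign>
<![CDATA[
<?xml version="1.0"?>
<!DOCTYPE mathml PUBLIC "whatever" "mathml.dtd">
<mathml>
  <mi>x</mi>
</mathml>
]]>
</foreign>"""


def parse_embedded_document(foreign_xml: str) -> ET.Element:
    """Parse the text content of <foreign> as a standalone XML document."""
    outer = ET.fromstring(foreign_xml)
    # The CDATA content surfaces as ordinary text. Strip the surrounding
    # whitespace so the XML declaration sits at the very start of the input.
    embedded_source = (outer.text or "").strip()
    # The parser reads the DOCTYPE declaration but does not fetch the
    # external DTD, so no validation happens here.
    return ET.fromstring(embedded_source)


embedded_root = parse_embedded_document(WRAPPED)
```

This only recovers the element structure; deciding whether the recovered markup is valid for the target doctype is a separate question.
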
> >
> > I'm not going to worry about this for 1.2. I just want to protect the
> > behavior we defined as valid for 1.1.
> >
> >>
> >> Cheers,
> >>
> >> Eliot
> >>
> >> --
> >> Eliot Kimber
> >> Senior Solutions Architect
> >> "Bringing Strategy, Content, and Technology Together"
> >> Main: 610.631.6770
> >> www.reallysi.com
> >> www.rsuitecms.com
> >>