Subject: Re: [docbook] XInclude problem


Hi,

Let's start with a simple example. Assume document A uses an XInclude 
include element to include document B. Assume also that document A 
specifies a DTD DA and document B specifies a DTD DB.

If you enable XInclude processing in Xerces and validate document A, 
document A is validated against the DA DTD and document B is validated 
against the DB DTD. That is, the parser validates each document against 
its own DTD. That is all the validation does.
If you look at the SAX events, or at the DOM you get after parsing, you 
see a document with the XInclude elements resolved: in place of the 
XInclude include element you get the content of document B.
Note that validation is not performed on this merged result but on the 
content of each document, each against its own specified DTD.
The transformation takes the SAX events or the DOM obtained after 
parsing, that is, a document that contains the included content in place 
of the XInclude include elements. That is why the transformation works 
as expected in all cases; it does not depend on Saxon or Xalan.

Now, because each document is validated against its own DTD without 
expanding the XInclude include elements, you need to add these elements 
to the DTD. Whether an XInclude include element in a given place makes 
the document valid or not then depends on how you changed the DTD.
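
As a rough illustration of what adding the element to the DTD can look 
like, here is a minimal sketch of an internal-subset customization for 
DocBook 4.x. It is not the xinclude.mod shipped with oXygen, nor Bob's 
original module: the attribute list is trimmed, xi:fallback is left out, 
and the choice of local.divcomponent.mix is an assumption based on the 
content model quoted in the error messages further down this thread. 
Because it extends only the block-level mix, xi:include would be accepted 
among the paragraphs and lists of a section but not after a nested 
section, which seems to match the error reported for the last include.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
          "http://www.docbook.org/xml/4.3/docbookx.dtd" [
<!-- Sketch only: declare xi:include so the DTD knows the element.
     A fuller module would also declare xi:fallback and more attributes. -->
<!ELEMENT xi:include EMPTY>
<!ATTLIST xi:include
    xmlns:xi CDATA      #FIXED "http://www.w3.org/2001/XInclude"
    href     CDATA      #IMPLIED
    parse    (xml|text) "xml">
<!-- Assumption: extend the block-level mix used inside section.
     Declared here, before the external DTD is read, so it takes
     precedence over the empty default in the DocBook DTD. -->
<!ENTITY % local.divcomponent.mix "| xi:include">
]>
<section>
    <title>Section Title</title>
    <para>Text</para>
    <!-- Allowed here: xi:include sits among the block-level elements. -->
    <xi:include href="f1.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
</section>
```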

Most people, however, expect the document to be validated against the 
initial DTD, that is, the DTD without the XInclude customization, after 
the XInclude include elements have been resolved. Xerces does not do 
that. To get this, one can run the document through a copy XSLT 
stylesheet like the one below:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
     <xsl:output
         doctype-system="http://www.docbook.org/xml/4.4/docbookx.dtd"
         doctype-public="-//OASIS//DTD DocBook XML V4.4//EN"/>
     <xsl:template match="node() | @*">
         <xsl:copy>
             <xsl:apply-templates select="node() | @*"/>
         </xsl:copy>
     </xsl:template>
</xsl:stylesheet>

As a result you get the document with the XInclude elements replaced by 
the included content. That result can be validated against the initial 
DocBook DTD, which is already specified in its DOCTYPE declaration.
(The output will contain a declaration of the XInclude namespace, 
xmlns:xi="http://www.w3.org/2001/XInclude", which needs to be removed 
before validating against the DocBook DTD; with a little more work the 
stylesheet could remove that as well.)
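
One possible way to do that extra work, as a sketch rather than a tested 
recipe: rebuild each element with xsl:element instead of xsl:copy. In 
XSLT 1.0, xsl:copy carries over the namespace nodes of the copied 
element, while xsl:element creates only the declarations the element 
name itself needs. The sketch assumes the DocBook 4.x elements are in no 
namespace. Any xml:base attributes added during inclusion are still 
copied, so the target DTD has to accept them or they have to be filtered 
separately.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
     <xsl:output
         doctype-system="http://www.docbook.org/xml/4.4/docbookx.dtd"
         doctype-public="-//OASIS//DTD DocBook XML V4.4//EN"/>

     <!-- Recreate each element by name instead of copying it, so the
          in-scope namespace nodes (including xmlns:xi) are not
          carried over to the output. -->
     <xsl:template match="*">
         <xsl:element name="{name()}" namespace="{namespace-uri()}">
             <xsl:apply-templates select="@* | node()"/>
         </xsl:element>
     </xsl:template>

     <!-- Attributes, text, comments and processing instructions are
          copied unchanged. -->
     <xsl:template match="@* | text() | comment() | processing-instruction()">
         <xsl:copy/>
     </xsl:template>
</xsl:stylesheet>
```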

Hope things are clear now; if not, please let me know and I will go into 
more detail.

I promised Bob some time ago to raise this problem on the Xerces list, 
as many people expect the parser to validate the document with XInclude 
resolved against the main document DTD, but I have not had the time for 
that yet. Maybe someone from Xerces who is watching can give some more 
details about the possibility of validating the document with XInclude 
expanded against the document DTD.

Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com


Zbyszek Cybulski wrote:
> George,
> 
> I don't understand your explanation. I accept that the trick with 
> xinclude.mod is necessary to enable parsing the DocBook documents with 
> XIncludes. I also understand the Oxygen team had to work around the 
> Xerces weaknesses. But I don't understand the phrase:
> 
> "... is to pass the document through a copy stylesheet that eventually 
> adds also the DocBook doctype declaration and validate that result."
> 
> What do you mean? I thought the modified xinclude.mod shipped with 
> Oxygen was that workaround. What do I need in order to do what you 
> suggested? Can you provide the detailed procedure and additional 
> files (if necessary)?
> 
> Does it mean that Oxygen (or its components) only partially supports 
> XIncludes? Or does it support XInclude only in certain cases? Can you 
> shed more light on it, George?
> 
> Thanks,
> 
> Zbyszek
> 
> George Cristian Bina wrote:
> 
>> Hi,
>>
>> It is not a Saxon or Xalan problem, they are XSLT processors, the 
>> validation is done with an XML Parser which in this case is Xerces.
>>
>> The issue is not specific to oXygen, it has to do with the way Xerces 
>> 2.6.2 processes the DTD plus XInclude validation. Xerces validates the 
>> document *with XInclude elements* against the DTD. It does not do a 
>> validation of the document with the XInclude resolved against the DTD. 
>> That is why one needs the xinclude.mod module, to add the XInclude 
>> elements to the DTD.
>> In order to have the document valid one has to make the DTD accept the 
>> desired document structure. We took Bob's xinclude.mod and changed it 
>> a little to work around some problems found in Xerces.
>> In order to validate the document with XInclude expanded against the 
>> full DocBook DTD (no XInclude customization) the current workaround is 
>> to pass the document through a copy stylesheet that eventually adds 
>> also the DocBook doctype declaration and validate that result.
>>
>> The transformation works with any of Xalan or Saxon as after parsing 
>> Xerces gives the document with XInclude resolved, but it performs 
>> validation on each document against its own DTD.
>>
>> Best Regards,
>> George
>> ---------------------------------------------------------------------
>> George Cristian Bina
>> <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
>> http://www.oxygenxml.com
>>
>>
>> Bob Stayton wrote:
>>
>>> Since the validation and transformation works for me using XInclude,
>>> and it works for you using Saxon in Oxygen, I think that indicates the
>>> files themselves are ok.  I think you should submit this problem with
>>> Xalan to the Oxygen support list.
>>>
>>> Bob Stayton
>>> Sagehill Enterprises
>>> DocBook Consulting
>>> bobs@sagehill.net
>>>
>>>
>>> ----- Original Message ----- From: "Zbyszek Cybulski" 
>>> <z.cybulski@gmail.com>
>>> To: "Bob Stayton" <bobs@sagehill.net>
>>> Cc: <docbook@lists.oasis-open.org>
>>> Sent: Wednesday, July 06, 2005 2:03 PM
>>> Subject: Re: [docbook] XInclude problem
>>>
>>>
>>> Bob,
>>>
>>> There are two errors while validating the master document.
>>> The first error is as follows:
>>> Unexpected element "xi:include". The content of the parent element
>>> type must match
>>> "(sectioninfo?,(title,subtitle?,titleabbrev?),
>>> (toc|lot|index|glossary|bibliography)*,
>>> (((calloutlist|glosslist|itemizedlist|orderedlist|segmentedlist|
>>> simplelist|variablelist|caution|important|note|tip|warning|
>>> literallayout|programlisting|programlistingco|screen|screenco|
>>> screenshot|synopsis|cmdsynopsis|funcsynopsis|classsynopsis|
>>> fieldsynopsis|constructorsynopsis|destructorsynopsis|methodsynopsis|
>>> formalpara|para|simpara|address|blockquote|graphic|graphicco|
>>> mediaobject|mediaobjectco|informalequation|informalexample|
>>> informalfigure|informaltable|equation|example|figure|table|msgset|
>>> procedure|sidebar|qandaset|task|anchor|bridgehead|remark|highlights|
>>> abstract|authorblurb|epigraph|indexterm|beginpage|xi:include)+,
>>> (refentry*|section*|simplesect*))|refentry+|section+|simplesect+),
>>> (toc|lot|index|glossary|bibliography)*)".
>>> This error refers to the line where the last XInclude is placed
>>> (reference to f3.xml).
>>>
>>> The next one:
>>> The content of element type "section" must match
>>> "(sectioninfo?,(title,subtitle?,titleabbrev?),
>>> (toc|lot|index|glossary|bibliography)*,
>>> (((calloutlist|glosslist|itemizedlist|orderedlist|segmentedlist|
>>> simplelist|variablelist|caution|important|note|tip|warning|
>>> literallayout|programlisting|programlistingco|screen|screenco|
>>> screenshot|synopsis|cmdsynopsis|funcsynopsis|classsynopsis|
>>> fieldsynopsis|constructorsynopsis|destructorsynopsis|methodsynopsis|
>>> formalpara|para|simpara|address|blockquote|graphic|graphicco|
>>> mediaobject|mediaobjectco|informalequation|informalexample|
>>> informalfigure|informaltable|equation|example|figure|table|msgset|
>>> procedure|sidebar|qandaset|task|anchor|bridgehead|remark|highlights|
>>> abstract|authorblurb|epigraph|indexterm|beginpage|xi:include)+,
>>> (refentry*|section*|simplesect*))|refentry+|section+|simplesect+),
>>> (toc|lot|index|glossary|bibliography)*)".
>>> This refers to the last line of the master document, the closing section
>>> tag.
>>>
>>> I use Oxygen 5.1, Xalan or Saxon 6.5.3 (both shipped with Oxygen) as
>>> the XSLT processor, on Windows. What's interesting, the validation
>>> using Xalan fails while the transformation works perfectly (the same
>>> for the Saxon validation).
>>>
>>> Master document is as follows:
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
>>>                          "http://www.docbook.org/xml/4.3/docbookx.dtd" [
>>> <!ENTITY % xinclude SYSTEM
>>>     "/d:/Program Files/Oxygen/frameworks/docbook/dtd/xinclude.mod" >
>>> %xinclude;
>>> <!ENTITY % local.common.attrib "xml:base CDATA #IMPLIED" >
>>> ]>
>>> <section>
>>>     <title>Section Title</title>
>>>     <para>Text </para>
>>>     <xi:include href="f1.xml"
>>>         xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>>     <xi:include href="f2.xml"
>>>         xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>>     <section>
>>>         <title>Subsection Title</title>
>>>         <para>Text Text Text Text Text Text Text Text Text Text Text
>>> Text Text Text Text Text Text Text Text Text.</para>
>>>     </section>
>>>     <xi:include href="f3.xml"
>>>         xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>> </section>
>>>
>>> f1.xml, f2.xml, and f3.xml are identical:
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
>>>                          "http://www.docbook.org/xml/4.3/docbookx.dtd">
>>> <section>
>>>     <title>Section Title</title>
>>>     <para>Text 33333333333</para>
>>> </section>
>>>
>>> For the XML parser the following features are enabled in the Oxygen
>>> editor:
>>>
>>> http://apache.org/xml/features/validation/schema
>>> http://apache.org/xml/features/validation/schema-full-checking
>>>
>>> as well as XInclude processing.
>>>
>>> I hope you'll be able to reproduce the problem.
>>>
>>> Best regards,
>>> Zbyszek
>>>
>>>
>>> On 7/6/05, Bob Stayton <bobs@sagehill.net> wrote:
>>>
>>>> If all of these XIncludes are readable files whose root elements are
>>>> section elements, then this document should validate.  I just tried it
>>>> and could not duplicate your problem.  Can you describe how you are
>>>> validating, including the versions of the tools you are using, as well
>>>> as the error message you are getting?
>>>>
>>>
>>>> Bob Stayton
>>>> Sagehill Enterprises
>>>> DocBook Consulting
>>>> bobs@sagehill.net
>>>>
>>>>
>>>>
>>>>
>>>> ----- Original Message ----- From: Zbyszek Cybulski
>>>> To: docbook@lists.oasis-open.org
>>>> Sent: Tuesday, July 05, 2005 12:20 AM
>>>> Subject: [docbook] XInclude problem
>>>>
>>>> Hi all,
>>>>
>>>> I got a problem with XInclude. I have a document, root element is
>>>> <section>:
>>>>
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>> <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
>>>>                         "http://www.docbook.org/xml/4.3/docbookx.dtd">
>>>> <section id="summary_rep"> <!-- this is the root -->
>>>>     <title>Summary Report</title>
>>>>     <para>bla bla bla.</para>
>>>>
>>>>     <xi:include href="F_subscr_info.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_net_info.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_net_status.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_desktop_status.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <section id="summary_chrono">
>>>>           ...
>>>>     </section>
>>>> </section>
>>>>
>>>> and this document validates and XInclude works perfectly. Note that
>>>> included documents have also <section> as the root element.
>>>>
>>>
>>>> But I also have a different document:
>>>>
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>> <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
>>>>                         "http://www.docbook.org/xml/4.3/docbookx.dtd">
>>>> <section id="summary_rep"> <!-- this is the root -->
>>>>     <title>Summary Report</title>
>>>>     <para>bla bla bla.</para>
>>>>
>>>>     <xi:include href="F_subscr_info.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_net_info.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_net_status.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <xi:include href="F_desktop_status.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>>     <section id="summary_chrono">
>>>>           ...
>>>>     </section>
>>>>     <xi:include href="F_mail.xml"
>>>>         xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
>>>> </section>
>>>>
>>>> I can't validate this document though the included one is similar to
>>>> those in the above example. The error message says the last XInclude
>>>> element is unknown and the content of the section must match... and
>>>> here goes the list of allowed elements. Those XIncludes at the top are
>>>> resolved but the one at the bottom isn't. What's wrong with this
>>>> document, and why aren't XIncludes allowed in the place I chose?
>>>>
>>>
>>>> Tks,
>>>>
>>>> Zbyszek
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: docbook-unsubscribe@lists.oasis-open.org
>>> For additional commands, e-mail: docbook-help@lists.oasis-open.org
>>>
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: docbook-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: docbook-help@lists.oasis-open.org
> 

