Subject: Re: [docbook] Re: [DBX5] Is this a DocBook document?


Norman Walsh wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> / Tobias Reif <tobiasreif@pinkjuice.com> was heard to say:
> | Processing tools such as validators need to be able to find out the
> | main language of a document, so that they can know which schema(s)
> | apply.
> 
> Another view is that schemas are "out of band" for validators. A
> document can be validated (even successfully!:-) under many different
> schemas.

Sure! There's a misunderstanding, we don't disagree AFAICS. The whole 
thing comes from the (hopefully temporary) problems that arise when a 
document doesn't reference a schema (as they did/do with DTD doctype 
declarations). I think a document should *not* have to reference a 
schema, thus I'm trying to find a way for my validator to find out which 
schema(s) to use when validating a document which doesn't reference a 
schema (the same question you posed in your blog AFAICS).

> | The namespace plus a version attribute seems to be a possible solution.
> 
> Versioning is hard. I humbly point to some ongoing work in the TAG
> on this subject: http://www.w3.org/2001/tag/doc/versioning.html

I'll check this, thanks for the link.

But a version attribute doesn't have to solve all problems of versioning:

SVG for example has a version attribute which seems to work well so far 
(same namespace name across versions), eg:

   <svg xmlns="http://www.w3.org/2000/svg" version="1.2">...

   <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
     baseProfile="tiny">...


http://www.w3.org/TR/SVG11/struct.html#SVGElement :

"version = "<number>"

Indicates the SVG language version to which this document fragment conforms.

[...] For SVG 1.1, the attribute should have the value "1.1"."


> | So I propose that the next DocBook XML be in a namespace, and that
> | each DBX document should have a required version attribute on its
> | root element, plus an optional profile attribute.
> 
> Putting DocBook in a namespace is a possibility. It's going to be
> hugely backwards incompatible, though. It's not a step I'd want to
> take lightly.

Absolutely.

> I have mixed feelings about a version attribute. There are lots of
> DocBook documents that are valid under many different versions. If I
> write a 4.2 document today and it's still valid under 4.3, what does
> version=4.2 mean?

It means that the document is written in 4.2, and the assertion holds 
true. If the assertion that the document is written in 4.3 also holds 
true, then this doesn't invalidate the first assertion.

If a document is written in a subset of 5.3 that is also a subset of 
5.2, then that's not a problem for my validator or transformation tool, 
AFAICS.

The problem for which I need a solution is the following:

My validator or transformation tool needs to know what kind of document 
it is dealing with, so it knows if it can handle it.

If the document is in the DocBook namespace and has a version attribute 
with value "5.0" then the validator can validate the document against 
(a) DBX 5.0 schema(s), and the transformation tool can generate XHTML etc.

If the transformation tool gets a document with a version attribute 
higher than 5.0, it can raise an error "I only support DocBook up to 
version 5.0, please try a later version of me."
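
A minimal sketch of that check in Python, using only the standard library. The namespace name is a placeholder I made up for illustration (no DocBook namespace has been decided here), and the function and exception names are hypothetical as well:

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; no namespace name for DocBook is fixed in this thread.
DOCBOOK_NS = "http://example.org/ns/docbook"
# Highest DocBook version this imaginary tool supports.
MAX_SUPPORTED = (5, 0)


class UnsupportedDocument(Exception):
    """Raised when the tool cannot handle the document."""


def parse_version(text):
    """Turn '5.3' into (5, 3) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_document(xml_text):
    """Return the declared version if the tool can handle the document."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced tags as '{namespace}localname'.
    if not root.tag.startswith("{" + DOCBOOK_NS + "}"):
        raise UnsupportedDocument("root element is not in the DocBook namespace")
    version = root.get("version")
    if version is None:
        raise UnsupportedDocument("root element has no version attribute")
    if parse_version(version) > MAX_SUPPORTED:
        raise UnsupportedDocument(
            "I only support DocBook up to version 5.0, "
            "please try a later version of me."
        )
    return version
```

A validator would run the same check first and then pick the schema(s) for the returned version.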

(If the author knows that the doc is written using a subset of 5.3 that 
also is a subset of 5.0, and he wants to process it using tools which 
only support 5.0, then he can change the version attribute to specify 
"5.0" or ask the tools to ignore the version attribute specifying "5.3".)

> | Ideally this should be standardized on the XML level:
> |
> | 1.
> | xmlns=""
> | (exists)
> 
> That's putting DocBook in "no namespace". That's not quite the same as
> xmlns="http://www.oasis-open.org/docbook/" or something like that.

I know.

Just as
   xml:version=""
below,
   xmlns=""
stands for
   xmlns="[value here]"

xmlns="" was meant to mean
"This is an example of how to put the whole document in a namespace. It 
could also be done in a myriad of alternative ways, for example by using 
prefixes or mixing them with default namespace declarations."

Sorry for the confusion.

> | 2.
> | xml:version=""
> | (doesn't exist yet)
> | especially needed if the language is not version 1.0 or if the
> | namespace name doesn't contain version info (eg is the same for all
> | versions, which I prefer).
> 
> I don't think xml:version is very likely.

How can my XSLT find out if it supports the version of DocBook that the 
document is written in? Various languages might choose different names 
for the version attribute (as with profile and SVG's baseProfile), thus 
standardization would help.

When the document doesn't have a DTD document type declaration (no FPI 
etc), then how can tools find out which language is used? What schema(s) 
should my validator apply?

If there will be no "xml:version" attribute, then I still see the need 
for a "version" attribute in DBX5+.

> | 3.
> | optional xml:profile=""
> | (doesn't exist yet)
> | Useful if a subset is used.
> 
> How does profile differ from version and namespace?

The optional profile attribute specifies the profile, which is a subset 
of the language identified by the namespace name and the version attribute.

> What sorts of values does it hold?

SVG for example:

http://www.w3.org/TR/SVG11/struct.html#SVGElement

"baseProfile = profile-name

Describes the minimum SVG language profile that the author believes is 
necessary to correctly render the content. The attribute does not 
specify any processing restrictions; It can be considered metadata. For 
example, the value of the attribute could be used by an authoring tool 
to warn the user when they are modifying the document beyond the scope 
of the specified baseProfile. Each SVG profile should define the text 
that is appropriate for this attribute.

If the attribute is not specified, the effect is as if a value of "none" 
were specified."

http://www.w3.org/TR/SVGMobile/

http://www.w3.org/TR/SVGMobile/#sec-structure

"
<?xml version="1.0" standalone="yes"?>
       <html xmlns="http://www.w3.org/1999/xhtml"
                xmlns:svg="http://www.w3.org/2000/svg">
       <head>
           <title xml:lang="en">Sample XHTML + SVG document</title>
       </head>
       <body>
          <svg:svg width="4cm" height="8cm" version="1.1" 
baseProfile="tiny" >
              <svg:ellipse cx="2" cy="4" rx="2" ry="1" />
          </svg:svg>
       </body>
       </html>
"

"The 'baseProfile' attribute on the outermost 'svg' element must have 
the value "tiny" for SVG Tiny content, and "basic" for SVG Basic 
content. The 'baseProfile' attribute on nested child 'svg' elements is 
ignored. The SVG 1.1 specification states that the 'version' attribute 
of the outermost 'svg' element in SVG 1.1 content must have the value 
"1.1"."


But the profile attribute would be optional, and would only really make 
sense if the DB TC itself would specify profiles (subsets).
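
To make the optional profile concrete, here is a small Python sketch of how a tool might read version and profile from an SVG document. It follows the SVG 1.1 text quoted above: a missing baseProfile is treated as "none". The function name is made up for illustration:

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def describe_svg(xml_text):
    """Return the version and profile declared on the outermost svg element."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}svg" % SVG_NS:
        raise ValueError("outermost element is not svg in the SVG namespace")
    return {
        "version": root.get("version"),
        # SVG 1.1: an unspecified baseProfile acts as if "none" were given.
        "profile": root.get("baseProfile", "none"),
    }
```

A DocBook profile attribute could be read the same way, if the TC ever defines profiles.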

> | Catalog entries could look like this:
> | (SVG is used as example since it is namespaced, uses a version
> | attribute, and also shows the requirement for specification of the
> | profile attribute)
> |
> | <language
> |    name="SVG"
> |    ns="http://www.w3.org/2000/svg"
> |    version="1.1">
> |    <schemas>
> |      <schema
> |        official="yes"
> |        schema-lang="DTD"
> |        location="http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"/>
> |      <schema
> |        official="no"
> |        schema-lang="RNG"
> |        location="http://www.w3.org/Graphics/SVG/1.1/rng/svg11.rng"/>
> |    </schemas>
> | </language>
> 
> This looks more like a RDDL document than a catalog.

OASIS catalogs don't only map online URLs to local paths, but also FPIs 
to schemas. The latter is the reason why I use the name "catalog" for 
resources that map language identifiers (or language descriptions) to 
schemas.

> Catalogs suffer from the limitation that the resolver doesn't really
> know why the resource is being retrieved, so simple URI comparison is
> about all it can do.

My validator needs to know which schema(s) to use when validating XML 
documents that don't reference a schema (eg no doctype declaration 
present). Here's what I might use for now:
(draft of a draft, most likely includes errors and design flaws :)

<?xml version="1.0"?>
<catalog xmlns="http://www.pinkjuice.com/catalog/" version="0.1">
<!--
validator:
1. try to find out lang, look for local schema(s), validate
2. else DTD doctype declaration / OASIS catalog route
-->
   <language name="SVG 1.1">
     <doc>
       <has>namespace-uri(/svg)='http://www...'</has>
       <has>/svg/@version='1.1'</has>
       <has>not(/svg/@baseProfile)</has>
     </doc>
     <schemas>
       <official>
         <schema language="DTD">
           <home>http://</home>
           <local>file://</local>
         </schema>
       </official>
       <inofficial>
         <schema language="RNG">
           <home>http://</home>
           <local>file://</local>
         </schema>
       </inofficial>
     </schemas>
   </language>
   <!--
   put DBX5 in a namespace, require version attr on root
   element, add optional profile attribute
   -->
   <language name="DBX 5.0">
     <doc>
       <has>namespace-uri(/*)='http://...'</has>
       <has>/*/@version='5.0'</has>
     </doc>
   </language>
   <!-- ... -->
</catalog>
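
The lookup this draft describes could be sketched in Python roughly as follows. This is a simplification under stated assumptions: it does not evaluate the XPath expressions in the <has> elements, but encodes the same conditions (namespace, version, attributes that must be absent) as plain data. The DBX namespace is a made-up placeholder, the schema locations are the ones from the SVG example quoted earlier, and all function names are hypothetical:

```python
import xml.etree.ElementTree as ET

# Each entry mirrors one <language> element of the draft catalog.
CATALOG = [
    {
        "name": "SVG 1.1",
        "namespace": "http://www.w3.org/2000/svg",
        "version": "1.1",
        "absent": ("baseProfile",),  # mirrors not(/svg/@baseProfile)
        "schemas": [
            ("DTD", "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"),
            ("RNG", "http://www.w3.org/Graphics/SVG/1.1/rng/svg11.rng"),
        ],
    },
    {
        "name": "DBX 5.0",
        "namespace": "http://example.org/ns/docbook",  # placeholder, not decided
        "version": "5.0",
        "schemas": [],  # the draft lists no schemas for this entry yet
    },
]


def root_namespace(root):
    """Extract the namespace from ElementTree's '{namespace}local' tag form."""
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def matches(entry, root):
    """Check namespace, version and required-absent attributes of the root."""
    return (
        root_namespace(root) == entry["namespace"]
        and root.get("version") == entry["version"]
        and all(root.get(name) is None for name in entry.get("absent", ()))
    )


def find_language(xml_text, catalog=CATALOG):
    """Return the first matching catalog entry, or None if nothing matches."""
    root = ET.fromstring(xml_text)
    for entry in catalog:
        if matches(entry, root):
            return entry
    return None
```

When find_language returns None, the validator would fall back to step 2 of the comment in the draft: the DTD doctype declaration / OASIS catalog route.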

I'm mostly dealing with documents that contain no portions from other 
namespaces, or only tiny ones, which can be pragmatically ignored (in my 
scenarios) when validating; I try to keep things simple.

The above "catalog" mechanism is not a proposal, it's just a draft for a 
local solution.
But it should demonstrate that putting DBX5 in a namespace and requiring 
a version attribute on the root element of each document (that's my 
request/proposal) would be useful for humans and for various types of 
tools such as validators.

If we want to escape reliance on DTD, we need to find new ways for 
providing the information that we provided via DTD doctype declarations. 
Namespace name and version attribute will also have the advantage of 
being accessible from XSLT, thus are also useful in documents which do 
have a doctype declaration (and ns/version can also be used with inlined 
fragments).

Tobi

-- 
http://www.pinkjuice.com/


