OASIS Mailing List Archives

dita message


Subject: Re: [dita] Groups - New Action Item #0023 Embedding version numbers in cat...


The general requirement is to be able to distinguish different versions 
of the DITA-defined DTD and schema components such that multiple 
versions can co-exist within a single access space and such that there 
is no ambiguity about which version a particular document requires.

In addition, there needs to be the ability to refer to DTD and schema 
components using a form that means "the latest version available".

[For the purposes of the rest of this discussion I will use the term 
"schema component" to mean the files that make up a schema, regardless 
of that schema's implementation form: DTD, XSD, etc.]

Schema components are files and therefore are ultimately addressed by 
some form of "file name" or "external identifier", in XML terminology.

In Web terms schema components are resources addressed by URI. However, 
the DITA materials and most users of them retain the pre-XML use of 
public identifiers to point to schema components. Regardless of the 
initial form of reference, all references to schema components must be 
resolved to actual files or storage objects. In the case of the DITA 
materials as provided by OASIS, this translates into the locations of 
files within the DITA distribution.

Another related issue is the relationship between namespace URIs and 
schema component URIs.

It is generally accepted, and appeared to be the consensus of the DITA 
TC, that a namespace identifies an *application*, which is a pure 
abstraction and, as such, cannot in any sense be versioned.

However, at any point in time, the application will be represented by 
some set of concrete implementing artifacts that are versioned. In the 
case of DITA, these artifacts include the various schema components, 
supporting documentation, processor implementations, and so forth.

Thus, the general practice is that namespace URIs are invariant and do 
not include any sort of version identifier. That is, DITA is DITA in all 
its versions. However, there will certainly be many versions of the 
implementing DITA components. [This issue can be clouded by the use of 
numbers to distinguish related but distinct applications, i.e., "DITA 1" 
and "DITA 2". In the context of this discussion "DITA 1" (i.e., the 
current form of DITA on which the TC is working) is a completely 
distinct application from "DITA 2", which will be developed at some 
future time. This subtle distinction can of course lead one to tedious 
existential discussions. For now just take it as given that the 
foregoing is true.]

This distinction is important in part because of the way that XML 
documents can be associated with schemas using the mechanisms defined in 
the XSD specifications and in part because it can help to avoid 
confusion about what is and isn't versioned and therefore how names 
should be constructed.
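
To make the namespace/schema distinction concrete, here is a minimal 
sketch in Python. It reads the xsi:schemaLocation attribute, which pairs 
a namespace URI with a schema component URI. The namespace 
"urn:example:dita" and the example.org URLs are hypothetical and are not 
names defined by the TC; the point is only that the namespace stays the 
same while the schema location carries the version.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical documents: same namespace, different schema versions.
DOC_V10 = """<topic xmlns="urn:example:dita"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="urn:example:dita http://example.org/dita/1.0/schema/topic.xsd"/>"""

DOC_V11 = """<topic xmlns="urn:example:dita"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="urn:example:dita http://example.org/dita/1.1/schema/topic.xsd"/>"""


def schema_locations(xml_text):
    """Return {namespace URI: schema URI} from the root's xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    # The attribute value is a whitespace-separated list of URI pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


v10 = schema_locations(DOC_V10)
v11 = schema_locations(DOC_V11)
# Same application (namespace), different versioned artifacts.
same_namespace = set(v10) == set(v11)
different_schema = v10["urn:example:dita"] != v11["urn:example:dita"]
```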

Another important practical consideration is the general need for access 
to schema components on a local system with locally-defined names and 
locations for files, as opposed to always accessing components from some 
central, public server. This consideration is addressed most generally 
by the OASIS XML Catalog mechanism, which provides standard mechanisms 
for mapping from one form of external identifier to another, ultimately 
to some form that can be resolved by the processor using the catalog 
(i.e., read from the local file system).
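
As a rough illustration of that mapping, the following Python sketch 
reads a tiny catalog and resolves an external identifier to a local 
path. The catalog namespace and the "public" and "system" entry types 
come from the OASIS XML Catalogs specification; the identifiers and 
paths are invented for the example, and the lookup is deliberately 
simplified compared with a real resolver.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: identifiers and paths are made up for illustration.
CATALOG_XML = f"""
<catalog xmlns="{CATALOG_NS}">
  <public publicId="-//EXAMPLE//DTD Sample Topic//EN"
          uri="dtd/topic.dtd"/>
  <system systemId="http://example.org/dita/dtd/topic.dtd"
          uri="dtd/topic.dtd"/>
</catalog>
"""


def load_catalog(xml_text):
    """Return (public_map, system_map) read from catalog XML text."""
    root = ET.fromstring(xml_text)
    public_map = {e.get("publicId"): e.get("uri")
                  for e in root.iter(f"{{{CATALOG_NS}}}public")}
    system_map = {e.get("systemId"): e.get("uri")
                  for e in root.iter(f"{{{CATALOG_NS}}}system")}
    return public_map, system_map


def resolve(public_id, system_id, catalog):
    """Map an external identifier to a local path, else keep the system ID."""
    public_map, system_map = catalog
    if system_id in system_map:
        return system_map[system_id]
    if public_id in public_map:
        return public_map[public_id]
    return system_id  # no mapping: the processor uses the system ID as given


catalog = load_catalog(CATALOG_XML)
local_path = resolve("-//EXAMPLE//DTD Sample Topic//EN", "topic.dtd", catalog)
```

Here local_path is the catalog's "dtd/topic.dtd" entry, which a 
processor would then read relative to the catalog's own location.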

Thus, the situation we find with DITA is:

- The use of both DTDs and XSD schemas for documents

- For DTDs, the use of public IDs and non-absolute system IDs to point 
to declaration sets

- For XSD schemas, the use of absolute and non-absolute URIs to point to 
schema components.

- The shipment of XML catalogs as part of the base DITA implementation 
materials

The questions, then, are what names are needed and how they should be used.

The names needed are, I think, the following:

- For the DITA package itself, packaged as a tree of files, a top-level 
directory that includes a distinguishing version identifier, e.g. 
"dita_1.0/", "dita_1.1/", etc. Within the package, the locations of and 
names of individual files should be changed as little as possible from 
version to version in order to simplify XML catalog maintenance and 
forestall confusion in users and implementors who move from one version 
to another.

- For each version of each schema component:

   - A normative absolute URL that unambiguously names that component, 
distinct from all other components and from all other versions of that 
component

   - A normative public identifier that unambiguously names that 
component, distinct from all other components and from all other 
versions of that component

- For each schema component considered as a "resource" (a collection of 
versions):

   - A normative absolute URL that identifies that resource with the 
implicit semantic of "the latest version available".

   - A normative public identifier that identifies that resource with 
the implicit semantic of "the latest version available".

   - A conventional filename for the component within the tree of files 
provided in the DITA implementation package
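
The set of names listed above can be sketched as a small Python data 
structure. Every pattern here (the example.org base URL, the 
"-//EXAMPLE//" owner prefix, where the version number sits) is a 
hypothetical placeholder; only the "dita_1.0/" style of top-level 
directory comes from the list above. The sketch is meant to show which 
names vary with the version and which do not.

```python
from dataclasses import dataclass

# Hypothetical patterns, chosen only to make the example concrete.
BASE_URL = "http://example.org/dita"
OWNER = "-//EXAMPLE//"


@dataclass(frozen=True)
class ComponentNames:
    versioned_url: str        # names exactly one version of the component
    versioned_public_id: str  # public-ID synonym of versioned_url
    latest_url: str           # "the latest version available"
    latest_public_id: str     # public-ID synonym of latest_url
    package_dir: str          # versioned top-level directory of the package
    relative_path: str        # conventional filename inside the package


def names_for(filename, description, version):
    """Build the full set of names for one version of one component."""
    return ComponentNames(
        versioned_url=f"{BASE_URL}/{version}/dtd/{filename}",
        versioned_public_id=f"{OWNER}DTD {description} {version}//EN",
        latest_url=f"{BASE_URL}/dtd/{filename}",
        latest_public_id=f"{OWNER}DTD {description}//EN",
        package_dir=f"dita_{version}",
        relative_path=f"dtd/{filename}",
    )


v10 = names_for("topic.dtd", "Sample Topic", "1.0")
v11 = names_for("topic.dtd", "Sample Topic", "1.1")
```

Between v10 and v11 only the versioned names and the package directory 
differ; the "latest" names and the path inside the package stay fixed, 
which is what keeps catalog maintenance simple.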

Note that the public IDs and absolute URLs are synonyms, although the 
absolute URLs have the advantage that they could be made resolvable. 
Public IDs by contrast must always be mapped to a system ID.

[Opinion: Public IDs are an anachronism that is not useful in XML and 
should be replaced with absolute URIs in all cases. I realize that some 
systems seem to depend on public IDs and that there is therefore 
resistance to this move. However, I find the use of public IDs 
potentially problematic because of the inherent ambiguity in XML of 
whether to prefer the public ID or system ID when resolving external 
identifiers. URIs present no such problem.]
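
The ambiguity can be seen in a simplified Python model of the "prefer" 
setting from the OASIS XML Catalogs specification, as I understand it: a 
matching system entry always wins, and public entries are skipped when a 
system identifier is supplied and the preference is "system". The 
function name, the maps, and the identifiers are hypothetical, and a 
real resolver handles many more cases.

```python
def resolve_external_id(public_id, system_id, public_map, system_map,
                        prefer="public"):
    """Simplified model of public/system preference in catalog lookup."""
    # A catalog entry that matches the system identifier is always used.
    if system_id is not None and system_id in system_map:
        return system_map[system_id]
    # Public entries are consulted unless the preference is "system"
    # and the document supplied a system identifier.
    if public_id is not None and (prefer == "public" or system_id is None):
        if public_id in public_map:
            return public_map[public_id]
    # Otherwise the system identifier is used as given.
    return system_id


# Hypothetical identifiers for illustration only.
public_map = {"-//EXAMPLE//DTD Sample Topic//EN": "catalog/dtd/topic.dtd"}
system_map = {}

as_public = resolve_external_id("-//EXAMPLE//DTD Sample Topic//EN",
                                "local/topic.dtd", public_map, system_map,
                                prefer="public")
as_system = resolve_external_id("-//EXAMPLE//DTD Sample Topic//EN",
                                "local/topic.dtd", public_map, system_map,
                                prefer="system")
```

The same declaration resolves to two different files depending on the 
preference in effect, which is the problem described above.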

Given the above, the DITA catalog for a given version of the 
distribution must then provide the following entries:

- Mappings from the version-specific public IDs to their corresponding 
system IDs (which could be to the absolute URLs)

- Mappings from the version-specific absolute URLs to their 
corresponding files in the DITA distribution package

- Mappings from the version-independent public IDs to their 
corresponding system IDs (which could be to the absolute URLs)

- Mappings from the version-independent absolute URLs to their 
corresponding files in the DITA distribution package

- Mappings from the DITA-defined namespaces that have governing XSD 
schemas to those schemas.
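
The five kinds of entry listed above could be generated along the 
following lines. This is only a sketch: the identifier patterns, the 
example.org base URL, and the use of a catalog "uri" entry to map a 
namespace to its schema are assumptions made for illustration, and the 
sketch says nothing about how a particular resolver chains one lookup 
into the next.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"
BASE_URL = "http://example.org/dita"  # hypothetical


def build_catalog(version, components, namespaces):
    """Build catalog entries for one version of the distribution.

    components: list of (relative_path, description) pairs
    namespaces: dict of namespace URI -> relative schema path
    """
    catalog = ET.Element(f"{{{CATALOG_NS}}}catalog")

    def add(tag, key_attr, key, uri):
        ET.SubElement(catalog, f"{{{CATALOG_NS}}}{tag}",
                      {key_attr: key, "uri": uri})

    for path, description in components:
        versioned_url = f"{BASE_URL}/{version}/{path}"
        latest_url = f"{BASE_URL}/{path}"
        # 1. Version-specific public ID -> system ID (here the absolute URL)
        add("public", "publicId",
            f"-//EXAMPLE//DTD {description} {version}//EN", versioned_url)
        # 2. Version-specific absolute URL -> file in the package
        add("system", "systemId", versioned_url, path)
        # 3. Version-independent public ID -> system ID
        add("public", "publicId",
            f"-//EXAMPLE//DTD {description}//EN", latest_url)
        # 4. Version-independent absolute URL -> file in the package
        add("system", "systemId", latest_url, path)
    # 5. Namespace -> governing XSD schema (assumed "uri" entry)
    for ns_uri, schema_path in namespaces.items():
        add("uri", "name", ns_uri, schema_path)
    return catalog


catalog = build_catalog(
    "1.0",
    [("dtd/topic.dtd", "Sample Topic")],
    {"urn:example:dita": "schema/topic.xsd"},
)
entry_kinds = [entry.tag.split("}")[1] for entry in catalog]
```

Because the paths inside the package do not change between versions, 
producing the catalog for a later version would only mean passing a 
different version string.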

I have no particular opinion or insight about the details of the 
version-specific identifiers themselves. It doesn't really matter as 
long as the version information is clear.

Cheers,

Eliot
-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8155

ekimber@innodata-isogen.com
www.innodata-isogen.com


