ubl-dev message

Subject: SV: SV: [ubl-dev] UBL- just how reliable are XSD based syntax checks?
From: "Bryan Rasmussen" <BRS@itst.dk>
To: "G. Ken Holman" <gkholman@CraneSoftwrights.com>,<ubl-dev@lists.oasis-open.org>
Date: Tue, 13 Feb 2007 11:28:15 +0100
I can't be 100% certain, but I suppose what is meant is that CAM can be used
to normalize namespace declarations: for example, that all declarations are
moved to the document element, that conflicting prefixes (in scope or out of
scope?) are normalized to the first occurrence of the prefix, and that if
there are two uses of xmlns="different namespaces here", the first default
namespace declaration remains the default namespace while the second has a
prefix generated for it? Basically, I was hoping it meant making the XML more
readable with a quick pass through a generic CAM processor.

So that the example from the spec

<?xml version="1.0"?>
<!-- initially, the default namespace is "books" -->
<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6'>
    <title>Cheaper by the Dozen</title>
    <isbn:number>1568491379</isbn:number>
    <notes>
      <!-- make HTML the default namespace for some commentary -->
      <p xmlns='http://www.w3.org/1999/xhtml'>
          This is a <i>funny</i> book!
      </p>
    </notes>
</book>

becomes

<?xml version="1.0"?>
<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6'
      xmlns:cam1='http://www.w3.org/1999/xhtml'>
    <title>Cheaper by the Dozen</title>
    <isbn:number>1568491379</isbn:number>
    <notes>
      <cam1:p>
          This is a <cam1:i>funny</cam1:i> book!
      </cam1:p>
    </notes>
</book>
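
To check that a rewrite like this changes only the prefixes and not the names, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree as the parser (my choice for illustration, nothing to do with CAM itself). The helper name expanded_names is hypothetical. It lists the {namespace}local-name of every element in both documents and compares the two lists.

```python
import xml.etree.ElementTree as ET

# The example from the spec, with a nested default namespace declaration.
ORIGINAL = """<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6'>
    <title>Cheaper by the Dozen</title>
    <isbn:number>1568491379</isbn:number>
    <notes>
      <p xmlns='http://www.w3.org/1999/xhtml'>
          This is a <i>funny</i> book!
      </p>
    </notes>
</book>"""

# The same document with all declarations moved to the document element
# and a generated prefix (cam1) in place of the second default namespace.
NORMALIZED = """<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6'
      xmlns:cam1='http://www.w3.org/1999/xhtml'>
    <title>Cheaper by the Dozen</title>
    <isbn:number>1568491379</isbn:number>
    <notes>
      <cam1:p>
          This is a <cam1:i>funny</cam1:i> book!
      </cam1:p>
    </notes>
</book>"""


def expanded_names(xml_text):
    """Return the {namespace}local-name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# Both documents resolve to the same expanded names; only the prefixes differ.
print(expanded_names(ORIGINAL) == expanded_names(NORMALIZED))  # prints True
```

A namespace-aware parser already sees the two forms as the same set of element names, so any gain from the rewrite would be in readability for humans, not in meaning.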

Cheers,

Bryan Rasmussen

-----Original Message-----
From: G. Ken Holman [mailto:gkholman@CraneSoftwrights.com]
Sent: 12 February 2007 18:21
To: ubl-dev@lists.oasis-open.org
Subject: RE: SV: [ubl-dev] UBL- just how reliable are XSD based syntax
checks?


At 2007-02-12 09:43 -0700, David RR Webber (XML) wrote:
>Good comments on namespaces - this definitely mirrors my experiences -
>and nice links and examples. +1 here.

So you are applauding the use of namespaces, which are used liberally 
and usefully in UBL XML documents, but then you follow up with...

>What CAM does to fix conflicting default namespace declarations - is not
>change the XML itself - but creates a better reference structure with
>the conflicts resolved there.

This sounds like "translation" to me, and not working with original
documents.

And I'm not sure what you mean by "conflicting default namespace 
declaration" ... there is no definition of that phrase in the XML 
Namespaces specification (I just did a word search in 
http://www.w3.org/TR/2006/REC-xml-names-20060816 and found 
nothing).  The behaviour of default namespace declarations is 
well-defined ... an XML instance is or is not well-defined.  If the 
instance is well-defined, then any default namespace declarations are 
not "conflicting" (but then I don't know what you mean by "conflicting").

Can you specify this precisely, please?  What would be an example of 
such a conflict?

>This then acts as a "roadmap" and
>override to ensure accurate processing of XPath as the XML content
>itself is traversed.

When you say "XML Content" do you mean the original XML instance or a 
translation of that instance to a new instance?  Any such translation 
should go on behind the curtain so that all that is exposed is the 
original XML instances.

>Basically humans can glance at the XML and intuitively "know" what
>namespace the default belongs to at what point in the hierarchy - but
>of course java code - cannot make that call so easily - so jaxen
>especially - needs fully qualified content - you cannot mix and match
>default and non-default namespaces.

Then that is a problem with a tool, not with XML technology.  A 
well-formed instance is specified ... XSLT tools work with 
well-formed XML instances regardless of any valid use of 
namespaces.  As far as I know, SAX and DOM interfaces are also 
well-defined in support of the valid use of namespaces.
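
As an illustration of that last point, here is a minimal sketch, assuming Python's standard xml.dom.minidom as a stand-in for a namespace-aware DOM implementation (an assumption made for this example; it is not the Java tooling discussed above). The helper name namespace_of_each_element is hypothetical. It asks the DOM which namespace each unprefixed element belongs to when default and nested default declarations are mixed.

```python
from xml.dom import minidom

# An instance that switches the default namespace part-way down the tree.
DOC = """<book xmlns='urn:loc.gov:books'>
  <title>Cheaper by the Dozen</title>
  <notes>
    <p xmlns='http://www.w3.org/1999/xhtml'>This is a <i>funny</i> book!</p>
  </notes>
</book>"""


def namespace_of_each_element(xml_text):
    """Return (local name, namespace URI) for every element, in document order."""
    dom = minidom.parseString(xml_text)
    return [(el.localName, el.namespaceURI)
            for el in dom.getElementsByTagName("*")]


for local_name, namespace_uri in namespace_of_each_element(DOC):
    print(local_name, namespace_uri)
```

The parser resolves every unprefixed name against the default declaration in scope at that point, so no human judgement and no rewriting of the instance is involved.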

>Also we've seen instances where
>the same default namespace in-line declaration is made in multiple
>places - this "works" but it's not correct - they should be unique.

Can you please give an example?  If the document is not well-formed, 
then it isn't XML.  If it is well-formed, then an XML tool should be 
able to work with it untouched.  Passing on the processing burden to 
users by forcing translation does not seem to serve them well.
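
For concreteness, here is a minimal sketch of what I understand the scenario to be, assuming Python's standard xml.etree.ElementTree and an invented vocabulary (the namespace urn:example:orders and the element names are hypothetical, not taken from any real instance). It repeats the same default namespace declaration in several places and compares the result with a copy that declares it once.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: the same default namespace is declared repeatedly.
REPEATED = """<order xmlns='urn:example:orders'>
  <line xmlns='urn:example:orders'><qty>1</qty></line>
  <line xmlns='urn:example:orders'><qty>2</qty></line>
</order>"""

# The same instance with the default namespace declared only once.
DECLARED_ONCE = """<order xmlns='urn:example:orders'>
  <line><qty>1</qty></line>
  <line><qty>2</qty></line>
</order>"""


def element_names(xml_text):
    """Return the {namespace}local-name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# Both parse without error and yield identical expanded names.
print(element_names(REPEATED) == element_names(DECLARED_ONCE))  # prints True
```

If this is the kind of instance meant, the repeated declarations are redundant but the document is still well-formed, and a namespace-aware tool reads it the same way as the shorter form.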

>Paradoxically the worst XML I've seen is the one I've had to work with
>for two years as part of Grants.gov applications!  This falls under the
>category of - if people can mess it up - they will...

So they aren't publishing well-formed XML documents?  Then these 
documents shouldn't be called "XML".

>So - with all this experience - I prefer simple XML if I can get it!

There's that "simple" term again.

>But as you say - how to relate this back to the original definitions
>and vocabulary?  The <Reference> section in the CAM template is
>designed to provide that using UID references to the (UBL) domain
>vocabulary items.  So you don't need the namespace equating.  This is a
>win on multiple levels - including thinner XML payload sizes - faster
>processing, etc.  And also - being able to use non-English localization
>tagnames / or abbreviations - but equate them exactly to their English
>equivalents in the standard [now there's a radical concept... ; -)]

But the UBL names, while they happen to be English, are meant to be 
mnemonic ... a UBL instance has to use those names at the document 
level for document-level compatibility (I'm not talking model-level 
compatibility).  One doesn't see the English terms in the Java 
language being made available in a translated localized version of 
the Java language ... and for the same reason one doesn't see 
localized tag names in the work products of UBL Localization 
committees.  A UBL instance uses the established mnemonics for 
interchange purposes at the document level.

User interfaces are welcome to provide translated localized labels of 
the information, but the instances those interfaces create need to be 
validated at the document level to be called "UBL instances".

>We still need to do some work here though - we know it can work - but
>CEFACT is still visiting on the XSD for CCTS in registry - once that is
>definitive - then we'll be able to move forward on defining the formal
>normative specification for this.

In my opinion it isn't "working" if it involves translation that is 
exposed to the user or that explicitly burdens the user to work with 
the information.  You can do what you want if the user doesn't see 
it, but a "translated" UBL instance would be model compatible but not 
interchange compatible.

I'm responding to your post because I think new users of XML and UBL 
may get confused by some of the things you've said.  I'm trying to be 
precise about terminology, which is why I brought up the thread on 
Steve's post.  In your post here you are bringing up terminology that 
I'm not familiar with when you say "conflicting default namespace 
declarations".  If there isn't a formal definition of this, then it 
shouldn't be the basis of justification for extra burden.

Using precise terminology will be important for readers of the 
archive and new users of this technology, so they don't get confused 
when different participants use the same term to mean different things.

I hope this is considered helpful.

. . . . . . . . . . . . . . Ken

--
World-wide corporate, govt. & user group XML, XSL and UBL training
RSS feeds:     publicly-available developer resources and training
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/u/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Aug'05  http://www.CraneSoftwrights.com/u/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal

