entity-resolution message

Subject: locating schemas via public ids

At 10:51 2001 01 22 -0800, Lauren Wood wrote:

>Issue 8:
>. . .
>Resolution: Norm will fix the system ids. Norm will request a public ID
>for the schema DTD and for the schema or schemas from the Schemas WG.

Done at [A].  I note that there are some in the Schema WG who are
not as convinced as we are that public ids are a good thing, so we may
need to do some missionary work here.

>The hint for a schemaloc is currently can not be a public ID. This
>should go into the issues list.
>Action on Paul: write up the position that this committee may take.

This appears to be recorded in our issues document [B] as
  Issue 13.  Public identifiers for schema locations.

[A] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2001Jan/0219
[B] http://www.oasis-open.org/committees/entity/issues.html

Below is what I'm suggesting we submit to the XML Schema WG.  We'd
need to send it soon, so please give comments asap.  

Also, to what extent do we need to check with the OASIS AC rep before 
we send this to the XML Schema WG?  And, since OASIS is a W3C member 
but is still an outside organization, should we be involving the XML CG 
in this interchange or not?

My suggested answers:  Lauren should tell Laura what we're doing and
we should plan to cc her on the email we send.  We (Lauren or her
designate--I'm willing if designated) should inform the XML CG and 
cc them on the email to XML Schema.

The OASIS Entity Resolution Technical Committee (OERTC) [1] is
chartered to develop an entity resolution catalog format in XML
(XML Catalog or xmlcat).  This catalog format is meant to cover 
what the SGML Open/OASIS TR9401 [2] Entity Management Catalog did, 
but using XML instance syntax and tailored for use with XML.

Many implementors and users have found public ids and entity
management catalogs to be very useful in practical situations
ranging from individual use to major production environments, and
there is a desire to be able to use such techniques for accessing 
XML resources, especially "public" resources such as published
DTDs and Schemas.  The OERTC has the support of several implementors,
and its work has received interest from the xml-dev community.

During our work, we realized that the current Schema Structures
draft appears to make it impossible to provide schema-locating
hints using anything other than URIs.  Specifically, public
identifiers [3] could not be used as the spec is currently written.

In Structures, 6.3.2 How schema definitions are located on the Web [4], 
it says:
  [xsi:schemaLocation] records the author's warrant with pairs of
  URI references (one for the namespace URI, and one for a hint as
  to the location of a schema document defining names for that
  namespace URI). [xsi:noNamespaceSchemaLocation] similarly provides
  a URI reference as a hint as to the location of a schema document
  with no targetNamespace.

The problem is that each member of schemaLocation and the value of
noNamespaceSchemaLocation is required to be a URI.  Furthermore,
the members are undelimited and separated by spaces, and public
identifiers can contain spaces.
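
To make the delimiter problem concrete, here is a minimal sketch in
Python.  It assumes a processor that simply splits the attribute value
on whitespace and pairs up the tokens; the namespace URI, the public
id, and the function name are hypothetical and chosen only for
illustration.

```python
def split_schema_location(value):
    """Pair up whitespace-separated tokens as (namespace URI, location hint).

    This mirrors a naive reading of the current draft: every token is
    taken to be a URI reference, and tokens are consumed two at a time.
    """
    tokens = value.split()
    return list(zip(tokens[0::2], tokens[1::2]))

# Hypothetical values, for illustration only.
uri_only = "http://example.org/ns http://example.org/ns.xsd"
with_public_id = "http://example.org/ns -//Example//DTD Sample 1.0//EN"

pairs_ok = split_schema_location(uri_only)
pairs_broken = split_schema_location(with_public_id)

print(pairs_ok)      # one pair, as intended
print(pairs_broken)  # the public id is cut apart at its internal spaces
```

With the second value, the public id is split at its spaces, so the
words after the first space are paired off as if they were a second
namespace and hint.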

The XML Catalog (as did TR9401 before it) would allow a user
to locate a resource using all the information that might be 
known about it (name, system id, public id), and certainly schema 
resources will be given public ids (several have already [5]).  But 
this only works if there is some way to include that information in 

Therefore, the OERTC asks [6] that the XML Schema WG make allowances
in schemaLocation for specifying both public and system ids [7].


[1] http://www.oasis-open.org/committees/entity/
[2] http://www.oasis-open.org/committees/entity/9401.html
[3] http://www.w3.org/TR/REC-xml#NT-PubidLiteral
[4] http://www.w3.org/TR/2000/CR-xmlschema-1-20001024/#schema-loc
[5] http://www.oasis-open.org/committees/entity/ident.html#schema
[6] OASIS is a W3C member organization
[7] Not to presume to constrain a solution to the problem, but
    even something such as the following might be acceptable:
    Say that each member--that is currently a URI--is optionally
    delimited by quotes and consists of either a SystemLiteral
    (production 11 in XML 1.0) OR the second half of the
    disjunction in production 75, to wit:
      'PUBLIC' S PubidLiteral S SystemLiteral
    (which, since it contains spaces, would necessarily be quoted).
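
    As a rough sketch only, and not a proposed normative syntax, the
    following Python shows how members in the style of [7] might be
    picked apart.  It assumes that a quoted member is delimited by
    double quotes, that the PubidLiteral and SystemLiteral inside it
    use single quotes, and that an unquoted member is a bare URI.
    Those choices, the function name, and the sample values are all
    hypothetical.

```python
import re

# A member is either a double-quoted string or a run of non-space characters.
MEMBER = re.compile(r'"([^"]*)"|(\S+)')
# Inside a quoted member: 'PUBLIC' S PubidLiteral S SystemLiteral,
# with both literals assumed to be single-quoted.
PUBLIC = re.compile(r"PUBLIC\s+'([^']*)'\s+'([^']*)'$")

def parse_members(value):
    """Return one dict per member with its public id (or None) and system id."""
    members = []
    for quoted, bare in MEMBER.findall(value):
        text = quoted if quoted else bare
        m = PUBLIC.match(text)
        if m:
            members.append({"public": m.group(1), "system": m.group(2)})
        else:
            members.append({"public": None, "system": text})
    return members

# Hypothetical attribute value: a namespace URI followed by a quoted hint.
example = ("http://example.org/ns "
           "\"PUBLIC '-//Example//DTD Sample 1.0//EN' 'http://example.org/ns.xsd'\"")
parsed = parse_members(example)
print(parsed)
```

    Because the quoted member is consumed as a unit, the spaces inside
    the public id no longer disturb the pairing of namespace and hint.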

