
oasis-member-discuss message

Subject: OASIS Staff comment on ASIS

OASIS Staff comment on ASIS

  Artifact Standard Identification Scheme for Metadata 1.0
  Approved TAB Document 30 January 2006

Staff appreciates the time that the TAB has put into this document,
but as we've previously shared with the TAB, we reached the same
conclusion in December that many other commenters have expressed:
this document is not complete or finished enough to be enacted as
policy. If any mandatory rules are enacted, they should be in the
form of much shorter and clearer guidance.

With gratitude for the substantial amount of prior work on this
artifact and file naming issue, we offer the following comments as
input to any revision and redesign process.


[57] and passim in page footers

ASIS: "Copyright (c) OASIS Open 2005. All Rights Reserved."

If this document is published in 2006, the copyright dates
should be current ("2006") in line 57 and in the footers.

[129-130] TC-defined names approved by the OASIS TC Administrator

ASIS: "TC-defined unambiguous and descriptive names
are also permitted, if approved by the OASIS TC Administrator"

We question whether pre-approval is logistically feasible. According
to this ASIS draft, TCs may use either the structured (componentized)
names or the freeform (TC-defined) name pattern (alphanumeric + hyphen).
Faced with this choice, we think TCs will often elect to use
the TC-defined naming scheme -- because it offers what they will
desire as a filename.  However, we want TCs to be able: (a) to
freely assign filenames [said to be derived from artifact names
or artifact identifiers] and (b) to upload those files to the Web
server without any OASIS TC Administrator approval process.
Insinuating the TC Administrator into a "name approval" process
will not scale.

In exceptional cases, TC Admin may have to delete (and replace with
a disambiguation page) a file that is self-loaded to a highly 
inappropriate or misleading URI, but in most cases we expect TC
editors to be able to self serve.  A staff prior-approval loop
is unattractive, as it is too likely to be non-scalable and a
source of delay.

[230] Definition and function of Artifact Identifier string

ASIS: "Artifact Identifier: A string used to uniquely identify a
      particular artifact." [230]
      "TC-defined unambiguous and descriptive names" [129]
      "unambiguous names" [141]

What's the difference between "uniquely identify" and
"identify a [unique] particular artifact" ?

We think we understand the ASIS goal of providing a string to
"uniquely identify" a particular artifact, but it does not seem
that the Artifact Identifier string in all cases achieves
this.  Whereas URIs identify distinct resources, the
Artifact Identifier, sometimes used at a higher level of
abstraction than files which physically instantiate artifacts
at the machine level, introduces a fuzzy notion about
the relationship between the identifier string and the
predictable representation.

Example from the given pattern:


Derived filenames (see line 480: "The filename MUST be the
ArtifactIdentifer followed by the optional literal period
and form") are, among others:

saml-v7.0-spec-wd-02-de.zip  /* includes schemas and wsdls */
saml-v7.0-spec-wd-02-de.tar.gz  /* includes UML diagrams also */

ArtifactIdentifier: saml-v7.0-spec-wd-02-de

Sample use case: someone reports a typo "salm" for "saml"
in "saml-v7.0-spec-wd-02-de."  Where do we look?  It turns out
that the typo, introduced manually, is only in the ".odt"
artifact. That artifact of interest arguably is not uniquely
identified by the string "saml-v7.0-spec-wd-02-de".

The Artifact Identifier string "saml-v7.0-spec-wd-02-de" thus
seems to fall short of providing unique identification for an
artifact.  We suggest that the definitions should be revised and/or
that further justification be given to the notion of
Artifact Identifiers as distinct from filenames -- which are
used directly to compose URIs in the general case.
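
The point can be illustrated with a minimal sketch in Python. The
helper names and the list of "form" suffixes are assumptions made
for illustration only (the ".odt" filename is inferred from the use
case above); nothing here is taken from the ASIS draft itself. The
sketch strips the trailing "." + form from each filename and groups
the files by the identifier that remains.

```python
from collections import defaultdict

# Assumed set of "form" suffixes, for illustration only.
KNOWN_FORMS = (".tar.gz", ".zip", ".odt")

def identifier_of(filename: str) -> str:
    """Drop the trailing '.' + form, leaving the Artifact Identifier."""
    for form in KNOWN_FORMS:
        if filename.endswith(form):
            return filename[: -len(form)]
    return filename

def group_by_identifier(filenames):
    """Map each Artifact Identifier to the files that share it."""
    groups = defaultdict(list)
    for name in filenames:
        groups[identifier_of(name)].append(name)
    return dict(groups)

files = [
    "saml-v7.0-spec-wd-02-de.zip",
    "saml-v7.0-spec-wd-02-de.tar.gz",
    "saml-v7.0-spec-wd-02-de.odt",
]
groups = group_by_identifier(files)
for identifier, members in groups.items():
    print(identifier, "->", len(members), "files")
```

All three files collapse onto one identifier, so the identifier
alone cannot say which file holds the reported typo.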

[275-276]  Date format
[378-379] format YYYYMMDD

ASIS: "Date: The date of the artifact, in the format YYYYMMDD."

The ASIS document itself displays "30 January 2006" as the
publication date.  We are puzzled as to the motivation for
the YYYYMMDD date format, since the 'Date' metadata element does
not occur in the Artifact Identifier (OASISDefinedName format)
 -- nor in the URN document-id, nor in the schema name, nor
in the OASIS Standard.  So: Date is said to constitute required
metadata, but the document provides no indication of a context
within which that datum would be encoded.

Line 378 "Each artifact MUST have an associated string value
for the Date of the artifact." does not indicate where or
by what means "YYYYMMDD" is to be "associated" with an
artifact.  Unless convincing use cases can be cited to
justify this (uncommon) format, we think TCs should be
able to use date formats of choice, or one of the
standard formats, as context demands, per ISO.
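
For comparison, a minimal Python sketch (standard library only; the
variable names are ours, not from ASIS) renders the ASIS publication
date in the compact YYYYMMDD form, in the hyphenated YYYY-MM-DD form,
and in the display form used on the ASIS title page. It assumes the
default (English) locale for the month name.

```python
from datetime import date, datetime

published = date(2006, 1, 30)  # "30 January 2006"

compact = published.strftime("%Y%m%d")  # YYYYMMDD, as ASIS requires
extended = published.isoformat()        # YYYY-MM-DD
# Month name via %B depends on the locale; the default locale is assumed.
display = f"{published.day} {published.strftime('%B')} {published.year}"

# The compact form parses back to the same calendar date.
round_trip = datetime.strptime(compact, "%Y%m%d").date()

print(compact, extended, display)
```

The three strings carry the same information; the open question is
where, and in what context, any one of them is to be recorded.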

[283-284] PDF and HTML forms

ASIS: "... when submitting a Public Review package, the
specification(s) must be provided in both Adobe Acrobat
(pdf) and HTML forms as required by [OASIS TCP]."

While the central goal of ASIS apparently is not to levy new
requirements against the current TC Process document
(but rather, to comply), we take this occasion to
express agreement with other reviewers who have stated
a desire to make XHTML the (sole) normative format for
OASIS specifications.  Templates are currently provided for
XHTML "transitional."  We feel that some of the most
important goals for automation, spec QA, and searching will
not be attainable (feasibly) unless TC specifications are
published in XHTML, spec-XML, or equivalent format.

[351] OASIS Document Templates

ASIS: "The OASIS Document Templates for text specifications
SHALL be updated to include the metadata..."

We question whether this directive belongs in the document:
OASIS has templates for some of the proposed artifact types,
but not for others. However, staff will bring and keep all
templates into alignment with all policies and guidelines
OASIS issues, including any part of this document that
may become policy.

[399] Artifact Identifiers and unique spellings

ASIS: "TCs SHALL NOT create two or more Artifact Identifiers
that differ only with respect to case."

We suspect that this rule needs to be re-written to provide
scope:  e.g., "... within a given directory."  The origin of this
rule was apparently (?) a concern that while the OASIS servers
all handle mixed case faithfully [Unicode], rare situations
might arise in which data could be transferred to some system
that used non-case-respecting software, possibly resulting in
overwritten files or user confusion.  Virtually all modern
filesystems store information in case-sensitive [Unicode]
representations, but not all applications adhere.
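
If the rule were scoped to a single directory, a check along the
following lines would be enough. This is a sketch in Python under
that assumption; the function name and the sample directory listing
are hypothetical.

```python
from collections import defaultdict

def case_collisions(identifiers):
    """Return groups of identifiers that differ only with respect to case."""
    folded = defaultdict(set)
    for ident in identifiers:
        folded[ident.casefold()].add(ident)
    return [sorted(group) for group in folded.values() if len(group) > 1]

# Hypothetical listing of one directory, used only for illustration.
directory = ["saml-v7.0-spec-wd-02-de", "SAML-v7.0-spec-wd-02-de", "catalog"]
collisions = case_collisions(directory)
print(collisions)
```

Run per directory, such a check would not object to identical
names that live at different URIs.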

More generally, however, this rule raises the question as
to whether ASIS is attempting to ensure that no TC can create
two or more (character-wise) different artifacts having the
same filename -- e.g., 'CATALOG' files for successive stages
of a specification, each bearing the filename 'catalog' and
living in a version-labeled directory.  While the URIs would
obviously be different, the filenames would be identical.
We think identical filenames at different URIs is acceptable
as well as expected.

ASIS seems to want all Artifact Identifiers to serve as unique
identifiers, and to require derivation of filenames from
Artifact Identifiers. In practice, we cannot believe
that TCs will want to change the spelling of every filename
with every new release. Please see the comment at line 480.

[409] and [746] underscore

ASIS: "... underscore (Low Line)..."

We understand that the TAB's draft documents moved back and
forth on the use of underscore as an allowable or inadvisable
name character.  Respecting the legitimate differences of
opinion (taste) and perception of the tradeoffs, we do not
foresee that the adoption of a restricted character inventory
for names without underscore will greatly change the equation:
users are now accustomed to including a range of characters in
filenames that are commonly deprecated in various application
contexts: space, comma, ampersand, parenthesis, tilde,
pound-sign, dollar-sign, square-bracket, plus-sign, etc.

These characters, and all control characters, are disallowed
in the ASIS draft as filename characters because they are
known to create cascading problems in data fidelity, at least
under some common conditions.

Similarly, because underscore (Low Line) is an ambiguous
character, indistinguishable from other non-displayed BLANK
characters in certain visual contexts, we do not think it
should be allowed in URIs for artifacts (hence, not in filenames).
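
A restricted inventory of this kind is simple to check mechanically.
The Python sketch below assumes an allowed set of ASCII letters,
digits, hyphen, and period; that set is our assumption for
illustration and is not quoted from the ASIS draft. The function
name and the second sample filename are likewise hypothetical.

```python
import re

# Assumed allowed inventory: ASCII letters, digits, hyphen, and period.
ALLOWED = re.compile(r"[A-Za-z0-9.-]+")

def disallowed_characters(filename: str) -> list[str]:
    """Return the distinct characters outside the assumed inventory,
    in order of first appearance."""
    seen = []
    for ch in filename:
        if not ALLOWED.fullmatch(ch) and ch not in seen:
            seen.append(ch)
    return seen

print(disallowed_characters("saml-v7.0-spec-wd-02-de.zip"))
print(disallowed_characters("my_spec (draft).pdf"))
```

Under this assumed inventory, underscore is reported alongside
space and parenthesis.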

[424-425] OASISdefinedName an option, not a requirement

ASIS: "TCs MAY use a TCdefinedNames (which need not follow
the rules for OASISdefinedNames) subject to approval by the
TC Administrator."

We are concerned that the current ASIS draft as written does
not clearly reveal to the reader that use of the OASISdefinedName
is an option, not a requirement.  As a simple example, lines
466-467 should say "... if the TC elects to use the
OASISdefinedName, it MUST contain..."

More generally, the document needs to be much clearer about
the degree of "requirement": The captions at lines 35-38 and
115-118 suggest that the entire document is offered as a
recommendation; however, the language and tone throughout
the draft is that of mandate, not recommendation.

[427-428] Constructing Specific Artifact Identifiers

ASIS: "The following format SHALL be used for OASISdefinedNames.
This format includes selected metadata in a consistent format;
variations for specific purposes are described..."

We appreciate the tremendous amount of work that went into 
identifying the requisite metadata to be captured for each TC
artifact [338-339].  It is unclear, however, what additional
benefits are to be gained from creating and citing the
concatenated string in addition to each separate component
which is said to be an "associated" datum; we recommend
consideration of dropping this requirement.

[421] Alternation between OASISdefinedName and TCdefinedName

ASIS: "An ArtifactIdentifier MUST be either an OASISdefinedName
or a TCdefinedName."

The draft ASIS document [e.g., line 232] identifies a goal of
using structured names (concatenated sub strings) in order to
provide a basis for parsing such artifact identifiers. We
understand that goal, but feel that the value of such parsing
is compromised by allowing TCs to *sometimes* use componentized
forms [OASISdefinedName] and sometimes, to use TCdefinedName
instead. For some artifact types it will be difficult for a
machine to determine whether a given ArtifactIdentifier is by
intent a TCdefinedName or a possibly malformed OASISdefinedName.

Consideration should be given to design of a unified approach;
if this proves intractable or undesirable, mechanisms should be
specified to permit the encoding of hints about the type of
ArtifactIdentifier being used, as an aid to parsing and other
machine processes.
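
The parsing difficulty can be sketched in Python. The regular
expression below is a hypothetical approximation of the structured
pattern, inferred only from the sample strings in these comments; it
is not the ASIS grammar, and the group names are ours.

```python
import re

# Hypothetical approximation of the structured name pattern, inferred
# from the sample strings in these comments; not the actual ASIS grammar.
STRUCTURED = re.compile(
    r"(?P<product>[a-z]+)-v(?P<version>\d+\.\d+)"
    r"(?:-(?P<type>spec|schema|wsdl))?"
    r"-(?P<stage>[a-z]{2})-(?P<revision>\d{2})"
    r"(?:-(?P<language>[a-z]{2}))?"
)

def classify(identifier: str):
    """Return ('structured', components) or ('tc-defined-or-malformed', None)."""
    m = STRUCTURED.fullmatch(identifier)
    if m:
        return "structured", m.groupdict()
    # The string carries no hint of intent, so a parser cannot tell a
    # deliberate TC-defined name from a structured name with an error in it.
    return "tc-defined-or-malformed", None

for name in ("saml-v7.0-spec-wd-02-de", "xacml-v3.0-wd-03",
             "saml-v7.0-spec-wd-2", "saml-2.0-AuthnContext-schema-os"):
    print(name, "->", classify(name)[0])
```

The last two strings fall into the same bucket even though one
looks like a slip and the other looks intentional; an encoded hint
about the identifier type would remove that guesswork.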

[334-335] tcShortName hyphen removal

ASIS: "TC Short Name: The short name assigned by the TC
Administrator to the Technical Committee, with any
hyphens eliminated." 

We do not think the elimination of hyphen in tcShortName
is motivated or required in the current design, with the
possible exception of its use in connection with URNs, per
RFC 3121. In other use cases, and especially in
connection with URIs rooted at http://docs.oasis-open.org/,
we think the tcShortName should include the hyphen.
Since [line 433] "The tcShortName is not included" in the
format for OASISdefinedName ("as it can be determined
uniquely from the product"), the possible benefits for
ease of parsing are small compared to the difficulties
caused by forking the canonical spelling of assigned TC
short names.

[348-349]  Additional metadata

ASIS: "The Technical Committee MAY define additional
metadata for its artifacts, provided those metadata names
and values are approved by the TC Administrator."

The context for (formal) usage of the "additional metadata"
needs to be clarified such that we know what usages are
prohibited, or possibly prohibited, if they fail to meet
the approval of the TC Administrator.  Surely the document
cannot prohibit the definition of new artifact metadata
by TCs (per se).

[318] descriptive name

ASIS: "TC-defined unambiguous and descriptive names" [129]
      "descriptive name defined by the TC for the artifact" [318]
      "The descriptive name of the specification" [321]

The discussion about "TC defined Name" in 317-333 is not clear.
Line 129 says "TC-defined unambiguous and descriptive names
are also permitted," suggesting that TC-defined names and
descriptive names are two different things.  However, lines
317-318 seem to imply that a 'TC defined Name' *is*
'A descriptive name defined by the TC for the artifact.'

In the example:

* what is a "container", exactly?  A directory?
* WSRP 1.0 and SAML 2.0 -- are these "descriptive names" ?
* what about 'saml-2.0-AuthnContext-schema-os' - is that
  part of any URI?

[437-440] Omission of "stage" component

ASIS: "A value for Stage and the following hyphen separator MUST
be included except in the following cases: - when ArtifactType is
schema (or) when ArtifactType is wsdl, in which case a value
for Stage MAY be omitted."

We do not understand the justification for special treatment of
"schema" and "wsdl" artifact types; other types (e.g., catalog)
might be even better candidates, were the goal to alleviate the
burden of encoding a "stage" component. If the design for
structured names is retained and mandated, exceptions like this
should be resisted.

[446] Use of 'form' component

ASIS: "A value for Form SHALL be included for files and
final URI components that resolve to a specific artifact,
and SHOULD NOT otherwise be present."

We do not see the benefit or necessity of these rules: there
are well-established use cases for "final URI components"
which end in "slash" or other character strings not matching
literal "." + "form".  Certainly RDDL documents and other
namespace documents are one class of exception, but we
envision others as well.

[480] Derivation of filenames from ArtifactIdentifiers

ASIS defines a close relationship between an artifact identifier
(string) and the filename associated with the artifact: the
"filename MUST be the ArtifactIdentifer followed by the
optional literal period and form".

We do not think this is necessary or necessarily desirable.
We prefer a scheme in which the URI path portion 'above' the
filename  reflects key metadata elements -- which allows TCs
greater liberty in assigning filenames.

Thus, TCs should be free to use structured (componentized)
names as filenames (based upon OASISDefinedName or TCDefinedName),
but they should not be required to do so: filenames should
not be required to "be" the ArtifactIdentifer followed by...

[480] "The filename MUST be the ArtifactIdentifer followed by the
      optional literal period and form"
[520] "The filename MUST be the ArtifactIdentifer followed by the
      optional literal period and form"
[529] "The filename MUST be the ArtifactIdentifer followed by the
      optional literal period and form"
[537] "The full ArtifactIdentifer followed by the optional literal
      period and form MUST be the filename."

[482] Document titles

ASIS: "The filename MUST bear a reasonable and descriptive
relationship to the document title."

Section 6.3 "Other Artifact Filenames" seems to concern
artifact types other than prose specifications and other
prose documents.  For example (we assume) catalog, schema,
wsdl.  But such documents frequently do not have "titles"
as such.  We do not think users will be able to apply
the rule in line 482 in such cases.

Example: does the filename "b-2.xsd" bear a "reasonable
and descriptive relationship to the document title"? See:

[491-493] Default Web Pages

ASIS: "6.4.1 Default Web Pages for Product URIs: The relevant
required metadata for an artifact MUST be maintained at the
default index page for the http scheme URI for each product
and productVersion to facilitate search and retrieval. For
each such index page, an XHTML-compliant meta element MUST
be included..."

The prescriptions in 6.4-6.5 should be simplified to indicate
that metadata must be associated with each artifact in
a manner appropriate to the artifact type, in accordance with
the OASIS-provided template(s) for each type, located
at http://docs.oasis-open.org/templates/ . Any revised (ASIS)
specification should include the link for the OASIS template
page in this and any other place it's mentioned.  

[570-571]  Schema sub-types

ASIS: "It is RECOMMENDED that only the following sub-types
be used and only when the type is schema: dtd, rng, and xsd."

We do not think a blanket recommendation should be given
deprecating sub-types other than the three named.  We 
fully expect that new "types" will become common as
schema languages mature (e.g., DSDL languages).

Further, this passage raises the broader question of the usefulness
of "schema", since in its typical illustrated use cases, no 
distinctions are made between schema types, or indication
as to whether a DTD is based upon SGML or XML, etc.  Since
the file extension itself may be of negligible value in
conveying information about a schema type [.xsd, .rng,
 .dtd, .rnc, .<other> files in a .ZIP archive], it seems
that this design needs further work.

This touches upon the matter of ArtifactIdentifiers as
useful for unique identification of an artifact.  A
filename matching an ArtifactIdentifier for a schema
might be, in structured format:

examples: xacml-v3.0-wd-03.xsd

When the "." + "form" is dropped, to meet the
ArtifactIdentifier format pattern, we are
left with one identifier (xacml-v3.0-wd-03)
that matches four different artifacts instantiated
(quite differently) in four different files.  Hmmm...

[594-596] Namespace URIs

ASIS: "OASIS namespace declarations pursuant to [XML NS 1.1]
or [XML NS 1.0] MAY be defined as URIs using the http
scheme as an alternative to the URN form defined in
Section 6.1."

That should be "7.1". In view of the complexities involved in the
use of URNs (no common resolution mechanisms), we think
namespaces should be defined as URIs and generally should be
DNS/HTTP resolvable.  Why not?  What value is a 404?  In any case,
when an HTTP scheme URI  namespace has been declared by a TC, it
should be reserved for use [by TC Admin] as a location for a
namespace document or the equivalent; no other kind of resource
should be accessible by dereferencing that URI. Dereferencing the
URI should fetch a RDDL document or similar descriptive resource
informing the reader about the relevant resources.

[631-633] Base Domain For URIs

ASIS: "URIs created for all OASIS artifacts created by
or pertaining to technical committees SHOULD be rooted
at the docs (third-level) domain on the oasis-open.org
Internet domain, thus at the base docs.oasis-open.org."

Change to: URIs created for all OASIS artifacts created by
or pertaining to technical committees SHOULD be rooted
at http://docs.oasis-open.org

No need to refer to "docs" as the third-level domain.

[634-637] Technical Committee Tree and related trees

ASIS: "Technical Committee Tree: The short name of the
OASIS technical committee, as established by the TC
Administrator, typically upon initial formation, MUST
be the next node in the URI after the base: ..."

We agree with this scheme as the root for all OASIS
specifications and other approved TC work.  Optionally,
TCs will be allowed to deposit not-yet-approved or
otherwise "not-subject-to-approval process" documents
under the TC's root in various designated
subdirectories.  For specification-related documents,
we propose that the path should be as follows:


This scheme implies that revisions would need to be made
in ASIS at lines 646 ( docs.oasis-open.org/[product] ),
653 ( docs.oasis-open.org/[product]/[profileID] ), and
669 ( docs.oasis-open.org/[product]/[productVersion] )

Most critically, insert [tcShortName] in all cases
after docs.oasis-open.org/ in lines 646, 653, 669.
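
As an illustration of the last point only (it is not a
reconstruction of the path proposal above), the Python sketch below
composes a URI of the form
docs.oasis-open.org/[tcShortName]/[product]/[productVersion]/
followed by a filename. The function name and all sample values
("example-tc", "widget", the version labels) are hypothetical.

```python
BASE = "http://docs.oasis-open.org"

def artifact_uri(tc_short_name, product, product_version, filename):
    """Compose a URI whose path carries the metadata above the filename."""
    return BASE + "/" + "/".join(
        [tc_short_name, product, product_version, filename]
    )

# Hypothetical TC, product, and versions, used only for illustration.
uri_v1 = artifact_uri("example-tc", "widget", "v1.0", "catalog")
uri_v2 = artifact_uri("example-tc", "widget", "v2.0", "catalog")

print(uri_v1)
print(uri_v2)
```

Both URIs end in the same filename, 'catalog', yet remain distinct
because the path components above the filename differ.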

[638-639 and passim]  Using the .php file extension

ASIS: "An index page MUST be maintained at the default location
(typically docs.oasis-open.org/[tcShortName]/index.php)"

We are not sure what "typically" means (the path? the
filename? both?), but our preference is NOT to use
an index filename with ".php" for what will evidently
be an (X)HTML document; we prefer to keep URIs
free of strings that reflect transient technologies

[656-663] 8.3.3 Non Specification Track Documents

ASIS: "docs.oasis-open.org/[tcShortName]/other"

For non-specification-related documents, other directories
may be defined as appropriate and are not within the scope
of the ASIS document.

[664] TC Admin

ASIS: "Each Product is assigned an identifying name by the TC Process"

Please correct to "TC Administrator".

[675-685] Section 8.5 Latest Version Subtree

Providing support for the notion of generic URIs as URLs
for getting the "latest/current" version is one of our highest
priorities: it has been requested by numerous TCs for
different application scenarios, and is recognized as
a common industry practice for standards development.

While we can commit to providing this support, we are not
aware of implementation experience sufficient to demonstrate
the integrity and usability of the exact design proposed
in section 8.5. User requests have shown that there are
multiple kinds of "latest" (latest editors' draft, latest
approved version, latest QC'd version, etc.). We think it
is unwise to commit to this novel idea of a "latest" URI
at '/[product]/latest', and recommend omitting this
section from any revised ASIS draft, pending a final decision
based upon input from TC Chairs and others.


Colophon: the document above [presumably] was not checked
thoroughly for cogency, internal consistency [pasted
as a merge from numerous sources] or reasonableness.  Please
discount and ignore any such classes of editing errors,
as solely attributable to 'rcc', with regrets.

