OASIS Mailing List Archives

 



regrep message



Subject: Re: Need a Related Data element



RegRep'ers,

I've been reading the discussions resulting from Terry's original message
on March 16, titled "Need a Related Data Element" and all of the follow-on
discussions involving Related Data, Associations, and Data Dictionaries.
In general, I'm relatively satisfied with the existing OASIS specification
and don't see a critical need to add more metadata for Related Data as
Terry originally suggests, or to drop the notion of Data Dictionary as Una
suggests, although I prefer to think in terms of "packages" rather than
data dictionaries.  No matter what we call it, I think we'll want to allow
submission of a whole package of elements to be registered, and the package
itself will have metadata and should also be registered.  In fact, this can
be recursive, e.g. my-package uses your-package and his-package, as a
basis. When I register my-package as a <data-element-dictionary>, with its
several new <data-element>s, I just need to set up a "uses" association
between my-package and your-package, and my-package and his-package; I
don't need to separately reference all of the individual items within
your-package and his-package. 
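
The recursive "uses" idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the USES table, the extra "base-package" entry, and the function name are hypothetical and not part of the OASIS specification.

```python
# Hypothetical model: each package name maps to the packages it "uses".
# "base-package" is an invented extra entry to show the recursion.
USES = {
    "my-package": ["your-package", "his-package"],
    "your-package": [],
    "his-package": ["base-package"],
    "base-package": [],
}


def packages_in_scope(package, uses=USES):
    """Return every package reachable from `package` via "uses" links."""
    seen = []
    stack = [package]
    while stack:
        current = stack.pop()
        for used in uses.get(current, []):
            if used not in seen:
                seen.append(used)
                stack.append(used)
    return sorted(seen)
```

Only the package-to-package links are recorded; the individual items inside each used package are never referenced directly.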

I've attached a one-page PDF file as a graphic that helps me think through
these things.  It's a UML diagram presented in terms of the main concepts
from the OASIS specification, i.e. REGISTERED_ITEM, its multiple
NAME_CONTEXTS, its multiple CLASSIFICATIONS, its multiple
RELATED_DATA_ITEMS, its multiple ASSOCIATIONS, and its parent Data
Dictionary if it was submitted as part of a <data-element-dictionary> package.

The OASIS DTD specifications allow three different ways to represent
relationships among objects:

1) Associations, where each associated item is tagged with one of the tags
from the <data-element-association-list>, i.e.: distributed-within,
superseded-by, supersedes, used-by, uses-exterior, uses-interior.

2) Related Data relationships, where each related data item is tagged with
one of the tags from the <related-entity-list>, i.e.: changelog,
cover-letter, distribution-home-page, documentation-set,
documentation-set-information, dssl-style-sheet,
dssl-style-sheet-information, email-discussion-list-information, example,
example-set, example-set-information, faq, other, public-text, readme,
reference-manual, registration-information, related-data-group,
schema-home-page, sgml-declaration, sgml-open-catalogue,
style-sheet-information, tool-information, user-guide, white-paper,
xsl-style-sheet, xsl-style-sheet-information.

3) Data Dictionary relationship, where each item is physically submitted as
part of a <data-element-dictionary> package.

Like Una, I've put some of my own interpretations into these possible
relationships. The specification needs to be clearer than it now is as to
how things should be interpreted.  I'd call these clarifications "semantic
rules". For what it's worth, here's how I've been interpreting these three
kinds of relationships:

1) An Association is a very important relationship.  In some sense a
registered object is not really complete unless all of its associated items
are available to it.  Thus an associated item must be present in the same
registry/repository as the given item, or it must be retrievable from some
other publicly available repository.  The relationship goes two ways and it
is important that the repository be able to retrieve items in either direction.

To enforce this we may need stricter rules as to what constitutes a valid
<uri-reference> component of a <data-element-association>. I'd say that the
<uri-reference> must be required (it is) and that it must reference one of:
i) another <data-element> in the same submittal package, or ii) another
<data-element> previously registered in the same registry/repository, or
iii) another <data-element> previously registered in some other external
registry/repository conforming to the OASIS specification.  In all three
cases, the associated <data-element> will have a URN that can be stored in
the required AssocItemRef field of the ASSOCIATION class (see figure).  In
addition, cases i) and ii) will yield a local, non-public registry
identifier that can be stored in the AssocItemId field of the ASSOCIATION
class.  No additional metadata is needed, because in all three cases the
associated item will be registered in some publicly available,
OASIS-conformant registry; thus its complete OASIS metadata will be
available from that source.

The biggest advantage of the ASSOCIATION class is that it can be queried,
recursively, to obtain information not known when an item was first
registered. In particular, when an item is registered, it is not known when
or if it will ever be superseded, or if other items will use it.  But if
the item that supersedes it, or uses it, has a supersedes, or uses,
relationship with the already registered item, then this will be recorded
as an instance of the ASSOCIATION class. Then one can determine whether a
registered item is superseded, or used, without having to do an extensive
search of the entire repository; it is sufficient to search the instances
of the ASSOCIATION class to find all GivenItems that "supersede" or "use" a
given AssociatedItem.  Note: From a repository point of view, the
superseded-by and used-by tags are superfluous, but they will probably be
useful when interchanging registered items among different registries.
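
As a rough illustration of that kind of lookup, here is a minimal Python sketch. The record layout (given_item, role, assoc_item_ref), the example URNs, and both function names are my own assumptions; a real registry would keep these as instances of the ASSOCIATION class in its own storage.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for instances of the ASSOCIATION class.
Association = namedtuple("Association", ["given_item", "role", "assoc_item_ref"])

ASSOCIATIONS = [
    Association("urn:x-example:elem-v2", "supersedes", "urn:x-example:elem-v1"),
    Association("urn:x-example:elem-v3", "supersedes", "urn:x-example:elem-v2"),
    Association("urn:x-example:pkg-a", "uses-exterior", "urn:x-example:elem-v1"),
]


def given_items_for(assoc_item_ref, role, associations=ASSOCIATIONS):
    """Find all GivenItems that have `role` toward the given AssociatedItem."""
    return [
        a.given_item
        for a in associations
        if a.role == role and a.assoc_item_ref == assoc_item_ref
    ]


def latest_version(item_ref, associations=ASSOCIATIONS):
    """Follow "supersedes" links repeatedly until no newer item is recorded."""
    seen = {item_ref}
    current = item_ref
    while True:
        successors = given_items_for(current, "supersedes", associations)
        if not successors or successors[0] in seen:
            return current
        current = successors[0]
        seen.add(current)
```

The point is that only the association records are scanned; the registered items themselves are never searched, and no superseded-by or used-by records are needed.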

2) A Related Data relationship is less important than an Association
relationship.  A <data-element> will be usable even if the related data
item is not available to it. For such items, it is not necessary to try to
maintain the two-way relationships implied by the Associations. Some of the
related data items may be registered in some registry/repository, but many
(perhaps most) will not.  In most cases it is sufficient to have a URL or
URN that allows it to be located if desired. Other than being tagged with
one of the items in the <related-entity-list>, I don't think it's necessary
to have any additional metadata for related data.  However, in some cases
the related data item will itself be an XML item, and could be registered
in the same or in a different registry/repository and have the usual
metadata associated with it; in those cases, a simple reference to it will
yield the desired metadata.

To enforce this interpretation we may need stricter rules as to what
constitutes a valid <uri-reference> component of a
<related-data-reference>. I'd say that the <uri-reference> must be required
(it is) and that it must reference one of: i) another <data-element> in the
same submittal package, or ii) another <data-element> previously registered
in the same registry/repository, or iii) another <data-element> previously
registered in some other external registry/repository, or iv) an item not
registered anywhere.  In cases i) through iii), the related data item will
have a URN that can be stored in the required RelatedItem field of the
RELATED_DATA_ITEMS class (see figure). In case iv) the <uri-reference>
should probably be a URL to locate the item directly, although a URN might
be sufficient to locate it indirectly. It might be useful to distinguish
cases i) and ii) from case iii), so I wouldn't object to a new, optional
attribute being added to <related-data-reference> to indicate whether the
related item is "local" or "external" (see below for a first attempt at this).

Terry seems to think that case iv) requires some additional metadata to be
stored in the local data registry.  I think that would unnecessarily
complicate the registry, because it would require new data structures to
hold the metadata.  Instead, we could add really important non-XML items,
e.g. documentation, to the list of items that could be registered in an
OASIS registry.  Note: The metadata for a registered item must conform to
an XML DTD but the item itself need not be one of the items in the
<xsgml-entity-list> - is it heresy to say that?  If not, I'd propose that
we add "Other" as an option in the <xsgml-entity-list>.

3) Data Dictionaries are a special kind of Packaging relationship.  One
could argue that the "distributed-within" tag of the
<data-element-association-list> (see above) is sufficient to represent a
whole hierarchy of packages - and it probably is! Then it wouldn't be
necessary to maintain the InDictionary relationship as is now done with the
DictionaryId field of the REGISTRY_ITEM class.  However, because the containment
relationship is really one-to-many (i.e. an item is distributed within only
one package in the repository although it may be referenced, i.e. "used",
by many packages), I'd prefer to delete "distributed-within" as an option
from the GivenItemRole of the ASSOCIATION class and let the physical
containment hierarchy be represented by the InDictionary relationship on
REGISTRY_ITEM.  I could go either way on this if others feel the
"distributed-within" tag is better; but then why don't we have a
"contains-within" tag to go with it?

Registered <data-elements> can be distinguished from registered
<data-element-dictionary>s as follows

Conclusion

Based on the above here are my candidate semantic rules for dealing with
Association and Related Data references:

OASIS Semantic Rules for <uri-reference>

1) If a <uri-reference> is a component of a <data-element-association>, then

Case:

a) If the <data-element-association-type> attribute of the
<data-element-association> is "uses-interior" then the <uri-reference>
identifies a <data-element> in the containing <data-element-dictionary>.

b) If the <data-element-association-type> attribute of the
<data-element-association> is "supersedes" then the <uri-reference> is a
Uniform Resource Name that identifies a registered item in the same
registry/repository to which the submission is made, i.e. it is the AltName
of some unique instance of the NAME_CONTEXTS class.

c) If the <data-element-association-type> attribute of the
<data-element-association> is "distributed-within" then ... ? [Note: UNDER
WHAT CIRCUMSTANCES DOES THIS MAKE SENSE? MUST IT ALWAYS BE THE CONTAINING
<data-element-dictionary>? Or can we later add new <data-elements> to a
<data-element-dictionary> that was registered previously? This is one
reason I'd prefer to drop "distributed-within" as one of the alternatives
for associations and let the Registration Authority maintain this physical
containment separately in the REGISTRY_ITEM class. Later we can consider
the more difficult topic of how to amend previously registered items or
packages of items.]

d) Otherwise, the <uri-reference> is a Uniform Resource Name that
identifies a registered item in some OASIS-conformant registry/repository.
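
The candidate rules in 1) could be checked along these lines. This is a sketch only: the function name and the three lookup sets are hypothetical, and I have assumed that "some OASIS-conformant registry/repository" in the last case includes the local one. The "distributed-within" case is left undecided, as in the rules themselves.

```python
def check_association_reference(
    assoc_type, uri_reference, dictionary_ids, local_urns, external_urns
):
    """Apply the candidate rules for a <data-element-association>.

    Returns True or False when a rule applies, or None for the
    "distributed-within" case, whose semantics are still an open question.
    All three lookup sets are hypothetical inputs.
    """
    if assoc_type == "uses-interior":
        # Must identify a <data-element> in the containing dictionary.
        return uri_reference in dictionary_ids
    if assoc_type == "supersedes":
        # Must be a URN registered in the same registry/repository.
        return uri_reference in local_urns
    if assoc_type == "distributed-within":
        # No rule proposed yet.
        return None
    # Otherwise: a URN registered in some OASIS-conformant registry
    # (assumed here to include the local one).
    return uri_reference in local_urns or uri_reference in external_urns
```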


2) If a <uri-reference> is a component of a <related-data-reference>, then

Case:

a) If the <tobe-proposed-location-designator> attribute of the
<related-data-reference> is "this-submission" then the <uri-reference>
identifies a <data-element> in the containing <data-element-dictionary>.

b) If the <tobe-proposed-location-designator> attribute of the
<related-data-reference> is "this-registry" then the <uri-reference> is a
Uniform Resource Name that identifies a registered item in the same
registry/repository to which the submission is made, i.e. it is the AltName
of some unique instance of the NAME_CONTEXTS class.

c) If the <tobe-proposed-location-designator> attribute of the
<related-data-reference> is "url" then the <uri-reference> is a Uniform
Resource Locator that locates the related data item.

d) If the <tobe-proposed-location-designator> attribute of the
<related-data-reference> is "external-urn" then the <uri-reference> is a
Uniform Resource Name that identifies a registered item in some
OASIS-conformant registry/repository.

e) Otherwise, the <uri-reference> is an arbitrary string with no special
semantics.
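
The cases in 2) amount to a simple dispatch on the proposed location designator. The sketch below shows that in Python; the function name and the returned labels are my own invention, and the designator values are the tentative ones proposed above, not anything in the current DTD.

```python
# Hypothetical labels for how a <uri-reference> should be interpreted.
INTERPRETATIONS = {
    "this-submission": "data-element-in-containing-dictionary",
    "this-registry": "urn-in-same-registry",
    "url": "url-locating-related-item",
    "external-urn": "urn-in-other-conformant-registry",
}


def interpret_related_data_reference(location_designator):
    """Map the proposed location designator to an interpretation label.

    Any unrecognised or missing designator falls through to the last
    case: an arbitrary string with no special semantics.
    """
    return INTERPRETATIONS.get(location_designator, "arbitrary-string")
```

Because unknown values fall through to the last case, existing submissions without the new attribute would keep working unchanged.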

Rose Diagram(s).pdf

**************************************************************
Len Gallagher                             LGallagher@nist.gov
NIST                                      Work: 301-975-3251
Bldg 820  Room 562                        Home: 301-424-1928
Gaithersburg, MD 20899-8970 USA           Fax: 301-948-6213
**************************************************************

