
regrep message


Subject: Classification Schemes


Several of us at NIST have been playing around with different ways to
define and exchange classification schemes for the OASIS Registry/Repository. 

Our assumptions are:

1) Every classification scheme is registered as a Registry Item. Thus it
has a global uniform resource name (URN) assigned by the registration
authority. That name can be used to access all of the scheme's metadata,
e.g. aliases, descriptions, dependencies, etc.

2) We need a DTD to represent the classification scheme itself. An OASIS
classification scheme representation must validate to this DTD.

3) The DTD must allow an arbitrary number of levels in the scheme hierarchy.

4) The classification scheme definer must have the freedom to decide if the
classification scheme will be: i) simple 1-level, ii) multi-level coded, iii)
multi-level named, or some combination.  The DTD must support any of these.

We think we've come up with an XML DTD that satisfies the above.  It is
relatively simple and consists of a nested hierarchy of nodes, together
with optional names and codes to identify the levels of the hierarchy.  The
structure of an XML document that validates to this DTD will determine the
hierarchy of the classification scheme.
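
To illustrate how document structure determines the hierarchy, here is a
minimal Python sketch. It is not part of the attachments. The element and
attribute names are taken from the DTD; the sample document, the
"urn:example:TreeType" name, and the assumption that a
classification-scheme-node holds one item plus any number of child nodes are
hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document. Element and attribute names come from the
# DTD; nesting node inside node is an assumption made for this sketch.
SAMPLE = """
<ClassificationScheme schemeURN="urn:example:TreeType">
  <classification-scheme-node>
    <classification-scheme-item itemValue="Quercus"/>
    <classification-scheme-node>
      <classification-scheme-item itemValue="alba"/>
    </classification-scheme-node>
    <classification-scheme-node>
      <classification-scheme-item itemValue="rubra"/>
    </classification-scheme-node>
  </classification-scheme-node>
  <classification-scheme-node>
    <classification-scheme-item itemValue="Acer"/>
  </classification-scheme-node>
</ClassificationScheme>
"""

NODE = "classification-scheme-node"
ITEM = "classification-scheme-item"


def walk(element, level=1):
    """Yield (level, itemValue) for every node below `element`, depth first."""
    for node in element.findall(NODE):  # direct children only
        item = node.find(ITEM)
        value = item.get("itemValue") if item is not None else None
        yield level, value
        yield from walk(node, level + 1)


def scheme_depth(root):
    """Number of levels implied by the nesting (0 if there are no nodes)."""
    return max((level for level, _ in walk(root)), default=0)


root = ET.fromstring(SAMPLE)
outline = list(walk(root))
depth = scheme_depth(root)
```

Under these assumptions the nesting alone yields a two-level scheme; no
separate declaration of the depth is needed.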

Attached are 5 documents as follows:

1) A 1-page PDF ER-diagram showing the relationships among registry items,
classification schemes, and classifications.  Recall that a registry item
cannot be classified according to a classification scheme unless that
scheme is itself registered.

2) A 1-page ClassificationScheme DTD.  It can be viewed as a text document
or processed as a "dtd" document by XML software.

Some semantic interpretations are necessary.  The optional level names for
a classification scheme are defined separately from the hierarchical
structure.  They must be consistent, e.g. the names must be presented in
the correct order and match the number of levels in the hierarchy. An
alternative would be to carry along the level names with every item
(possible with the #IMPLIED attributes), but that becomes onerous.
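
The consistency rule above could be checked along these lines. This is only a
sketch: the function name check_levels is hypothetical, and it assumes that
levelNbr, when present, is a 1-based integer and that "correct order" means
document order.

```python
def check_levels(levels, depth):
    """Return a list of problems; an empty list means the levels are consistent.

    `levels` holds one dict per classification-scheme-level element, in
    document order, keyed by the DTD attribute names. `depth` is the number
    of levels in the node hierarchy. Assumes levelNbr is a 1-based integer.
    """
    problems = []
    if len(levels) != depth:
        problems.append(f"{len(levels)} levels declared, hierarchy has {depth}")
    for position, level in enumerate(levels, start=1):
        nbr = level.get("levelNbr")
        if nbr is not None and int(nbr) != position:
            problems.append(
                f"level {level['levelCode']!r} has levelNbr {nbr}, "
                f"expected {position}"
            )
    return problems


# Hypothetical level declarations for a Genus/Species scheme.
good = [
    {"levelCode": "G", "levelName": "Genus", "levelNbr": "1"},
    {"levelCode": "S", "levelName": "Species", "levelNbr": "2"},
]
swapped = [good[1], good[0]]
```

Calling check_levels(good, 2) returns an empty list, while the swapped order
or a mismatched depth produces one message per violation.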

Any document that validates to this DTD can be parsed to populate the
entities in the information model portrayed by the PDF ER-diagram. In some
cases, default level codes will be created by the registration authority.
ItemIds for the classification scheme items will always be
registry-specific and hidden from naive users.

For naive users, a classification scheme item can be identified by a triple
(SchemeURN, LevelCode, ItemValue). Sometimes several items are necessary to
completely identify a classification. For example the Genus/Species
classification of trees could be achieved by 2 triples:
               (TreeType, Genus, GenusValue)
               (TreeType, Species, SpeciesValue)
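
As a rough illustration of the triple form, a classification can be held as a
list of triples, one per level. The names ItemRef and classify and the sample
values are hypothetical; "TreeType" stands in for the scheme's registered URN
as in the example above.

```python
from typing import NamedTuple


class ItemRef(NamedTuple):
    """One (SchemeURN, LevelCode, ItemValue) triple."""
    scheme_urn: str
    level_code: str
    item_value: str


def classify(scheme_urn, **values_by_level):
    """Build one triple per level, in the order the levels are given."""
    return [
        ItemRef(scheme_urn, code, value)
        for code, value in values_by_level.items()
    ]


# Two triples are needed to pin down a Genus/Species classification.
white_oak = classify("TreeType", Genus="Quercus", Species="alba")
```

Every triple carries the scheme name, so each one can be resolved on its own
against the registered scheme.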

3) Example 1 -- a simple 1-level classification of student status.
   Example 2 -- a multi-level coded classification of MidAtlantic watersheds.
   Example 3 -- a multi-level named classification of timeperiods for
Each example can be viewed as a text document or processed as an "xml"
document by XML software.

We'd be interested in any comments -- in the absence of comments this
ER-diagram and the ClassificationScheme DTD will appear in the next
candidate Registry/Repository specification.

Len Gallagher


<!ELEMENT ClassificationScheme
<!ATTLIST ClassificationScheme schemeURN CDATA #REQUIRED>

<!ELEMENT classification-scheme-level  (comment-text)>
<!ATTLIST classification-scheme-level
        levelCode  CDATA  #REQUIRED
        levelName  CDATA  #IMPLIED
        levelNbr   CDATA  #IMPLIED>

<!ELEMENT classification-scheme-node  

<!ELEMENT classification-scheme-item  (comment-text)>
<!ATTLIST classification-scheme-item
        itemValue  CDATA  #REQUIRED
        itemName   CDATA  #IMPLIED
        levelNbr   CDATA  #IMPLIED
        levelCode  CDATA  #IMPLIED
        levelName  CDATA  #IMPLIED>

<!ELEMENT comment-text (#PCDATA)>




Len Gallagher                             LGallagher@nist.gov
NIST                                      Work: 301-975-3251
Bldg 820  Room 562                        Home: 301-424-1928
Gaithersburg, MD 20899-8970 USA           Fax: 301-948-6213
