Subject: Classification Schemes
RegRepers,

Several of us at NIST have been experimenting with different ways to define and exchange classification schemes for the OASIS Registry/Repository. Our assumptions are:

1) Every classification scheme is registered as a Registry Item. Thus it has a global uniform resource name (URN) assigned by the registration authority. That name can be used to access all of the scheme's metadata, e.g. aliases, descriptions, dependencies, etc.

2) We need a DTD to represent the classification scheme itself. An OASIS classification scheme representation must validate against this DTD.

3) The DTD must allow an arbitrary number of levels in the scheme hierarchy.

4) The classification scheme definer must have the freedom to decide whether the classification scheme will be: a) simple 1-level, b) multi-level coded, c) multi-level named, or some combination. The DTD must support any of these styles.

We think we've come up with an XML DTD that satisfies the above. It is relatively simple and consists of a nested hierarchy of nodes, together with optional names and codes to identify the levels of the hierarchy. The structure of an XML document that validates against this DTD determines the hierarchy of the classification scheme.

Attached are 5 documents as follows:

1) A 1-page PDF ER-diagram showing the relationships among registry items, classification schemes, and classifications. Recall that a registry item cannot be classified according to a classification scheme unless that scheme is itself registered.

2) A 1-page ClassificationScheme DTD. It can be viewed as a text document or processed as a "dtd" document by XML software. Some semantic interpretations are necessary. The optional level names for a classification scheme are defined separately from the hierarchical structure. They must be consistent with it, e.g. the names must be presented in the correct order and match the number of levels in the hierarchy. An alternative would be to carry the level names along with every item (possible with the #IMPLIED attributes), but that becomes onerous.

Any document that validates against this DTD can be parsed to populate the CLASSIFICATION_SCHEME, CLASSIFICATION_LEVEL, and CLASSIFICATION_ITEM entities in the information model portrayed by the PDF ER-diagram. In some cases, default level codes will be created by the registration authority. ItemIds for the classification scheme items will always be registry-specific and hidden from naive users. For naive users, a classification scheme item can be identified by a triple (SchemeURN, LevelCode, ItemValue). Sometimes several items are necessary to completely identify a classification. For example, the Genus/Species classification of trees could be achieved by 2 triples:

   (TreeType, Genus, GenusValue)
   (TreeType, Species, SpeciesValue)

3) Three examples:
   Example 1 -- a simple 1-level classification of student status.
   Example 2 -- a multi-level coded classification of Mid-Atlantic watersheds.
   Example 3 -- a multi-level named classification of time periods for artifacts.
   Each example can be viewed as a text document or processed as an "xml" document by XML software.

We'd be interested in any comments. In the absence of comments, this ER-diagram and the ClassificationScheme DTD will appear in the next candidate Registry/Repository specification.

Regards,
Len Gallagher
<!ELEMENT ClassificationScheme
     ( comment-text?,
       classification-scheme-level*,
       classification-scheme-node+ ) >
<!ATTLIST ClassificationScheme
     schemeURN   CDATA  #REQUIRED>

<!ELEMENT classification-scheme-level (comment-text)>
<!ATTLIST classification-scheme-level
     levelCode   CDATA  #REQUIRED
     levelName   CDATA  #IMPLIED
     levelNbr    CDATA  #IMPLIED >

<!ELEMENT classification-scheme-node
     ( classification-scheme-item,
       classification-scheme-node* ) >

<!ELEMENT classification-scheme-item (comment-text)>
<!ATTLIST classification-scheme-item
     itemValue   CDATA  #REQUIRED
     itemName    CDATA  #IMPLIED
     levelNbr    CDATA  #IMPLIED
     levelCode   CDATA  #IMPLIED
     levelName   CDATA  #IMPLIED >

<!ELEMENT comment-text (#PCDATA)>
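As a rough illustration of how a document following the DTD above might be parsed into (SchemeURN, LevelCode, ItemValue) triples, here is a minimal Python sketch. The instance document, the URN urn:example:TreeType, the item values, the function names, and the fallback codes Level1, Level2, ... are all assumptions made for the illustration and are not part of the proposal. The sketch uses the standard-library ElementTree parser, which does not validate against the DTD, and it assumes that the classification-scheme-level elements are listed from the top level down.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: the URN and item values are invented.
SCHEME_XML = """
<ClassificationScheme schemeURN="urn:example:TreeType">
  <classification-scheme-level levelCode="Genus"><comment-text/></classification-scheme-level>
  <classification-scheme-level levelCode="Species"><comment-text/></classification-scheme-level>
  <classification-scheme-node>
    <classification-scheme-item itemValue="Quercus"><comment-text/></classification-scheme-item>
    <classification-scheme-node>
      <classification-scheme-item itemValue="alba"><comment-text/></classification-scheme-item>
    </classification-scheme-node>
    <classification-scheme-node>
      <classification-scheme-item itemValue="rubra"><comment-text/></classification-scheme-item>
    </classification-scheme-node>
  </classification-scheme-node>
  <classification-scheme-node>
    <classification-scheme-item itemValue="Acer"><comment-text/></classification-scheme-item>
    <classification-scheme-node>
      <classification-scheme-item itemValue="rubrum"><comment-text/></classification-scheme-item>
    </classification-scheme-node>
  </classification-scheme-node>
</ClassificationScheme>
"""


def classification_paths(xml_text):
    """Return (level_codes, paths).

    Each path is the list of (SchemeURN, LevelCode, ItemValue) triples
    leading from a top-level node down to one node of the hierarchy.
    """
    root = ET.fromstring(xml_text)
    urn = root.get("schemeURN")
    # Level codes in document order, assumed to run from top level to bottom.
    level_codes = [lvl.get("levelCode")
                   for lvl in root.findall("classification-scheme-level")]
    paths = []

    def walk(node, depth, prefix):
        item = node.find("classification-scheme-item")
        # Assumed fallback when no level is declared for this depth; it stands
        # in for a default code that a registration authority might assign.
        if depth < len(level_codes):
            code = level_codes[depth]
        else:
            code = f"Level{depth + 1}"
        path = prefix + [(urn, code, item.get("itemValue"))]
        paths.append(path)
        for child in node.findall("classification-scheme-node"):
            walk(child, depth + 1, path)

    for top in root.findall("classification-scheme-node"):
        walk(top, 0, [])
    return level_codes, paths


def levels_match_hierarchy(level_codes, paths):
    """Check that declared levels, if any, match the depth of the hierarchy."""
    if not level_codes:
        return True  # level declarations are optional in the DTD
    return len(level_codes) == max(len(path) for path in paths)


level_codes, paths = classification_paths(SCHEME_XML)
for path in paths:
    print(path)
print("levels consistent:", levels_match_hierarchy(level_codes, paths))
```

Each printed path with two entries corresponds to the pair of triples needed to identify a species fully, in the same way as the Genus/Species example. The depth check is only one possible reading of the requirement that the level names match the number of levels in the hierarchy.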
**************************************************************
Len Gallagher                     LGallagher@nist.gov
NIST                              Work: 301-975-3251
Bldg 820 Room 562                 Home: 301-424-1928
Gaithersburg, MD 20899-8970 USA   Fax:  301-948-6213
**************************************************************