Subject: [regrep] Best Practice needed for Classification.nodeRepresentation
Regrep members,

As many of you know, NIST is developing a test suite for Registries claiming conformance to the Registry Information Model (RIM) and Registry Services (RS), version 2.0. As part of our analysis of RIM, we've found that the meaning of the "nodeRepresentation" attribute of Classification, RIM Section 10.3.5, is not sufficiently precise to ensure interoperability among conforming Registry implementations. The text of that section, lines 1386-1391, is as follows:

"If the Classification instance represents an external classification, then the nodeRepresentation attribute is required. It is a representation of a taxonomy element from a classification scheme. It is the responsibility of the registry to distinguish between different types of nodeRepresentation, like between the classification scheme node code and the classification scheme node canonical path. This allows client to transparently use different syntaxes for nodeRepresentation."

I propose a best-practices implementor's agreement that will allow client software to use the nodeRepresentation attribute on a SubmitObjectsRequest in a manner that will always be interpreted the same way by conforming implementations.

Since every Classification instance depends upon some ClassificationScheme instance, we can use information in that ClassificationScheme instance to guarantee a proper interpretation of values in the "nodeRepresentation" attribute. Note that the "nodeType" attribute is required in every ClassificationScheme instance, and that RIM Section 10.1.3 defines the meaning of each of the following values: UniqueCode, EmbeddedPath, NonUniqueCode. In the following rules, I propose a way to handle different forms of the "nodeRepresentation" attribute in Classification instances.

1) If the classification scheme has "nodeType" equal to "UniqueCode" or "EmbeddedPath", then a "nodeRepresentation" value will be interpreted to be identical to the value of the "code" attribute of the intended ClassificationNode instance.

2) If the classification scheme has "nodeType" equal to "NonUniqueCode", and if a "nodeRepresentation" value begins with a forward slash, i.e. "/", then the "nodeRepresentation" value will be interpreted as identical to the slash-separated concatenation of the "code" attribute values of all classification nodes in the path that uniquely determines a node instance of that classification scheme.

3) If a Classification instance has a non-null "nodeRepresentation" attribute, then it must also have a non-null "classificationScheme" attribute that references a ClassificationScheme instance.

These rules leave open the possibility of adding additional "nodeType" values in subsequent versions of RIM, or of adding additional interpretations of "nodeRepresentation" for a classification scheme having nodeType NonUniqueCode.

For example, on submission of a Classification instance that references the UNSPSC classification scheme, which has nodeType "EmbeddedPath", the nodeRepresentation value would be the 2-to-8-digit code that identifies a unique node of that scheme, e.g. "dd", "dddd", "dddddd", or "dddddddd" depending on the level of the node in the classification scheme. Similarly, if the Classification instance references a Geographic classification scheme of nodeType "NonUniqueCode" that is a two-level scheme identifying Continent and Country, then a "nodeRepresentation" having string value "/Asia/Turkey" would be interpreted as a code value of "Asia" at the first-level node and a code value of "Turkey" at the second-level node.

NOTE: Later versions of RS may allow different XML representations of a nodeRepresentation, but in RS v2.0 it is simply an attribute having a string value of type LongName.
Thus more sophisticated solutions to multiple code node representations will require extensions to RS.

NOTE: I'm proposing that NonUniqueCode nodeRepresentations begin with a forward slash so that the concatenation of the "classificationScheme" attribute with the "nodeRepresentation" attribute is identical to the Canonical Path Syntax of a node as described in Section 10.2.5 of RIM, and is comparable to the string returned by the getPath() method. In fact, we could add a fourth rule as follows:

4) An external Classification instance will identify an internal ClassificationNode instance of a NonUniqueCode type classification scheme if the concatenation of a leading slash "/" with the values of the "classificationScheme" and "nodeRepresentation" attributes matches the result of the getPath() method applied to that node.

One major advantage of the interpretation rules proposed above is that they don't require any changes to the XML elements defined in RS. Another advantage is that they could apply whether the referenced ClassificationScheme instance is "internal" or "external" as defined in RIM Section 10.1. And they offer support to a paragraph in RIM Section 10.3, Classification, lines 1359-1362, that states:

"The attributes and methods for the Classification class are intended to allow for representation of both internal and external classifications in order to minimize the need for a submission or a query to distinguish between internal and external classifications."

These rules would allow client systems to use "nodeRepresentation" in a uniform way without having to know whether the classification scheme is "internal" or "external" to the Registry they are connected to. Thus client software could classify submitted objects by UNSPSC classifications without having to first query the Registry to determine whether UNSPSC is "internal" or "external" and, if internal, query the Registry again to determine the UUIDs of the nodes it intends to reference.
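
To make rules 1 through 4 concrete, here is a minimal Python sketch of how a Registry might interpret a "nodeRepresentation" value. It is only an illustration under stated assumptions, not part of RIM or RS: the names Node, Scheme, resolve and canonical_path are hypothetical, the scheme is held as a simple in-memory tree, and the scheme identifier is a short placeholder string where a real Registry would use the ClassificationScheme id.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a ClassificationNode (only 'code' is modelled)."""
    code: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Scheme:
    """Hypothetical stand-in for a ClassificationScheme."""
    id: str         # placeholder for the scheme's id
    node_type: str  # "UniqueCode", "EmbeddedPath", or "NonUniqueCode"
    roots: List[Node] = field(default_factory=list)


def iter_nodes(nodes: List[Node]) -> Iterator[Node]:
    """Yield every node in the tree, depth first."""
    for node in nodes:
        yield node
        yield from iter_nodes(node.children)


def resolve(scheme: Optional[Scheme], node_representation: Optional[str]) -> Optional[Node]:
    """Return the node identified by node_representation, or None if no node matches."""
    if node_representation is None:
        return None
    # Rule 3: a non-null nodeRepresentation requires a classificationScheme.
    if scheme is None:
        raise ValueError("nodeRepresentation requires a classificationScheme")
    if scheme.node_type in ("UniqueCode", "EmbeddedPath"):
        # Rule 1: the value is the "code" of the intended node.
        matches = [n for n in iter_nodes(scheme.roots) if n.code == node_representation]
        return matches[0] if len(matches) == 1 else None
    if scheme.node_type == "NonUniqueCode" and node_representation.startswith("/"):
        # Rule 2: the value is a slash-separated path of "code" values.
        level, node = scheme.roots, None
        for code in node_representation[1:].split("/"):
            node = next((n for n in level if n.code == code), None)
            if node is None:
                return None
            level = node.children
        return node
    # Other forms are left open by the proposed rules.
    return None


def canonical_path(scheme: Scheme, node_representation: str) -> str:
    """Rule 4: leading slash + classificationScheme + nodeRepresentation."""
    return "/" + scheme.id + node_representation


# Two-level Geographic scheme from the example above; "Geography" is a placeholder id.
geo = Scheme("Geography", "NonUniqueCode", [Node("Asia", [Node("Turkey")])])
node = resolve(geo, "/Asia/Turkey")
print(node.code if node else None)
print(canonical_path(geo, "/Asia/Turkey"))
```

Under rule 4, the string built by canonical_path is what a Registry would compare against the getPath() result of an internal node; how getPath() itself is computed is left to the implementation and is not modelled here.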

CAUTION: The downside of this proposal is that it appears to conflict with an existing sentence in RIM Section 10.3, Classification, lines 1354-1357, that states:

"An internal classification will always reference the node directly, by its id, while an external classification will reference the node indirectly by specifying a representation of its value that is unique within the external classification scheme."

CONCLUSION: My interpretation is that lines 1359-1362 and lines 1354-1357 in RIM v2.0 are in apparent contradiction with one another, so an implementor, or a tester, is forced to choose between them. However, by adopting the rules proposed above as "Best Practices" or as "Implementor's Agreements", one can allow uniform support for "nodeRepresentation" values in useful situations without making any modifications to the XML specified in RS v2.0. And one could avoid any modifications to RIM by an interpretation that allows an "external" Classification to properly reference an existing node of an "internal" ClassificationScheme.

FINALLY: As part of its development of tests for Registry implementations, NIST is defining SubmitObjectsRequest elements that contain ExtrinsicObject elements and Classification elements that classify the extrinsic objects by well-known classification schemes. In order to avoid the complexity described above of querying the Registry twice before any classification can be made, and in the absence of any alternative action by the Registry TC, WE WILL ASSUME implementor agreements that adopt the rules proposed above for interpreting "nodeRepresentation" attribute values in the Classification elements.

Sincerely,
Len

**************************************************************
Len Gallagher                        LGallagher@nist.gov
NIST                                 Work: 301-975-3251
Bldg 820  Room 562                   Home: 301-424-1928
Gaithersburg, MD 20899-8970 USA      Fax:  301-948-6213
**************************************************************