Subject: RE: Discussion of getPath() method
The only comment I have on all of this is that I think getPath() should
return UNSPSC:2001/32/11/18. This way we get a complete path from the call
as well as the ability to determine which version of the classification
scheme is in use.

Matthew MacKenzie
XML Global Technologies, Inc.

-----Original Message-----
From: Len Gallagher [mailto:LGallagher@nist.gov]
Sent: September 25, 2001 6:54 AM
To: Farrukh Najmi
Cc: Len Gallagher; regrep-query@lists.oasis-open.org
Subject: Re: Discussion of getPath() method

We need other opinions here!

I'm concerned about prepending the "name" or the "id" to the string one
gets from the getPath() method. In the first example, Farrukh prepends the
name. But in the second example he prepends the id. There is no guarantee
that the "name" of a classification scheme is a unique identifier for it.
In particular, UNSPSC could refer to the 2001 version or the upcoming 200x
version, or some completely unrelated scheme. If we must prepend one of
these items it must be the "id". But prepending the "id" is also
unsatisfactory to me because the "id" may be a 128-bit UUID that is
completely meaningless to a human. No one would use a system that required
such ids to be a visible part of a human-readable path.

I'm also a bit concerned that the path for UNSPSC code "321118" would be a
forced splitting up of the code into 32/11/18. However, I can live with
that if there is some other way for a user to ask for all items that are
classified by "321118" or some sub-classification of "321118". It doesn't
work to ask the user to split up the code into its constituent parts
because many users, and many software clients, won't have any idea how to
do that, especially if the classification is an external classification
like the Library example referenced below. Even the U.S./Canada NAICS
classification scheme uses a different number of digits for each level and
most client software systems won't know how many digits to allow for each
level.
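To make the splitting concern concrete, here is a minimal Python sketch.
Everything named in it is hypothetical and not part of ebRIM: the helper
split_code(), the LEVEL_WIDTHS table, and code_to_path(). The UNSPSC widths
simply follow the 32/11/18 example in this thread, and the NAICS widths
assume a 2-digit first level followed by one digit per further level. The
point is only that a client cannot derive the path from the code without
scheme-specific knowledge of this kind.

```python
from typing import List, Sequence

# Hypothetical per-scheme level widths, for illustration only (not from ebRIM).
LEVEL_WIDTHS = {
    "UNSPSC": (2, 2, 2),       # follows the 32/11/18 example in this thread
    "NAICS": (2, 1, 1, 1, 1),  # assumed: 2-digit first level, then 1 digit each
}


def split_code(code: str, widths: Sequence[int]) -> List[str]:
    """Split a concatenated code into per-level parts using known widths."""
    parts: List[str] = []
    pos = 0
    for width in widths:
        if pos >= len(code):
            break  # the code stops at a shallower level; that is allowed
        if pos + width > len(code):
            raise ValueError(f"code {code!r} ends in the middle of a level")
        parts.append(code[pos:pos + width])
        pos += width
    if pos != len(code):
        raise ValueError(f"code {code!r} is longer than the known levels")
    return parts


def code_to_path(scheme: str, code: str) -> str:
    """Turn an embedded code into a '/'-separated path for a known scheme."""
    return "/".join(split_code(code, LEVEL_WIDTHS[scheme]))


if __name__ == "__main__":
    print(code_to_path("UNSPSC", "321118"))
    print(code_to_path("NAICS", "334413"))
    # Guessing the wrong widths raises no error, it just yields wrong levels:
    print("/".join(split_code("334413", (2, 2, 2))))
```

Note that the last call does not fail; it quietly produces a path with the
wrong levels, which is the risk of asking clients to do the splitting
themselves.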

Now comes the commercial!! If our XML syntax for submitting or querying a
Classification instance were extended so that either "code" or "path" (or
both!) could be used as part of the classification, then I think we could
avoid the problem raised above. I think it is possible to modify our
existing XML specification for Classification, in an upward compatible way,
to do just that! But first we need to agree on what the getPath() method
returns for known classification schemes.

-- Len

At 08:34 PM 9/24/01, Farrukh Najmi wrote:
>This is a very good discussion. Please see my responses inline.
>
>Len Gallagher wrote:
>
> > Registry Query team,
> >
> > During last Friday's teleconference we discussed the getPath() method
> > defined for the ClassificationNode class in ebRIM (cf Section 10.2.4
> > page 38).
> >
> > At present this method is only superficially specified in ebRIM. I think
> > the confusion we're all having in trying to understand what one another is
> > saying is a direct result of the lack of specification for the getPath()
> > method. Can we have a discussion in the Query team as to what we think
> > should get returned by getPath()?
> >
> > Consider a few examples:
> >
> > 1) ClassificationScheme
> >      id="urn:org:un:spsc:cs2001"
> >      name="UNSPSC"
> >
> >    ClassificationNode
> >      id="UUID1"
> >      name="Electronic Components and Supplies"
> >      code="32"
> >      parent=???
> >
> >    ClassificationNode
> >      id="UUID2"
> >      name="Diodes and transistors and semiconductor devices"
> >      code="11"
> >      parent="UUID1"
> >
> >    ClassificationNode
> >      id="UUID3"
> >      name="Integrated circuit components"
> >      code="18"
> >      parent="UUID2"
> >
> > What string value should getPath() applied to node UUID3 return?
> >
> > a) "urn:org:un:spsc:cs2001/321118"
> >
> > b) "urn:org:un:spsc:cs2001/32/11/18"
> >
> > c) "UNSPSC/32/11/18"
> >
> > d) "UNSPSC/321118"
> >
> > e) "321118"
> >
> > f) "32/11/18"
> >
> > g) "UNSPSC/Electronic Components and Supplies/Diodes and transistors
> >    and semiconductor devices/Integrated circuit components"
>
>According to the algorithm I described in:
>
> http://lists.oasis-open.org/archives/regrep-query/200109/msg00047.html
>
>The answer is (c)
>
> > I don't think there's any value in trying to carry along the names for the
> > classification scheme or the names of the various nodes in its hierarchy.
> > There's just too much chance for error. So I think we should concentrate on
> > id's and/or codes. That would eliminate choices c), d) and g). Next, I
> > think we should make a clear distinction between the classification scheme
> > itself and the nodes in its hierarchy. I see no value in a) or b) since we
> > can use separate methods to get at that scheme information. That leaves e)
> > or f). My vote goes for e), because f) would require people to remember
> > how many digits are in each level of the path and sometimes (e.g. NAICS)
> > that varies.
> >
> > CONCLUSION: For multi-level coded classification schemes, i.e.
> > classification schemes for which each node's "code" is an embedded
> > representation of the path leading to that node, getPath() should return
> > just the "code" for that node.
>
>First I believe that the term multi-level coded classification scheme is a
>misnomer here. I believe you are looking for a term to describe schemes
>that embed the path of a node in the node's code.
>
>Not sure why you say it is wrong to carry the name of the scheme in the
>path. I agree we should not carry the name of the node in the path.
>
>I believe that our spec should be blind about any meaning implied in the
>code for a scheme and simply follow the algorithm I described.
>
> > 2) ClassificationScheme
> >      id="urn:ebxml:trees:v1"
> >      name="Modern Day Tree Types"
> >      description="This scheme defines the Genus and Species of modern day
> >      trees"
> >
> >    ClassificationNode
> >      id="UUID4"
> >      name="Acer"
> >      code="Acer"
> >      parent=???
> >      description="<enUS> Genus name for any maple tree"
> >
> >    ClassificationNode
> >      id="UUID5"
> >      name="barbatum"
> >      code="barbatum"
> >      parent="UUID4"
> >      description="<enUS> Species name for Southern Sugar Maple"
> >
> > What string value should getPath() applied to node UUID5 return?
> >
> > a) "Modern Day Tree Types/Acer/barbatum"
> >
> > b) "urn:ebxml:trees:v1/Acer/barbatum"
> >
> > c) "Acer/barbatum"
> >
> > d) "Genus:Acer/Species:barbatum"
> >
> > For the same reasons as above, I think we should rule out a) and b). I
> > don't like d) so much because I don't think we should mix level names with
> > the path leading to a node. Instead, if level names are important, we
> > should extend our model to allow the user to define level names, with a new
> > method on ClassificationScheme to getClassificationLevels() and a new
> > method on ClassificationNode to getLevelName(). I think c) is the proper
> > result for getPath().
> >
> > CONCLUSION: For a general purpose multi-level classification scheme, where
> > it is not known whether or not the code attribute for ClassificationNode
> > carries an embedded path representation, getPath() should return a sequence
> > of codes from the first to last levels of the classification scheme. Each
> > code should be separated from the others by a "/".
>
>I believe the path must include the scheme name in order for it to be
>absolute. This is similar to how the root directory plays a role in file
>paths in a file system.
>
>So according to the algorithm I proposed the correct answer would be (b)
>
> > c) Any 1-level Enumeration Classification Scheme
> >
> > CONCLUSION: For any node N in a 1-level classification scheme, e.g. all
> > enumeration domains, the getPath() method should return a value equal to
> > the "code" attribute for that node.
>
>Again the scheme name must be part of the path so it would be:
>
>/schemeName/codeAttributeForNode
>
>according to the proposed algorithm
>
> > d) Consider the Library Classification Scheme discussed in a previous email
> > message.
> >
> > http://lists.oasis-open.org/archives/regrep-ex-scheme/200109/msg00004.html
> >
> > This example defines a multi-level external classification scheme.
> >
> > CONCLUSION: For external classifications, the submitter of the
> > classification should be allowed to provide as much information as possible
> > to help the Registry determine what is the intended "code" and "path" and
> > "pathDepth" of each node referenced by the Classification instance. For
> > example, the submitter should be allowed to say that the path for the
> > classification of a book is "TA357.5", since that is the preferred
> > embedding for the entire path of the node. But the submitter should also be
> > allowed to submit a pathDepth value of 3, or a pathRepresentation like
> > "TA/357/5", so that the Registry can support queries over the separate
> > levels.
>
>External classifications are the only kind of classification that UDDI
>does. As such UDDI has honed it down reasonably well. If you study what
>UDDI does in this area they have no notion of pathDepth etc. All they have
>is the notion of keyed reference which is a tuple consisting of a scheme,
>keyName and a keyValue.
>
>This is exactly what I have proposed in the external classification
>proposal that is being considered within that sub-team.
>
>I am not convinced of any use case that pathDepth can address that is not
>addressed by the examples I gave in:
>
> http://lists.oasis-open.org/archives/regrep-query/200109/msg00047.html
>
> > Any other opinions?
> >
> > -- Len
> >
> > **************************************************************
> > Len Gallagher                        LGallagher@nist.gov
> > NIST                                 Work: 301-975-3251
> > Bldg 820 Room 562                    Home: 301-424-1928
> > Gaithersburg, MD 20899-8970 USA      Fax: 301-948-6213
> > **************************************************************
> >
> > ----------------------------------------------------------------
> > To subscribe or unsubscribe from this elist use the subscription
> > manager: <http://lists.oasis-open.org/ob/adm.pl>
>
>--
>Regards,
>Farrukh

**************************************************************
Len Gallagher                        LGallagher@nist.gov
NIST                                 Work: 301-975-3251
Bldg 820 Room 562                    Home: 301-424-1928
Gaithersburg, MD 20899-8970 USA      Fax: 301-948-6213
**************************************************************
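
For anyone comparing the candidate answers in example 1 side by side, the
following Python sketch may help. It is not the algorithm from the linked
message (which is not reproduced in this thread) and it is not ebRIM API:
the Scheme and Node classes, the get_path() function, and its prefix
argument are all hypothetical names chosen for illustration. It walks the
parent links from a node to the root, joins the codes with "/", and
optionally prepends the scheme name or the scheme id.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scheme:
    # Hypothetical stand-in for a ClassificationScheme (id and name only).
    id: str
    name: str


@dataclass
class Node:
    # Hypothetical stand-in for a ClassificationNode.
    code: str
    scheme: Scheme
    parent: Optional["Node"] = None


def get_path(node: Node, prefix: str = "none") -> str:
    """Join codes from the root down to `node`, separated by '/'.

    prefix: "name" prepends the scheme name, "id" prepends the scheme id,
    and "none" returns the codes alone.
    """
    codes: List[str] = []
    current: Optional[Node] = node
    while current is not None:  # walk up to the root node
        codes.append(current.code)
        current = current.parent
    codes.reverse()  # root first, leaf last

    if prefix == "name":
        head = [node.scheme.name]
    elif prefix == "id":
        head = [node.scheme.id]
    elif prefix == "none":
        head = []
    else:
        raise ValueError(f"unknown prefix mode: {prefix!r}")
    return "/".join(head + codes)


if __name__ == "__main__":
    unspsc = Scheme(id="urn:org:un:spsc:cs2001", name="UNSPSC")
    n32 = Node("32", unspsc)
    n11 = Node("11", unspsc, parent=n32)
    n18 = Node("18", unspsc, parent=n11)
    print(get_path(n18, "id"))    # candidate b) in example 1
    print(get_path(n18, "name"))  # candidate c) in example 1
    print(get_path(n18, "none"))  # candidate f) in example 1
```

With the example 1 data, the three modes give candidates b), c) and f)
respectively. Candidate e) would be the same codes joined with no
separator.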