regrep-query message

Subject: RE: Discussion of getPath() method


My only comment on all of this is that I think getPath() should
return UNSPSC:2001/32/11/18.  That way the call gives us a complete path
and also tells us which version of the classification scheme is in use.
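
A minimal sketch of that "name:version/codes" form, using the UNSPSC
example from this thread. The function names and the separate "version"
value are assumptions for illustration only; ebRIM does not define them.
The sketch also assumes the scheme name contains neither ":" nor "/".

```python
def format_path(scheme_name, version, codes):
    """Build a path such as 'UNSPSC:2001/32/11/18' (hypothetical helper)."""
    return f"{scheme_name}:{version}/" + "/".join(codes)


def parse_path(path):
    """Split a path built by format_path() back into its three parts."""
    head, *codes = path.split("/")
    # Assumption: the scheme name itself contains no ':' character.
    scheme_name, _, version = head.partition(":")
    return scheme_name, version, codes


path = format_path("UNSPSC", "2001", ["32", "11", "18"])
print(path)
print(parse_path(path))
```

A client that receives such a path can recover the scheme version without
a second call, which is the point of the suggestion.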


Matthew MacKenzie
XML Global Technologies, Inc.


-----Original Message-----
From: Len Gallagher [mailto:LGallagher@nist.gov] 
Sent: September 25, 2001 6:54 AM
To: Farrukh Najmi
Cc: Len Gallagher; regrep-query@lists.oasis-open.org
Subject: Re: Discussion of getPath() method


We need other opinions here!  I'm concerned about pre-pending the "name" or
the "id" to the string one gets from the getPath() method.

In the first example, Farrukh prepends the name.  But in the second example
he prepends the id.

There is no guarantee that the "name" of a classification scheme is a
unique identifier for it. In particular, UNSPSC could refer to the 2001
version or the upcoming 200x version, or some completely unrelated scheme.
If we must prepend one of these items, it must be the "id".

But prepending the "id" is also unsatisfactory to me because the "id" may
be a 128-bit UUID that is completely meaningless to a human. No one would
use a system that required such ids to be a visible part of a
human-readable path.

I'm also a bit concerned that the path for UNSPSC code "321118" would be a
forced splitting up of the code into 32/11/18.  However, I can live with
that if there is some other way for a user to ask for all items that are
classified by "321118" or some sub-classification of "321118". It doesn't
work to ask the user to split up the code into its constituent parts
because many users, and many software clients, won't have any idea how to
do that, especially if the classification is an external classification
like the Library example referenced below. Even the U.S./Canada NAICS
classification scheme uses a different number of digits for each level, and
most client software systems won't know how many digits to allow for each
level.
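
One way to picture a query by unsplit code is the sketch below. It assumes
the registry already stores each node's codes level by level, so only the
registry needs to know where the level boundaries fall. The item names, the
fourth-level code "05" and both function names are hypothetical.

```python
# Hypothetical data: each classified item maps to the codes of its node,
# listed from the top level of the scheme downwards (as in 32/11/18).
ITEM_PATHS = {
    "item-A": ["32", "11", "18"],        # classified exactly by 321118
    "item-B": ["32", "11", "18", "05"],  # assumed sub-classification
    "item-C": ["32", "11", "17"],        # sibling class 321117
    "item-D": ["32", "12"],              # a different family
}


def matches(code, level_codes):
    """True if `code` equals the concatenated codes of the node itself or
    of one of its ancestors. Comparing whole levels (not raw string
    prefixes) keeps this correct when levels have different widths."""
    return any(
        "".join(level_codes[:depth]) == code
        for depth in range(1, len(level_codes) + 1)
    )


def classified_under(code, item_paths=ITEM_PATHS):
    """Items classified by `code` or by any sub-classification of it."""
    return sorted(item for item, codes in item_paths.items()
                  if matches(code, codes))


print(classified_under("321118"))
print(classified_under("3211"))
```

The client passes "321118" as a single string and never has to know that
the scheme splits it into 32/11/18.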

Now comes the commercial!!

If our XML syntax for submitting or querying a Classification instance were
extended so that either "code" or "path" (or both!) could be used as part
of the classification, then I think we could avoid the problem raised
above. I think it is possible to modify our existing XML specification for
Classification, in an upward-compatible way, to do just that! But first we
need to agree on what the getPath() method returns for known classification
schemes.
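
To make the choice concrete, here is a rough sketch that computes each
candidate return value, a) through g), for node UUID3 in the first example
quoted below. The Scheme and Node classes are stand-ins rather than the
ebRIM interfaces, and treating a top-level node's parent as None is an
assumption, since the example leaves that parent as "???".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scheme:  # stand-in for ClassificationScheme
    id: str
    name: str


@dataclass
class Node:  # stand-in for ClassificationNode
    id: str
    name: str
    code: str
    parent: Optional["Node"] = None  # assumption: None for top-level nodes


def chain(node):
    """Return the nodes from the top level of the scheme down to `node`."""
    nodes = []
    while node is not None:
        nodes.append(node)
        node = node.parent
    nodes.reverse()
    return nodes


def candidate_paths(scheme, node):
    """Map each option letter from the example to its string value."""
    codes = [n.code for n in chain(node)]
    names = [n.name for n in chain(node)]
    split, unsplit = "/".join(codes), "".join(codes)
    return {
        "a": f"{scheme.id}/{unsplit}",
        "b": f"{scheme.id}/{split}",
        "c": f"{scheme.name}/{split}",
        "d": f"{scheme.name}/{unsplit}",
        "e": unsplit,
        "f": split,
        "g": "/".join([scheme.name] + names),
    }


unspsc = Scheme("urn:org:un:spsc:cs2001", "UNSPSC")
n1 = Node("UUID1", "Electronic Components and Supplies", "32")
n2 = Node("UUID2", "Diodes and transistors and semiconductor devices", "11", n1)
n3 = Node("UUID3", "Integrated circuit components", "18", n2)

for letter, value in candidate_paths(unspsc, n3).items():
    print(letter, value)
```

Only the choice of prefix (none, id or name) and of separator differs
between the options; the walk up the parent chain is the same in each case.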

-- Len


At 08:34 PM 9/24/01, Farrukh Najmi wrote:

>This is a very good discussion. Please see my responses inline.
>
>Len Gallagher wrote:
>
> > Registry Query team,
> >
> > During last Friday's teleconference we discussed the getPath() method
> > defined for the ClassificationNode class in ebRIM (cf. Section 10.2.4,
> > page 38).
> >
> > At present this method is only superficially specified in ebRIM. I think
> > the confusion we're all having in trying to understand what one another
> > is saying is a direct result of the lack of specification for the
> > getPath() method. Can we have a discussion in the Query team as to what
> > we think should get returned by getPath()?
> >
> > Consider a few examples:
> >
> > 1) ClassificationScheme
> >       id="urn:org:un:spsc:cs2001"
> >       name="UNSPSC"
> >
> >     ClassificationNode
> >       id="UUID1"
> >       name="Electronic Components and Supplies"
> >       code="32"
> >       parent=???
> >
> >     ClassificationNode
> >       id="UUID2"
> >       name="Diodes and transistors and semiconductor devices"
> >       code="11"
> >       parent="UUID1"
> >
> >     ClassificationNode
> >       id="UUID3"
> >       name="Integrated circuit components"
> >       code="18"
> >       parent="UUID2"
> >
> >   What string value should getPath() applied to node UUID3 return?
> >
> >    a) "urn:org:un:spsc:cs2001/321118"
> >
> >    b) "urn:org:un:spsc:cs2001/32/11/18"
> >
> >    c) "UNSPSC/32/11/18"
> >
> >    d) "UNSPSC/321118"
> >
> >    e) "321118"
> >
> >    f) "32/11/18"
> >
> >    g) "UNSPSC/Electronic Components and Supplies/Diodes and transistors
> >                and semiconductor devices/Integrated circuit components"
>
>According to the algorithm I described in:
>
>http://lists.oasis-open.org/archives/regrep-query/200109/msg00047.html
>
>the answer is (c).
>
> >
> >
> > I don't think there's any value in trying to carry along the names for
> > the classification scheme or the names of the various nodes in its
> > hierarchy. There's just too much chance for error. So I think we should
> > concentrate on id's and/or codes.  That would eliminate choices c), d)
> > and g). Next, I think we should make a clear distinction between the
> > classification scheme itself and the nodes in its hierarchy. I see no
> > value in a) or b) since we can use separate methods to get at that
> > scheme information. That leaves e) or f).  My vote goes for e), because
> > f) would require people to remember how many digits are in each level
> > of the path and sometimes (e.g. NAICS) that varies.
> >
> > CONCLUSION: For multi-level coded classification schemes, i.e.
> > classification schemes for which each node's "code" is an embedded
> > representation of the path leading to that node, getPath() should
> > return just the "code" for that node.
>
>First, I believe that the term "multi-level coded classification scheme"
>is a misnomer here. I believe you are looking for a term to describe
>schemes that embed the path of a node in the node's code.
>
>I am not sure why you say it is wrong to carry the name of the scheme in
>the path. I agree we should not carry the name of the node in the path.
>
>I believe that our spec should be blind to any meaning implied in the
>code for a scheme and simply follow the algorithm I described.
>
> >
> >
> > 2) ClassificationScheme
> >       id="urn:ebxml:trees:v1"
> >       name="Modern Day Tree Types"
> >       description="This scheme defines the Genus and Species of modern day trees"
> >
> >     ClassificationNode
> >       id="UUID4"
> >       name="Acer"
> >       code="Acer"
> >       parent=???
> >       description="<enUS> Genus name for any maple tree"
> >
> >     ClassificationNode
> >       id="UUID5"
> >       name="barbatum"
> >       code="barbatum"
> >       parent="UUID4"
> >       description="<enUS> Species name for Southern Sugar Maple"
> >
> >   What string value should getPath() applied to node UUID5 return?
> >
> >    a) "Modern Day Tree Types/Acer/barbatum"
> >
> >    b) "urn:ebxml:trees:v1/Acer/barbatum"
> >
> >    c) "Acer/barbatum"
> >
> >    d) "Genus:Acer/Species:barbatum"
> >
> > For the same reasons as above, I think we should rule out a) and b). I
> > don't like d) so much because I don't think we should mix level names
> > with the path leading to a node. Instead, if level names are important,
> > we should extend our model to allow the user to define level names,
> > with a new method on ClassificationScheme to getClassificationLevels()
> > and a new method on ClassificationNode to getLevelName(). I think c) is
> > the proper result for getPath().
> >
> > CONCLUSION: For a general purpose multi-level classification scheme,
> > where it is not known whether or not the code attribute for
> > ClassificationNode carries an embedded path representation, getPath()
> > should return a sequence of codes from the first to last levels of the
> > classification scheme. Each code should be separated from the others by
> > a "/".
>
>I believe the path must include the scheme name in order for it to be
>absolute. This is similar to how the root directory plays a role in file
>paths in a file system.
>
>So according to the algorithm I proposed, the correct answer would be (b).
>
> >
> >
> > c) Any 1-level Enumeration Classification Scheme
> >
> > CONCLUSION: For any node N in a 1-level classification scheme, e.g. all
> > enumeration domains, the getPath() method should return a value equal
> > to the "code" attribute for that node.
>
>Again the scheme name must be part of the path so it would be:
>
>/schemeName/codeAttributeForNode
>
>according to the proposed algorithm
>
> >
> >
> > d) Consider the Library Classification Scheme discussed in a previous
> > email message.
> >
> > http://lists.oasis-open.org/archives/regrep-ex-scheme/200109/msg00004.html
> >
> > This example defines a multi-level external classification scheme.
> >
> > CONCLUSION: For external classifications, the submitter of the
> > classification should be allowed to provide as much information as
> > possible to help the Registry determine what is the intended "code" and
> > "path" and "pathDepth" of each node referenced by the Classification
> > instance. For example, the submitter should be allowed to say that the
> > path for the classification of a book is "TA357.5", since that is the
> > preferred embedding for the entire path of the node. But the submitter
> > should also be allowed to submit a pathDepth value of 3, or a
> > pathRepresentation like "TA/357/5", so that the Registry can support
> > queries over the separate levels.
>
>External classifications are the only kind of classification that UDDI
>does. As such, UDDI has honed it down reasonably well. If you study what
>UDDI does in this area, they have no notion of pathDepth etc. All they
>have is the notion of a keyed reference, which is a tuple consisting of a
>scheme, a keyName and a keyValue.
>
>This is exactly what I have proposed in the external classification
>proposal that is being considered within that sub-team.
>
>I am not convinced that pathDepth can address any use case that is not
>addressed by the examples I gave in:
>
>http://lists.oasis-open.org/archives/regrep-query/200109/msg00047.html
>
>
> >
> >
> > Any other opinions?
> >
> > -- Len
> >
> > **************************************************************
> > Len Gallagher                             LGallagher@nist.gov
> > NIST                                      Work: 301-975-3251
> > Bldg 820  Room 562                        Home: 301-424-1928
> > Gaithersburg, MD 20899-8970 USA           Fax: 301-948-6213
> > **************************************************************
> >
> > ----------------------------------------------------------------
> > To subscribe or unsubscribe from this elist use the subscription
> > manager: <http://lists.oasis-open.org/ob/adm.pl>
>
>--
>Regards,
>Farrukh
>

