OASIS Mailing List Archives

plcs-dex message



Subject: Re: SV: SV: SV: FW: [plcs-dex] Unique constraints -> identification and versioning


Hi Peter,

Actually, no I didn't forget any of that. I think there are very simple 
solutions to overloaded terms (e.g. use "Storage Tank" instead of "Tank"). I 
also didn't forget that the Semantic Web ontologies use natural language 
terms in almost all cases. As for the full URI, that's what XML namespaces 
help with for humans, so xxx:SerialNumber is what we see. I've also argued 
that, contrary to what Mats has suggested, using random numbers is actually 
less efficient.

The question I've raised on External_class_library is that I think there's a 
problem in that there's currently nothing in the DEXs that specifies what 
the "context ontology" for the data exchange actually is. Without that, 
there's no way to *validate* that an exchange file contains valid instances 
of External_class. It's as if you were parsing a Part 21 file, found an 
instance named #45=WIDGET('asdf'); but because you don't know the context 
schema you don't know if it's valid. You might know it's an entity type in an 
Integrated Resource but you don't know if it's valid in AP299 because you 
weren't told whether AP299 or AP288 was the context for the exchange. Does 
that help?
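
To make the validation point concrete, here is a minimal Python sketch. It is
only an illustration under assumptions: the CONTEXT_CLASSES table, the
validate_class_ids function and the http://www.example.org/context URIs are
made up for this example and are not part of any DEX or of AP239.

```python
# Hypothetical lookup: the classes each context ontology makes available
# (its own classes plus everything it imports). Not real PLCS data.
CONTEXT_CLASSES = {
    "http://www.example.org/context": {
        "http://www.example.org/context#Widget",
        "http://www.example.org/context#SerialNumber",
    },
}


def validate_class_ids(context_uri, class_ids):
    """Return the class ids that are not valid in the given context ontology.

    Raises ValueError when no context ontology was declared, because then
    no class reference in the exchange file can be validated at all.
    """
    if context_uri is None:
        raise ValueError("no context ontology declared for this exchange")
    known = CONTEXT_CLASSES.get(context_uri)
    if known is None:
        raise LookupError("unknown context ontology: " + context_uri)
    # Keep the original order so the caller can report errors in file order.
    return [cid for cid in class_ids if cid not in known]
```

Without a declared context the function can only refuse to answer, which is
the situation I'm describing for External_class today.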

I'm suggesting that External_class_library is the best thing we have in AP239 
for identifying the context ontology, and that all External_classes are 
actually put into that context via the OWL import statements. Remember that 
OWL isn't like EXPRESS: an ontology that imports a Class can add to that 
Class's definition (e.g. it can add new supertypes).

I'm also suggesting that because of the way RDF and URIs work, the identifier 
of an OWL class is not "SerialNumber" but is its full URI and so that is what 
should appear in the External_class.id attribute.
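
As a rough sketch of the difference between what humans see and what goes in
External_class.id, the following Python maps between a prefixed name and the
full URI. The prefix table, the http://www.example.org/rdl# namespace and the
function names are assumptions made up for illustration only.

```python
# Hypothetical prefix table; "xxx" stands in for whatever prefix is declared.
PREFIXES = {"xxx": "http://www.example.org/rdl#"}


def expand(qname, prefixes=PREFIXES):
    """Turn a prefixed name such as 'xxx:SerialNumber' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("undeclared prefix in " + repr(qname))
    return prefixes[prefix] + local


def abbreviate(uri, prefixes=PREFIXES):
    """Turn a full URI back into a prefixed name for display, if possible."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri  # no matching namespace: show the URI unchanged
```

The full URI is the identifier that would be exchanged; the short form is
only a display convenience.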

EXAMPLE

Contractors cannot change the US DOD DODAF Ontology and there's a class there 
with URI http://www.dod.mil/dodaf/sv1/WeaponsSystem. In the current approach 
that's:

External_class_library http://www.dod.mil/dodaf/sv1/ and External_class 
WeaponsSystem - but there's a problem lurking here.

The statement that for Project X WeaponsSystem is a subclass of Product, but 
for Project Y it is a subclass of System_breakdown and not Product, can't be 
defined in http://www.dod.mil/dodaf/sv1/ or anything it imports. It's 
specified in statements defined in the http://www.projectx.org and 
http://www.projecty.org ontologies, which import it and PLCS. So the semantics 
of the use of http://www.dod.mil/dodaf/sv1/WeaponsSystem are unknown unless 
you know whether the exchange is in the context of the Project X or Project Y 
ontology. My point is that conceptually this is true for every class, even 
those specified in the standard PLCS RD, because you don't know whether 
subclasses of them have been defined such that the use of the superclass in 
the PLCS RD is not allowed. Specifying one instance of External_class_library 
that's the Project X ontology and using External_class.id as I've suggested 
solves this problem.
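
The same example as a small Python sketch, assuming a toy in-memory model
rather than a real OWL API. The ONTOLOGIES table and the function name are
made up for illustration, and "Product" and "System_breakdown" are left as
short names instead of full URIs to keep it readable.

```python
DODAF = "http://www.dod.mil/dodaf/sv1/"
PLCS = "urn:oasis:plcs"
WEAPONS_SYSTEM = DODAF + "WeaponsSystem"

# Toy model: each ontology lists what it imports and the subclass
# statements (subclass, superclass) that it makes itself.
ONTOLOGIES = {
    DODAF: {"imports": [], "subclass_of": []},
    PLCS: {"imports": [], "subclass_of": []},
    "http://www.projectx.org": {
        "imports": [DODAF, PLCS],
        "subclass_of": [(WEAPONS_SYSTEM, "Product")],
    },
    "http://www.projecty.org": {
        "imports": [DODAF, PLCS],
        "subclass_of": [(WEAPONS_SYSTEM, "System_breakdown")],
    },
}


def superclasses_in_context(cls, context_uri):
    """Collect the stated superclasses of cls as seen from one context
    ontology, following its imports transitively."""
    seen, stack, result = set(), [context_uri], set()
    while stack:
        uri = stack.pop()
        if uri in seen:
            continue
        seen.add(uri)
        ontology = ONTOLOGIES[uri]
        result.update(sup for sub, sup in ontology["subclass_of"] if sub == cls)
        stack.extend(ontology["imports"])
    return result
```

Asking the question with http://www.dod.mil/dodaf/sv1/ alone as the context
gives no superclasses at all, while the Project X and Project Y contexts give
different answers for the very same class URI.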

I hope this example shows what I've been talking about.

Cheers,
David

P.S. Nice topic for your holiday dinner:-)

On Friday 22 December 2006 08:29, Peter Bergström wrote:
> Dave,
>
> You obviously forgot the previous example 'tank', the main reason for not
> wanting natural language terms as identifiers is because they are not
> unique.
>
> Now you'll say that we need the full context in the identifier as well, but
> what is then the difference to humans between:
> 'urn:plcs:rdl:std:assigning_identifier:identifier_code:product_as_individua
>l_identifier_code:Serial_identifier_code' And
> 'RD023234563'
>
> Not very much in my view... The first one is too long...
>
> You also forgot that choosing a natural language for identifiers that are
> used inside text intended for human consumption only works for text written
> in that language, e.g. English. All other languages still would need to
> have a translation mechanism of some sort (equivalentClass or rdfs:label).
> The use of English in schemas is easier to accept as a foreigner, since the
> schema classes are seldom displayed to end users, so they can be geek
> language. Reference data are in between (sometimes viewed as part of the
> schema, sometimes viewed as text for end users).
>
> So what Mats is proposing (I think) is to solve the problem with uniqueness
> of the ID by assigning meaningless, but within the RDL unique numbers, and
> force everybody to use the translation mechanism to understand what the
> class is, every time. That's why he proposed the free browser; Protégé
> would not support this. Or rather, that's why he proposes that OASIS PLCS TC
> should at least define the mechanisms of how to do this, and possibly also
> a general tool for doing it. Put the identifier of the class in geek
> language land, and force everybody to use the same mechanism to display the
> correct term to the end user.
>
> I'm still struggling to understand what's wrong with our current
> implementation of OWL in OASIS. It can't be simply that we have divided the
> identifier into two attributes in two separate entities, can it? And it's
> not the whole truth that we use several RDLs instead of the most specific
> one. The reason for this, IMHO, is that if the class is originally defined in
> an RDL that is less specialized, you want to identify it as the source
> class, not as an imported class. If we reference it using the most
> specialized RDL, other applications would not be able to understand it
> without having access to the specialized RDL. We would not be able to have
> company-specific RDLs that use classes from OASIS and industry standards as
> well, because they would always be referred to as being part of the
> company-specific RDL.
>
> I'm sure the current approach is not the best one, and I agree we need to
> make it better. But we seem to strive for different goals... At least, I
> have not yet understood why OWL does not work with the current approach.
> Can you give me an example?
>
>
> Peter
>
> -----Original Message-----
> From: David Price [mailto:david.price@eurostep.com]
> Sent: 21 December 2006 13:43
> To: plcs-dex@lists.oasis-open.org
> Subject: Re: SV: SV: SV: FW: [plcs-dex] Unique constraints ->
> identification and versioning
>
> Hi Mats,
>
> This reply is only concerned with the class id being a number vs words.
>
> Is your view that Class ids need to be meaningless based on the question of
> guaranteed uniqueness being simpler, or is there some other rationale?  For
> example, this view seems similar to how engineering orgs use Part
> Numbers to identify designs in their PDM systems. Do you hope to reuse PDM
> software in your org?
>
> A second, perhaps implied, concern seems to be unhappiness at a choice
> having to be made about a natural language for the class id?  My view is
> that reference data is simply more modeling in the domain covered by the
> PLCS schema and so the use of the same sort of conventions makes things
> simpler and more consistent. I can also see internal implementations where
> the reference data and EXPRESS are processed together to form a complete
> model (perhaps in OWL, perhaps in UML, etc) and consistency is useful in
> that case as well.
>
> On your suggestion that eOTD used numbers so PLCS RD should, I'd suggest
> that this shows a fundamental difference in approach and is indeed at the
> heart of the issue. The eOTD, and other similar standards, do use codes
> rather than words. I contend that all these coding approaches are the
> result of the earlier use of codes within relational databases where
> varying length strings are costly. People have simply taken an
> implementation tradeoff in RDBs and exposed it in the real world.
> However, for the PLCS scenarios we've chosen to use a semantic language to
> expose the meaning of the term, and potentially enable reasoners over PLCS
> data, and which is Web-enabled via URIs. Therefore, the rationale for using
> codes doesn't apply in PLCS-land.
>
> My view is that Classes are not like Part designs or RDBs, Classes are part
> of a vocabulary (or ontology or taxonomy whichever term suits you) and so
> using natural language makes more sense. I still can't imagine why anyone
> would prefer "RD0494049404" to "SerialNumber" as part of a URI that
> identifies a class in an ontology.  To me, it's clearly more efficient to
> have a meaningful, if sometimes overloaded term, than a random number
> because people's time and potential errors cost more than any gain in
> computer processing. Maybe I'm missing something though?
>
> On your comments wrt OWL label vs equivalentClass - label is not used
> during reasoning but equivalentClass is, that's why it's useful ... not
> that label isn't useful.
>
>
>
> Cheers,
> David
>
> On Thursday 21 December 2006 07:53, mats.nilsson@fmv.se wrote:
> > Kind of what I suggested in the last of my examples then... ;o) Good we
> > agree!
> >
> >                   -------------------------------------
> > <class>        -> Class id: RD039405951
> >                   Label: BEA serial number (en)
> >
> >                   Descriptive text (en);
> > "is a"         -> a
> > <superclass>   -> <RD039405950>
> > <..features..> -> applied to BEA assets according to BEA rules.
> >                   -------------------------------------
> >
> > Regards,
> >   Mats
> >
> >
> >
> > -----Original Message-----
> > From: David Price [mailto:david.price@eurostep.com]
> > Sent: 20 December 2006 16:48
> > To: plcs-dex@lists.oasis-open.org
> > Subject: Re: SV: SV: FW: [plcs-dex] Unique constraints -> identification and
> > versioning
> >
> > Sean makes a very good point. There's a useful convention for defining
> > classes in an ontology used by some of the Oil and Gas folks that makes
> > Sean's comments explicit:
> >
> > A <class> is a <superclass> that <distinguishing features of this
> > particular subclass>.
> >
> > so in my example you'd have:
> >
> > A SerialNumber is an IdentificationCode that is one of a series assigned
> > for identification which varies from its successor or predecessor by a
> > fixed discrete integer value.
> >
> > I thought this was an excellent convention.
> >
> > Cheers,
> > David
> >
> > On Wednesday 20 December 2006 11:58, Barker, Sean (UK) wrote:
> > > Just to add a further strand to this discussion, Aristotle noted that
> > > definition goes by genus and species, that is, that a definition
> > > identifies what class of thing you are defining (genus), and how it
> > > differs from other things in that class (species). This has two
> > > implications.
> > >
> > > Firstly, any single term in a taxonomy is determined by its context,
> > > that is, the full path from the root concept down to the term. In
> > > practice, humans infer the path directly from context, and homonyms do
> > > not cause any particular linguistic community any great problems
> > > (although it is a problem between different communities such as the UK
> > > and the US). In an OWL ontology, this will only cause problems if the
> > > reference to the term is ambiguous because the reference does not
> > > define the full context. (PS Tank is a particularly bad example to
> > > choose for homonyms - the term was originally a cover word from the
> > > 1914-18 war to fool the Germans that water tanks not AFVs were being
> > > delivered to the front line.)
> > >
> > > Secondly, and embarrassingly obviously, the most important part of a
> > > classification is the classification criteria, that is, the (real
> > > world) criteria that one uses to decide whether what is falling on my
> > > head is fine rain, drizzle, mist, rain, spitting, heavy rain, a
> > > downpour, cats and dogs or sleet. The concepts are not "out there"
> > > waiting to be written down, but essentially an arbitrary choice of how
> > > many terms are needed to divide up the concept space and where the term
> > > boundaries are.  The term "essentially arbitrary" implies that we may
> > > choose to make different choices. In practice, the choices are based on
> > > the "forms of life" that we need to distinguish - in industrial terms,
> > > the processes. When, as you were going out of the door, your mother
> > > shouted at you "it's raining", this was not a statement about the amount
> > > of water falling from the sky, but an injunction to put a coat on.
> > >
> > > The idea that concepts are "out there" has been very influential (since
> > > at least Plato's "Republic"), but I suspect is a short cut we use in
> > > our thinking. In practice, the use of a term invokes many connotations
> > > - implied classifications and associations - which is why terminology
> > > debates are so confrontational and tediously long winded as these are
> > > teased out. My biggest concern in this whole discussion is that most of
> > > the definitions are being written using this "out there" thinking,
> > > rather than being explicit on when to use one term or when to use
> > > another in the same class. The danger is that we will produce a
> > > standard in geek speak - it works for the technologist, but not for the
> > > user.
> > >
> > > I am now going on holiday until the new year, so merry Christmas and a
> > > happy new year.
> > >
> > >
> > > Sean Barker
> > > 0117 302 8184
> > >
> > > -----Original Message-----
> > > From: mats.nilsson@fmv.se [mailto:mats.nilsson@fmv.se]
> > > Sent: 20 December 2006 08:43
> > > To: plcs-dex@lists.oasis-open.org
> > > Subject: SV: SV: SV: FW: [plcs-dex] Unique constraints ->
> > > identification and versioning
> > >
> > >
> > > Hi,
> > >
> > > (See the P.S. regarding the attachment and my approach to this
> > > discussion.) (I've copied below the section from David's answer on
> > > which I'd like to comment.)
> > >
> > > >  I understand the question now. From what I've seen on the Semantic
> > > > Web, the  best practice is to use a (somewhat) human-interpretable
> > > > name for the  identifiers of classes in an ontology (within the
> > > > limitations of what you can  use in a URL or URI).  I agree that the
> > > > use of rdfs:label is the proper way  to specify the "name" of the
> > > > class for use in browsers and GUI applications. However, I don't see
> > > > any advantage in not following the Semantic Web practices. I've never
> > > > really understood why anyone would want classes with  ids like
> > > > rd0049404 when they can have SerialNumber.
> > >
> > > 1. I'm not sure that the "Semantic Web best practice" is something we
> > > should pay too much attention to, because imho PLCS Reference Data and
> > > Semantic Web ontologies are not that closely related, even though we
> > > use the same XML application (i.e. OWL) for the representation.
> > >
> > > 2. There will sooner or later be a case when homonyms appear in the
> > > same ontology. For now I have the two examples 'Tank' (container for
> > > liquid -or- combat vehicle) and 'Stone' (a unit of measure -or- a
> > > primitive tool for emergency repairs). Both these examples are
> > > homonyms likely to appear in the same domain (even though the 'Stone'
> > > example is a bit far-fetched...). In this case there still has to be a
> > > 'Stone(tool)'/'Stone(unit)' notation in order to separate them. A
> > > "meaningless" id string would be more efficient.
> > >
> > > 3. You (David) did not comment on the real-world (...FMV...) fact that
> > > more than one word (synonyms) exists as "labels" for the same class.
> > > Which one should be used for the id? The use of the OWL "same_as"
> > > construct with separate classes (with identical definitions) is to me a
> > > more complicated way than using 'rdfs:label' for the words and a
> > > "meaningless" id string for the class as a whole.
> > >
> > > 4. In the "interoperability" or "multilingual" oriented world there
> > > could also be a reason to keep the 'rdf:ID'='external_class.id' as a
> > > "meaningless" id string, in order to allow "labels" in different
> > > languages and not being forced to use an English word as the
> > > identifier... Why not adopt (what I think is) the eOTD approach? What
> > > they do and what we do are quite similar when it comes to "concept
> > > management" (where concept=id+label(s)+definition). Their "Core Model"
> > > (and perhaps the "FMV concept management information model"...
> > > (attached)) might be something to take a look at.
> > >
> > > I'm glad we got the discussion started! I hope more will join in...
> > > Reference Data is a key aspect to PLCS which in my opinion still is a
> > > bit too loosely defined.
> > >
> > > Regards,
> > >   Mats
> > >
> > > P.S.
> > >   The attached "FMV concept management information model" is still at a
> > > draft level (and as yet has no descriptive text). Its purpose is to be the
> > > base for the definition of an XML based format for the representation
> > > of terminology used within FMV (and in the long run also for the
> > > Swedish armed forces). A project for addressing "concept management"
> > > will start at FMV in January with me as the project leader.
> > >
> > > In order to be able to classify PLCS data correctly, the
> > > classifications should be based on a defined terminology. FMV doesn't
> > > have that today. In order for PLCS to work - this must be established!
> > > The aim of the project is first to create an infrastructure (data
> > > format, applications, processes, information/education and
> > > organisation), and then to launch the organisation and the work of
> > > creating a defined terminology. The
> > > infrastructure section of the project should be completed before
> > > summer! My ambition is, as far as it is possible, to use OWL in the
> > > same way as the OASIS PLCS TC specifies its use (something we'll soon
> > > have to agree on and do...) for our (FMV) terminology data format.
> > >
> > > This might explain some of my opinions expressed above and earlier...
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: David Price [mailto:david.price@eurostep.com]
> > > Sent: 19 December 2006 17:30
> > > To: plcs-dex@lists.oasis-open.org
> > > Subject: Re: SV: SV: FW: [plcs-dex] Unique constraints -> identification
> > > and versioning
> > >
> > > Hi Mats, See below for two replies. Cheers, David
> > >
> > > On Tuesday 19 December 2006 09:38, mats.nilsson@fmv.se wrote:
> > > > Hi David,
> > > >
> > > > This is one of your examples of a "class.id URI";
> > > >
> > > > >> urn:iso:std:iso:ts:10303:-1017:ed-1:tech-taxonomy:Part
> > > >
> > > > If I understand you correctly, you suggest including both the URI
> > > > for the RDL ("urn:iso:std:iso:ts:10303:-1017:ed-1:tech-taxonomy") as
> > > > well as the class identifier ("Part") in the 'external_class.id' (the
> > > > 'id' attribute in the 'external_class' entity).
> > > >
> > > > I thought (see the last of my three slides)
> > > > 'external_class_library.id' was going to be used for the URI of the
> > > > RDL, and that the identifier within the RDL (i.e.
> > > > 'external_class.id') only should contain the actual "classification"
> > > > or "term" identifier, in your example "Part".
> > >
> > > I don't think that works because of the other issues I mentioned (i.e.
> > > there are multiple ontologies involved and one ontology has to be
> > > identified as the context ontology). The context ontology is the most
> > > organization-specific ontology that uses the more general and standard
> > > ontologies. External_class_library is really the only entity type in
> > > PLCS that makes sense for that requirement and so I think there should
> > > be one instance of it that all the External_class entity instances
> > > point to (actually I don't think it's a big problem if there are
> > > multiple instances of External_class_library as long as they all refer
> > > to the same URI). So, if you've followed and agreed with the logic of
> > > requiring a context ontology then I think it's clear that the
> > > External_class.id needs to be the full URI.
> > >
> > > For what it's worth, I think people have been assuming that
> > > "urn:oasis:plcs" was "the reference data library", when in fact in
> > > real-world usage that is unlikely to be the case. The RDL that is the
> > > context for an exchange is actually the ontology developed by the using
> > > organization with its extensions to the PLCS standard classes which is
> > > imported in read-only mode. Because of the flexibility enabled by the
> > > use of the OWL language, it's important to have that context ontology
> > > named in the exchange file. If you look at some of the OWL APIs you'll
> > > see that they often force you to supply an ontology when you'd think
> > > only a class is required as input. That's because the same class can
> > > have different subclasses *and* superclasses (not to mention
> > > properties) depending on how it is extended in using ontologies.
> > >
> > > > Please help me understand if I've got things wrong! If someone else
> > > > has an opinion, please help David help me...
> > > >
> > > >
> > > > Now over to your question David. In my not so organized world (I call
> > > > it FMV...) people use more than one term for the same concept
> > > > (concept=class). OWL has the 'rdfs:label' element, which makes it
> > > > possible to assign more than one term for each class. This is useful
> > > > for me because the guys who drive helicopters and those who drive
> > > > boats often have different terminology, and I can use this
> > > > functionality to make them understand each other and the data they
> > > > send. There is also this need to be "interoperable" within e.g. the
> > > > EU Battle Groups or NATO joint operations, and then we Swedes meet
> > > > people who use the word "lubricate" for what we call "smörja"...
> > > >
> > > > To accomplish this I'd like to use a "meaningless" identifier for the
> > > > 'external_class.id' field, e.g. "rd000453" (or with versioning
> > > > "rd000453v1"), and then use the 'external_class.name' field for the
> > > > readable classification (i.e. one of the available 'rdfs:label's in
> > > > the RDL/OWL-file).
> > > >
> > > > This was what I meant by the question;
> > > >
> > > > >> David: How do you suggest the label used for classification should
> > > > >> be identified in case there are multiple labels for the same
> > > > >> class/RD?
> > > >
> > > > If I have both "lubricate" and "smörja" in the same class (that is a
> > > > subclass of 'activity'/'task') with some unique id, I need to specify
> > > > which one is used.
> > > >
> > > > Clearer? Or don't you see this scenario with synonyms and multiple
> > > > languages (used for the same class/concept)?
> > >
> > > I understand the question now. From what I've seen on the Semantic Web,
> > > the best practice is to use a (somewhat) human-interpretable name for
> > > the identifiers of classes in an ontology (within the limitations of
> > > what you can use in a URL or URI).  I agree that the use of rdfs:label
> > > is the proper way to specify the "name" of the class for use in
> > > browsers and GUI applications. However, I don't see any advantage in
> > > not following the Semantic Web practices. I've never really understood
> > > why anyone would want classes with ids like rd0049404 when they can
> > > have SerialNumber. The only rationale I've heard that made any sense to
> > > me was related to handling the uniqueness of ids but since we're
> > > engineering the reference data I don't think the cost in human
> > > understandability is outweighed by the small benefit of slightly easier
> > > uniqueness. That said, I also think that the PLCS RD should be broken
> > > up into sub-ontologies on a domain-by-domain basis for manageability,
> > > subsetting and to help with the overloading of terms.
> > >
> > > All that said, I'm not sure that the External_class.name is really
> > > useful for transferring rdfs:label values. I'm not sure of the business
> > > need for that for a start. If the External_class.id is the full URI
> > > then that's sufficient for an application to process. If for some
> > > reason the rdfs:label is needed then I think name_assignment is the
> > > only way to handle the fact that a class may have multiple rdfs:label
> > > values for different languages. However, it seems to me it's better to
> > > keep all the labels in the ontology itself rather than duplicating them
> > > in the exchange file.
> > >
> > > > Regards,
> > > >   Mats
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: David Price [mailto:david.price@eurostep.com]
> > > > Sent: 18 December 2006 18:05
> > > > To: plcs-dex@lists.oasis-open.org
> > > > Subject: Re: SV: FW: [plcs-dex] Unique constraints -> identification and
> > > > versioning
> > > >
> > > > Hi Mats, a few replies follow (although I'm confused by one question).
> > > >
> > > > On Monday 18 December 2006 07:51, mats.nilsson@fmv.se wrote:
> > > > > Questions below...
> > > > > Happy for opinions!
> > > > >
> > > > > Regards,
> > > > >   Mats
> > > > >
> > > > > >> David: Could you please give an example of what an (external)
> > > > > >> class.id URI could look like?
> > > >
> > > > It would be a URN or a URL depending on what organization defines
> > > > the class and the approach they happen to have adopted. It would be
> > > > the complete URI for the class though it's technically only the
> > > > identifier and so may not be sufficient for location (e.g. if it's a
> > > > URN then some other means would have to be established for an
> > > > application/user to find more info about the class ... for example,
> > > > an organization might have to buy an ISO standard). Examples could
> > > > be:
> > > >
> > > > urn:iso:std:iso:ts:10303:-1017:ed-1:tech-taxonomy:Part
> > > >
> > > > http://schema.omg.org/spec/UML/2.1/ParameterDirectionKind
> > > >
> > > > http://www.madeupdod.mil/ActivityOntology#Training
> > > >
> > > > > >> David: How do you suggest the label used for classification
> > > > > >> should be identified in case there are multiple labels for the
> > > > > >> same class/RD?
> > > >
> > > > I don't understand what "the label used for classification" means.
> > > > Can you rephrase the question or explain that phrase?
> > > >
> > > > Cheers,
> > > > David
> > >
> > > --
> > > Mobile +44 7788 561308
> > > UK +44 2072217307
> > > Skype +1 336 283 0606
> > >
> > >
> > >

-- 
Mobile +44 7788 561308
UK +44 2072217307
Skype +1 336 283 0606

