[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Further responses to comments
Further answers to Dale Moberg's comments on my requirements slides.

One of my slides said:

"A Requirements Proposal
There are also variations in context scheme (CCTS, UCM, etc)
D1. Namespaces (may vary or may be the same)
D2. Models
D3. Core Components
D4. Core Components Harmonization Group (private, TBG17, organization, etc)
D5. Underlying syntax (XML, ASN.1, EDI, etc)
D6. Variations in basic datatypes (and codelists)
D7. Naming and Design Rules (UBL, ATG2, etc)
D8. Context / Purpose (D8.1, D8.2, etc)
D9. Context Scheme"

Another set of slides Dale summarised as: "Concern with OWL supporting knowledge base maintenance (dolphin fish gill example)"

I'll relate Dale's comments with my responses inline:

<Comment>
I like this inventory of the sources of variability, and it helps me get closer to thinking in terms of encoders and decoders, automagically produced. However, could you explain a bit more what a harmonization group's goals and results look like? How does it impact producing Models, for example? How does it relate to variations in D7? Or D6? Isn't harmonization just some approach to relating variations along the other dimensions? If not (and I suspect it is not the same, but I have not been on such a group), can you clarify?
</Comment>

<Response>
OK. The harmonisation group (typically; there could be any number of such groups, though TBG17 is the key one for UBL, etc) takes submissions in the form of core components (CCs), business information entities (BIEs) and qualified datatypes (QDTs). The key work is first to produce a set of CCs which meets the requirements of all submissions. This is the CC Library (CCL). It should (maybe MUST???) be such that all submitted BBIEs can be 'based' on a CC (I'm not sure if 'based' is the CCTS term; it is a key concept in CCTS, but a common understanding of the term would be welcome, though elusive it seems to me). The aim of late also seems to be to produce a set (library?)
of harmonised BIEs, but that is something I've not followed closely of late, and I don't yet understand the principles/requirements for how the BIEs are harmonised into a single set. For now I think we can assume we will mainly be looking at the submitted BIEs rather than the harmonised ones, but that may be just for now.

Lastly, it seems to me important to your comment that the BIEs may, after harmonisation (if it goes well), be based on a common set of CCs (if they are harmonised by the same group), and if the BIEs have datatypes which differ, then the CCs on which they are based might or might not differ. The CCs use cruder datatypes. If two BBIEs (the 'leaf nodes' of the model, to use an XML picture) have datatypes which are based on different core (core component type) datatypes, then they will need to be based on different CCs, even if they have the same semantics. E.g. if an Invoice Number is defined in one vocabulary based on the Number type, and in another vocabulary (or customisation) the Invoice Number is based on the Identifier datatype, then they cannot (as far as I know) be based on the same CC, so two CCs will be needed for the semantically equivalent entities for Invoice Number. In most cases, though, there might be serendipitous alignment of underlying datatypes, perhaps differing only in how they are each specialised ('qualified'). Using mainly the unqualified datatypes (one level of concretion up from core component type) in a vocabulary might help the chances of this, or might not; I'm not sure (it would be interesting to hear from Scott on this). So the harmonisation group's efforts might also include some harmonisation of the datatypes, at least as far as identifying datatypes for its own harmonised BIEs, but more likely going further and actually producing a set of qualified datatypes to be used outside that BIE 'library' too (I'm not sure a library is what is being produced as yet).
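The Invoice Number point above can be sketched roughly like this (a hypothetical illustration in Python, not any actual CCTS tooling; the BIE and datatype names here are my own inventions for the example):

```python
# Hypothetical sketch: two BIEs can only be based on the same core
# component (CC) if their datatypes trace back to the same core
# component type (CCT).
from dataclasses import dataclass

@dataclass(frozen=True)
class Datatype:
    name: str  # a qualified or unqualified datatype name
    cct: str   # the core component type it is ultimately based on

@dataclass(frozen=True)
class BIE:
    dictionary_name: str
    datatype: Datatype

def can_share_cc(a: BIE, b: BIE) -> bool:
    """Two BIEs can be based on one CC only if their datatypes
    resolve to the same core component type."""
    return a.datatype.cct == b.datatype.cct

# The Invoice Number example: one vocabulary uses a Number-based
# datatype, the other an Identifier-based one.
invoice_number_1 = BIE("Invoice. Number",
                       Datatype("Invoice Number. Type", "Numeric. Type"))
invoice_number_2 = BIE("Invoice. Identification. Identifier",
                       Datatype("Invoice Identifier. Type", "Identifier. Type"))

print(can_share_cc(invoice_number_1, invoice_number_2))  # False: two CCs needed
```

So even semantically equivalent entities end up with two CCs when the underlying CCTs differ, which is the situation the harmonisation group would presumably want to minimise.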
Interesting, again, to hear more on the latest developments of this, and whether or not my expectations and understandings are correct.
</Response>

<Comment>
I think your points about OWL and versioning and inconsistencies are OK, but I cannot imagine any formalism that could avoid such problems unless it disallowed negation in any form... I am less worried about the formalism than about the kind of knowledge that is to be put into it.
</Comment>

<Response>
I think OWL 2.0 will probably solve this anyway, but it would have to get adoption to warrant use by SET, perhaps (taking SET's reasons for using OWL as being its relative ubiquity). What is put into the formalism, as it were, does need to be versionable, though, with a reasonable way to handle changes, resulting redundancy and deprecation. 'Knowledge' is a changeable entity and is constantly revised and refined, making earlier 'knowledge' obsolete or considered unacceptably inaccurate.
</Response>

<Comment>
For example, are semantic constraints ones that ensure the values of data exchanged are understood by the computational processes of either party in the "correct way", to ensure proper business interaction (so that a "container" of a product is not understood to be a bottle on one side and an ocean-going steel shipping box on the other)? Or is our semantic model an ontology of the "document", understood as a bunch of aggregated BCCs and ASBIEs etc? How do we decide which kind of semantics is needed for translational fidelity? Or do we have reasons why following the constraints in terms of composition out of BCCs and so forth must also promote correct business interaction?
</Comment>

<Response>
Good questions indeed. I probably can't do them justice, but I'll try. Using CCTS to help us is what we seem to be primarily about in the SET TC. The CCTS has a core basis not just in the CC versus BIE 'metamodel' (is that a right use of the word 'metamodel'?
I suspect maybe the CCTS/UMM and the OMG UML concepts might not align perfectly, since data is not exactly the same as objects in a document context - maybe...). It also implements the data naming (for data dictionary) principles of ISO 11179 (as far as I know). This gives each BIE, CC and datatype a dictionary entry name which should (within a harmonisation scope?) be unique and say unambiguous things about the semantics, the typing and, to some extent maybe, the syntax (in UBL's case at least) of the BIE, CC or datatype. This meaning can of course be 'reverse-engineered' / derived from the dictionary entry name to some extent. My take on SET so far is that it is attempting to do this 'reverse-engineering' kind of derivation and store the semantics and type information, as well as the semantically important harmonised core component information, in an OWL artefact.

Now the question you posit still remains (I can't really answer it myself ... yet???): how, if at all, does this help to ensure that the qualification and all the semantics of a BIE can be properly mapped to those of another across vocabularies or documents? I know there are some aspects which should be helped, such as mappings of datatypes and cross-referencing to core components, and from there to any BIEs in the other vocabulary which are based on the same or related core components: 'related' if they, say, differ only in datatype but not in name. But does differing in datatype and not in name mean they are related semantically at all??? I'm not sure. Once you get out of CCs into BIEs there is qualification to consider, along with context. Maybe the harmonisation group will have ensured that two BIEs only have the same qualifier (taking limited vocabulary efforts into account) if the semantic qualification is the same. I don't know. I guess human judgement is still important to check this.
</Response>

--
Stephen D. Green
Document Engineering Services Ltd
http://www.biblegateway.com/passage/?search=matthew+22:37 .. and voice
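PS. The 'reverse-engineering' of dictionary entry names mentioned above could be sketched roughly as follows (a hypothetical Python illustration of the ISO 11179-style object class / property term / representation term decomposition; this is not SET code, just my own sketch):

```python
# Hypothetical sketch: split an ISO 11179-style dictionary entry name
# ("Object Class. Property Term. Representation Term") into its parts.
def parse_dictionary_entry_name(den: str) -> dict:
    parts = [p.strip() for p in den.split(".") if p.strip()]
    if len(parts) < 3:
        raise ValueError("expected object class, property term(s) "
                         "and representation term")
    return {
        "object_class": parts[0],
        "property_terms": parts[1:-1],
        "representation_term": parts[-1],
    }

print(parse_dictionary_entry_name("Invoice. Identification. Identifier"))
# {'object_class': 'Invoice', 'property_terms': ['Identification'],
#  'representation_term': 'Identifier'}
```

Of course the real derivation would also have to cope with qualifiers and the harmonisation scope, which is where, as I said, human judgement still seems to be needed.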