Subject: Groups - Semantic Representations of the UN/CEFACT CCTS-based Electronic Business Document Artifacts (20080924SemanticRepresentationOfDocumentArtifacts.doc) uploaded

The document named Semantic Representations of the UN/CEFACT CCTS-based
Electronic Business Document Artifacts 
(20080924SemanticRepresentationOfDocumentArtifacts.doc) has been submitted
by Ms. Asuman Dogac* to the OASIS Semantic Support for Electronic Business
Document Interoperability (SET) TC document repository.

Document Description:
The purpose of this SET TC deliverable is to provide standard semantic
representations of electronic document artifacts based on the UN/CEFACT
Core Component Technical Specification (CCTS) and hence to facilitate the
development of tools to support semantic interoperability. The basic idea
is to explicate, in a standard way, the semantic information that is
already given both in the CCTS and in the CCTS based document standards,
so that this information is available for automated document
interoperability tool support.

UN/CEFACT CCTS specifies the semantics of document artifacts in several
ways: through the Core Components Data Types; through the structure of the
core components; through the semantics implied by the naming convention
used; and through the semantics implied by the context, the Business
Information Entities and the code lists. However, currently this semantics
is available only through text-based search mechanisms.

In order to help with the interoperability of the document artifacts, we
explicate the CCTS based business document semantics. By "explicating", we
mean defining their semantic properties as an ontology in a formal,
machine processable language; the Web Ontology Language (OWL) is used for
this purpose. Note that in defining the properties of document artifacts,
we kept the "context" semantics at an absolute minimum since UN/CEFACT UCM
is working on this subject.

The semantics is explicated at two levels. At the first level, an upper
ontology describing the CCTS document content model is specified.
Furthermore, at this level, the upper ontologies for the prominent CCTS
based standards, namely GS1 XML, OAGIS 9.1 and UBL, are developed. The
various equivalence relationships between the classes of the CCTS upper
ontology and the CCTS based document standard ontologies are defined. These
are later used to find the similarities among the document artifacts from
different document standards.

At the next level, the semantics of the document schemas in each standard
are described 
based on its upper ontology. The difference between the document schema
specific ontology 
and the upper ontology is that the upper ontology describes the generic
entities in a document 
content model whereas document schema ontologies describe the actual
document artifacts 
as the subclasses of the classes in the upper ontology.
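
The two-level arrangement described above can be illustrated with a
minimal Python sketch. This is not the SET ontology itself: the class
names (upper:..., doc:...) and both helper functions are hypothetical,
and a real implementation would express the same relationships as OWL
subclass axioms.

```python
# Toy model of the two levels. All names are illustrative only and are
# not taken from the SET ontologies.

# Upper ontology: generic entities of a document content model.
UPPER_CLASSES = {
    "upper:AggregateBusinessInformationEntity",
    "upper:BasicBusinessInformationEntity",
}

# Document schema ontology: actual artifacts, each declared as a
# subclass of an upper-ontology class or of another artifact.
SUBCLASS_OF = {
    "doc:Address": "upper:AggregateBusinessInformationEntity",
    "doc:DeliveryAddress": "doc:Address",
    "doc:StreetName": "upper:BasicBusinessInformationEntity",
}


def ancestors(cls):
    """Return every superclass reachable from cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def upper_class_of(cls):
    """Return the upper-ontology class that an artifact falls under."""
    for candidate in ancestors(cls):
        if candidate in UPPER_CLASSES:
            return candidate
    return None
```

Calling upper_class_of("doc:DeliveryAddress") walks the subclass chain
through doc:Address up to the generic aggregate entity.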

Furthermore, we explicate some semantics related to the different usages
of document data types in different document schemas, so that the desired
interpretations can be obtained from such otherwise informal semantics.
The intention is to give the reasoner the same information that humans use
in transforming document schemas into one another.

When these ontologies are harmonized using a DL reasoner, the computed
ontologies reveal the implicit equivalence and subsumption relationships
between the document artifacts. In other words, the shared semantic
properties of the CCTS based document artifacts, together with the implicit
relationships inferred, help to identify their similarities. As expected,
the harmonized ontology is effective only in discovering the equivalence of
document artifacts that are both semantically and structurally similar. Yet
different document standards use core components in different structures.
Semantic properties of document artifacts are not enough to find the
similarity of structurally different but semantically equivalent document
artifacts; possible differences in structure must be provided through
heuristics to enhance the practical uses of the specified semantics. These
heuristics concern the possible ways of organizing core components into
compound artifacts and are given in terms of predicate logic rules.
Note that a DL reasoner by itself cannot process predicate logic rules, so
we resort to the well accepted practice of using a rule engine to execute
the more generic rules and carrying the results back to the DL reasoner
through wrappers developed for this purpose. The results involve declaring
further class equivalences in the ontology.
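
To make the division of labour concrete, here is a small Python sketch of
one structural heuristic run outside a reasoner. It is an assumption
for illustration only: the actual SET rules are not reproduced here, the
artifact names are made up, and plain Python stands in for the rule
engine. The rule used is "two compound artifacts built from the same leaf
components are equivalent, whatever their nesting".

```python
from itertools import combinations

# Hypothetical compound artifacts. Nested lists model different ways of
# grouping the same (made-up) core components.
ARTIFACTS = {
    "stdA:Address": ["StreetName", "CityName", "PostalCode"],
    "stdB:Location": [["StreetName"], ["CityName", "PostalCode"]],
    "stdB:Contact": ["Telephone", "Email"],
}


def flatten(structure):
    """Collect the leaf components of a nested list, ignoring grouping."""
    leaves = set()
    for item in structure:
        if isinstance(item, list):
            leaves |= flatten(item)
        else:
            leaves.add(item)
    return leaves


def structural_equivalences(artifacts):
    """Illustrative rule: identical leaf sets imply equivalence."""
    found = []
    for a, b in combinations(sorted(artifacts), 2):
        if flatten(artifacts[a]) == flatten(artifacts[b]):
            found.append((a, b))
    return found


# Each pair would be handed back to the DL reasoner as an additional
# class-equivalence declaration.
EQUIVALENCES = structural_equivalences(ARTIFACTS)
```

In this toy data, stdA:Address and stdB:Location differ in structure but
share the same leaves, so the rule pairs them, while stdB:Contact stays
unmatched.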

Finally, the similarities discovered among the document artifacts are used
to automate the mapping process by generating the XSLT rules.
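
As a rough sketch of what generating XSLT from discovered equivalences
could look like, the Python below builds a stylesheet from a mapping of
source element names to target element names. The function build_xslt
and the element names are hypothetical, namespaces of the mapped elements
are ignored for brevity, and the actual SET generator may work quite
differently.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def build_xslt(element_map):
    """Build a stylesheet that renames elements per element_map and
    copies everything else unchanged (identity transform)."""
    ET.register_namespace("xsl", XSL_NS)
    root = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})

    # Identity template: copy nodes that have no specific rule.
    identity = ET.SubElement(
        root, f"{{{XSL_NS}}}template", {"match": "@*|node()"}
    )
    copy = ET.SubElement(identity, f"{{{XSL_NS}}}copy")
    ET.SubElement(
        copy, f"{{{XSL_NS}}}apply-templates", {"select": "@*|node()"}
    )

    # One renaming template per discovered equivalence.
    for source, target in sorted(element_map.items()):
        tmpl = ET.SubElement(
            root, f"{{{XSL_NS}}}template", {"match": source}
        )
        elem = ET.SubElement(
            tmpl, f"{{{XSL_NS}}}element", {"name": target}
        )
        ET.SubElement(
            elem, f"{{{XSL_NS}}}apply-templates", {"select": "@*|node()"}
        )

    return ET.tostring(root, encoding="unicode")


# Example: one made-up equivalence between two artifact names.
STYLESHEET = build_xslt({"Location": "Address"})
```

The resulting string can be saved and applied with any XSLT 1.0
processor; the Python standard library only builds the stylesheet here,
it does not execute it.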

The SET harmonized ontology contains about 4758 Named OWL Classes and 16122
Restriction Definitions conforming to the specification described in this
document, consisting of the following:
-	All of the CCs/BIEs in UN/CEFACT CCL 07B.
-	All of the BIEs in the common library of UBL 2.0.
-	All of the common library of GS1 XML. 
-	OAGIS 9.1 Common Components and Fields
-	The harmonized ontology expresses the relationships among the document
artifacts of 
	UN/CEFACT CCL, UBL 2.0, OAGIS 9.1 and GS1 XML according to SET
-	The SET harmonized ontology is publicly available from 

Regarding performance, an issue that needs to be addressed is whether the
gain in automation justifies the resources needed to develop the
ontological representation of the document schemas. In order to reduce this
cost, we provide the SET XSD-OWL Converter tool to create OWL definitions
of the document schemas. This component converts a CCTS based document
schema to a SET TC OWL Definition and is publicly available from 

Note that, by conforming to a standard ontological representation and hence
having all the document 
schema ontologies in a common pool, the users of the harmonized ontology
only need to create a 
document schema ontology if it is not already in the harmonized ontology
and benefit from all the 
existing connections when they do so.

Another issue related to performance is the computational complexity of
the reasoning process 
involved. On a PC with 2GB RAM, the Racer Pro 1.9.2 Beta reasoner takes
about 120 seconds to 
compute the harmonized ontology. SET TC Members will receive a password to
use Racer Pro
for free for three months. Considering that the harmonized ontology will be
re-computed only 
when a new document schema or a new CCTS based upper document ontology is
introduced to the 
system, this performance is quite acceptable.

This work will be discussed and further enhanced in the SET TC, and
technical support will be provided to the SET TC Members who develop their
own use cases using the ontology. The SET XSD-OWL Converter tool can be
used to generate the OWL definitions of their own document artifacts. The
aim is to demonstrate the feasibility and practicability of the
specifications to encourage industry uptake.

This document is an OASIS Semantic Support for Electronic Business Document
Interoperability (SET) TC Working Draft Profile and the work by the Editors
is realized within the scope of the ICT 213031 iSURF Project
(http://www.iSURFProject.eu) sponsored by the European Commission, DG
Enterprise Networking Unit 

Committee members should send comments on this specification to the
list. Others should subscribe to and send comments to the
list. To subscribe, send a blank email message to
Once you confirm your subscription, you may post messages at any time.  

