Subject: Rationale and semantics for the 3rd markup model
From: "Drummond Reed" <drummond.reed@cordance.net>
To: <xdi@lists.oasis-open.org>
Date: Sat, 27 Nov 2004 14:02:45 -0800

I had an action item from Wednesday's XDI TC call (for which Marc is still
working on the minutes) as follows:

# Start a thread on the list to: a) solicit our collective rationale for
using each of the three XDI markup models, and b) out of this, develop
consensus on the name we should use for the third model. 

The three XDI markup models were summarized in
http://lists.oasis-open.org/archives/xdi/200411/msg00041.html. The first two
models are both "XML-centric", i.e., in both cases XDI markup is added to
existing XML markup in order to gain the data sharing virtues of XDI while
still leveraging any existing investment in XML schema, tools, and
applications.

The only difference between the first two models comes down to whether the
existing XML schemas being marked up with XDI are extensible or not:

1) IF THE EXISTING XML SCHEMAS ARE EXTENSIBLE, then XDI markup can be placed
directly inline with the existing XML. Thus the consensus on the call was to
call this XDI markup model "Inline".

2) IF THE EXISTING XML SCHEMAS ARE NOT EXTENSIBLE, then XDI markup must be
added in an envelope wrapping the existing XML. The consensus on the call
was to call this XDI markup model "Enveloped". 

Because these two models are so clear, most of the discussion on Wednesday's
call focused on the third model, and when and why an implementer would want
to use it. Three rationales were put forth:

  1) When richer resource description is needed than is available using XML
QNames (and thus is better provided by XRIs).

  2) When the data being marked up does not yet have an XML schema.

  3) When the data being aggregated in a single XDI document does not lend
itself to a particular XML schema because it comes from many different data
sources and there is no single XML schema that makes sense for aggregating
all of it.

After the call, I realized the killer example of the third type of data is
when the resource being described is one as generalized as "person",
"organization", or "topic". It would be all but impossible to create a
single XML schema that can describe all the data that might be associated
with a person, an organization, or a topic. The reason is that these are
simply very general concepts that can be reused in thousands or millions of
data sharing applications.

It dawned on me that this made perfect sense. The universe of XML schemas
today exists mostly to describe data in particular contexts, i.e., existing
applications and data stores that already have their own schema. Moving the
data into XML in this case is the first step in being able to share it with
other applications, domains, and Web services.

But the more generalized the data, the harder it becomes to completely
satisfy this need with conventional XML, because the data is less and less
"schema-specific" and conventional XML is designed to put data in a specific
schema context.

It is for these classes of applications, where data may be reused across
many (i.e., hundreds or thousands of) different XML schemas, that the real
demand for XDI arises. Of course all 3 markup models supply the solution:
XDI markup can identify and describe the data in a schema-independent
manner, so the same logical data can be identified and shared across many
different physical XML schema instances. But if the distinguishing criterion
of the first two models is that the data already exists and makes sense to
be used in the context of existing XML schemas, then the distinguishing
criterion of the third model is that either:

a) No XML schema currently exists to provide the data context, or 
b) The context is so general that no single XML schema CAN exist to provide
sufficient context.

Although in the first case it can be argued that the easiest route to
establishing the context is to simply create a new conventional XML schema,
in the second case that's not an option. There is no such schema. The only
solution is a "generalized" or "universal" XML schema that exists to
describe things independent of any one specific XML schema context. That's
the third model.
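
As a rough illustration only, the selection rule described above can be
sketched in Python. The names MarkupModel and choose_markup_model, and the
three boolean inputs, are hypothetical and do not come from any XDI draft;
the first rationale (richer resource description via XRIs) is not modeled.

```python
from enum import Enum


class MarkupModel(Enum):
    """The three XDI markup models discussed in this thread."""
    INLINE = "Inline"
    ENVELOPED = "Enveloped"
    THIRD = "third model (name still undecided)"


def choose_markup_model(schema_exists: bool,
                        schema_extensible: bool = False,
                        single_schema_feasible: bool = True) -> MarkupModel:
    """Pick a markup model from the criteria in the message (hypothetical helper).

    schema_exists          -- an XML schema already provides the data context
    schema_extensible      -- that schema allows foreign (XDI) markup inline
    single_schema_feasible -- one XML schema could plausibly cover the data
    """
    # Criteria (a) and (b): no schema exists, or the context is too general
    # for any single schema to cover.
    if not schema_exists or not single_schema_feasible:
        return MarkupModel.THIRD
    # Otherwise the data lives in an existing schema; extensibility decides
    # whether XDI markup goes inline or in a wrapping envelope.
    return MarkupModel.INLINE if schema_extensible else MarkupModel.ENVELOPED
```

For example, data about something as general as "person" would be passed
with single_schema_feasible=False and would land on the third model
regardless of the other inputs.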

Loren predicted that the right term for the third model would emerge once
the rationale for the third model was clear. I would summarize the
rationale for using the third model as "the schema to use when you need to
express and reuse data in a generalized, universal, XML-schema-independent
format". This would suggest the following names:

General model
Universal model
Independent model
Schema-independent model
Metaschema model
Abstract model (Victor's suggestion on the call)

Votes? Other suggestions?

=Drummond