
Subject: Rationale for pursuing Dataweb architecture


XDI TC Members and Observers,

As published today in the draft minutes of the F2F meeting two weeks ago in
Denver (see
http://www.oasis-open.org/apps/org/workgroup/xdi/download.php/10001/MINUTES%20OF%2010-28-29-04%20XDI%20TC%20FACE%20TO%20FACE%20MEETING%20%28Official%29.txt),
the core topic discussed at the meeting was the two potential
architectural models the XDI TC could follow.

These can be loosely summarized as the "data envelope" or "SOAP-for-data"
model and the "Dataweb" or "HTML-for-data" model.

While most of you know I am a strong Dataweb architecture advocate, some of
the concepts from the data envelope model are very attractive, and they have
very much influenced my thinking about the Dataweb model. This is reflected
in a new schema proposal and several example documents using this schema
that I posted last night:

* New schema proposal:
http://www.oasis-open.org/committees/download.php/9988/draft-xdi-dataweb-schema-v1.xsd

* Simple XDI business card (w/all data referenced):
http://www.oasis-open.org/committees/download.php/9989/draft-example-dataweb-bizcard-short-v1.xml

* Long-form XDI business card (w/all references resolved):
http://www.oasis-open.org/committees/download.php/9990/draft-example-dataweb-bizcard-long-v1.xml

* Example of XDI Descriptor in this XDI format:
http://www.oasis-open.org/committees/download.php/9991/draft-example-dataweb-XRID-v1.xml

However, in going through this work, and after another good conversation
with Dave last Friday, I have become more deeply convinced about the Dataweb
model. This email summarizes my rationale in preparation for further
discussion on today's TC call. It breaks into three parts:

* Value proposition for the Dataweb
* The role of XDI dictionaries
* The need for an XDI Logical Data Object Model (LDOM)

VALUE PROPOSITION FOR THE DATAWEB

The root of my rationale is the core value proposition that "XDI can do for
global data sharing what the Web did for global content sharing." Here's a
more detailed way of framing that value proposition that Dave and I
discussed last Friday. It starts with the value proposition for the Web:

***Value Proposition for the Web***

With the Web, we wanted to create a single presentation engine (browser) for
all content without knowing anything directly about the content. Besides the
visualization markup, the presentation engine doesn't need to know anything
about the content.

Although this presented potentially a huge barrier to adoption - the need
for every content publisher to mark up their content in this new markup
format - there was a value proposition that successfully drove millions of
content publishers to do just that:

	"If you put your content into this format, it can be: a) rendered on
every desktop in the world, and b) referenced and linked to/from any other
content in the world, and c) searched and indexed by any content search
engine in the world."

*****

Bingo! The result is history. The greatest transformation of global
information infrastructure ever.

The core concept of the Dataweb is to do the same thing for machine-readable
data that the Web did for human-readable content. In fact, we can express
this as literally a word-for-word transposition of the above value
proposition:

***Value Proposition for the Dataweb***

With the Dataweb, we want to create a single data interchange engine
(i-broker) for all data without knowing anything directly about the data.
Besides the data control markup, the data interchange engine doesn't need to
know anything about the data.

Although this presents potentially a huge barrier to adoption - the need for
every data publisher to mark up their data in this new markup format - there
is a value proposition that can successfully drive millions of data
publishers to do just that:

	"If you put your data into this format, it can be: a) interchanged
with every system in the world, and b) referenced and linked to/from any
other data in the world, and c) searched and indexed by any database search
engine in the world."

*****

To me, this perfectly describes the goal of XDI: a common data interchange
format (represented by a single common XML schema) together with a common
data interchange service for adding, modifying, deleting, and processing XDI
documents.

DATAWEB DICTIONARIES

What's more, when we're operating at the level of machine-readable data vs.
human-readable content, I believe there is another major element to the
Dataweb value proposition that is missing (in a direct way) from the Web
value proposition: Dataweb dictionaries. Again this is probably best
described via analogy to the Web.

Arguably the single most valuable aspect of the Web is the ability to locate
desired content almost instantly, using search engines such as Google.
However, this only works because of a simple fact: human languages inherently
consist of shared dictionaries of concepts ("keywords") with which the
search engines can create their indexes. It is only due to our common
knowledge of these dictionaries (the copies we all carry around in our own
heads) that search engines can do their magic. Otherwise they wouldn't know
how to index and we wouldn't know what to enter as search criteria.

When it comes to the Dataweb, and we move from the sphere of human-readable
content to machine-readable data, this problem is magnified immensely. The
biggest single problem with sharing machine-readable data across systems is
that there are no humans in the loop to do the "fuzzy matching" that humans
are so good at (and that search engines like Google can help so much with).
In order to actually share data across systems, machines need to be able to
do *exact bit-for-bit matching*. No ambiguity.

The problem gets even worse when we consider that today there does not exist
anything close to a universal data dictionary from which such matching could
be done. In other words, it's not like the Web, where all the dictionaries
(common vocabularies of human language) already existed, and we just needed
to find a common way to represent them. With the Dataweb, the dictionaries
don't even exist yet.

In fact, the closest things to those dictionaries are the existing XML
schemas or RDF vocabularies that have been created in order to establish
common semantics for data interchange.

So I would argue that, just as it became a fundamental design goal of XML to
make XML schemas expressible in XML itself (thus leading to the W3C XML
Schema specification), it must be a fundamental design goal of XDI to make
XDI dictionaries expressible in XDI itself. Because unlike XML, which had
DTDs to turn to, XDI implementations will have no practical way of
interoperating without XDI dictionaries. XDI dictionaries are the only way
to get the direct bit-for-bit data matching necessary for true
interoperability.
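
To make the exact-matching point concrete, here is a minimal Python sketch,
assuming two systems that each map their own local field names onto entries
in a shared dictionary. The identifiers ("dict:email", "dict:phone"), the
field names, and the translate() helper are all hypothetical placeholders
for illustration; they are not XRI syntax and are not taken from any XDI
proposal.

```python
# Hypothetical shared dictionary: each entry is one exact identifier.
SHARED_DICTIONARY = {"dict:email", "dict:phone"}

# Each system declares how its local field names map to dictionary entries.
SYSTEM_A_FIELDS = {"EmailAddr": "dict:email", "Tel": "dict:phone"}
SYSTEM_B_FIELDS = {"e_mail": "dict:email", "phone_number": "dict:phone"}


def translate(record, source_fields, target_fields):
    """Re-key a record from one system's field names to another's.

    Matching is by exact identifier equality only. A field that has no
    dictionary entry on both sides is dropped rather than guessed at.
    """
    # Invert the target mapping: dictionary identifier -> target field name.
    target_by_id = {ident: name for name, ident in target_fields.items()}
    result = {}
    for name, value in record.items():
        ident = source_fields.get(name)
        if ident in SHARED_DICTIONARY and ident in target_by_id:
            result[target_by_id[ident]] = value
    return result


record_a = {"EmailAddr": "alice@example.com", "Tel": "+1-555-0100", "Nickname": "Al"}
record_b = translate(record_a, SYSTEM_A_FIELDS, SYSTEM_B_FIELDS)
print(record_b)
```

Note that "Nickname" is silently dropped: with no human in the loop, a field
that has no dictionary entry cannot be matched at all, which is the
interoperability gap described above.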

THE NEED FOR AN XDI LOGICAL DATA OBJECT MODEL (LDOM)

As discussed above, the Web solved the problem of content interoperability
by adopting a single markup format, HTML, which any rendering engine
(browser) could display. This common format, which later led to the
development of XML, also led to a common object model for parsing and
manipulating "document objects". This was the Document Object Model (DOM).

It follows that if data-oriented systems are to adopt a common model for
data interchange, and if this model is to be based on a common XML data
format, this format must reflect a common logical data object model, or
LDOM.

To be universal, the LDOM must be very simple and capable of expressing
fundamental relationships between data elements the same way XML expresses
fundamental relationships between content elements. In the work over the
past six months, we have been looking at XDI schema proposals that boiled
this down to just two types of relationships: 1) hierarchical relationships,
and 2) peer-to-peer, or "web" relationships.

The other key requirement of an LDOM is that every data element be uniquely
addressable (just as it is in a database). Thus the requirement in the
schema proposals so far that every resource be addressable via at least one
XRI.

A successful LDOM, then, would be representable in a single XML schema that,
while capable of carrying existing XML data as a "payload", would inherently
require markup of some metadata into this new format, just as HTML was
capable of carrying existing text and graphics but required at least some
markup in HTML format.
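
As a rough illustration of these requirements (two relationship types,
unique addressability, and an opaque payload), here is a minimal Python
sketch. The class and field names (Resource, address, children, links,
payload) and the "example:" addresses are my own assumptions for
illustration; they are not drawn from the schema proposal and are not real
XRIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    """One addressable data element in a hypothetical LDOM."""
    address: str                   # placeholder standing in for at least one XRI
    payload: Optional[str] = None  # opaque existing data, e.g. an XML fragment
    children: List["Resource"] = field(default_factory=list)  # hierarchical
    links: List[str] = field(default_factory=list)  # peer-to-peer, by address


def index_by_address(root: Resource) -> Dict[str, Resource]:
    """Walk the hierarchy and build an address -> resource lookup.

    Raises ValueError on an empty or duplicate address, since every
    element must be uniquely addressable.
    """
    index: Dict[str, Resource] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.address:
            raise ValueError("resource without an address")
        if node.address in index:
            raise ValueError(f"duplicate address: {node.address}")
        index[node.address] = node
        stack.extend(node.children)
    return index


def resolve_links(node: Resource, index: Dict[str, Resource]) -> List[Resource]:
    """Follow a node's peer links; addresses not in the index are skipped."""
    return [index[a] for a in node.links if a in index]


email = Resource("example:alice/email", payload="<email>alice@example.com</email>")
card = Resource("example:alice/card", links=["example:alice/email"])
root = Resource("example:alice", children=[email, card])

index = index_by_address(root)
print(sorted(index))
print([r.payload for r in resolve_links(card, index)])
```

The code never inspects the payload. It only needs the addressing and
relationship metadata, which is the same division of labor as between HTML
markup and the text it carries.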

That, in a nutshell, is what I believe we should be driving for with the XDI
schema.

***EOM***





