Subject: Rationale for pursuing Dataweb architecture
XDI TC Members and Observers,

As published today in the draft minutes of the F2F meeting two weeks ago in Denver (see http://www.oasis-open.org/apps/org/workgroup/xdi/download.php/10001/MINUTES%20OF%2010-28-29-04%20XDI%20TC%20FACE%20TO%20FACE%20MEETING%20%28Official%29.txt), the core topic discussed at the meeting was the two potential architectural models the XDI TC could follow. These can be loosely summarized as the "data envelope" or "SOAP-for-data" model and the "Dataweb" or "HTML-for-data" model.

While most of you know I am a strong Dataweb architecture advocate, some of the concepts from the data envelope model are very attractive, and they have very much influenced my thinking about the Dataweb model. This is reflected in a new schema proposal and several example documents using this schema that I posted last night:

* New schema proposal:
  http://www.oasis-open.org/committees/download.php/9988/draft-xdi-dataweb-schema-v1.xsd
* Simple XDI business card (w/all data referenced):
  http://www.oasis-open.org/committees/download.php/9989/draft-example-dataweb-bizcard-short-v1.xml
* Long-form XDI business card (w/all references resolved):
  http://www.oasis-open.org/committees/download.php/9990/draft-example-dataweb-bizcard-long-v1.xml
* Example of XDI Descriptor in this XDI format:
  http://www.oasis-open.org/committees/download.php/9991/draft-example-dataweb-XRID-v1.xml

However, in going through this work, and after another good conversation with Dave last Friday, I have become more deeply convinced of the Dataweb model. This email summarizes my rationale in preparation for further discussion on today's TC call. It breaks into three parts:

* Value proposition for the Dataweb
* The role of XDI dictionaries
* The need for an XDI Logical Data Object Model (LDOM)

VALUE PROPOSITION FOR THE DATAWEB

The root of my rationale is the core value proposition that "XDI can do for global data sharing what the Web did for global content sharing."
Here's a more detailed way of framing that value proposition that Dave and I discussed last Friday. It starts with the value proposition for the Web:

***Value Proposition for the Web***

With the Web, we wanted to create a single presentation engine (browser) for all content without knowing anything directly about the content. Besides the visualization markup, the presentation engine doesn't need to know anything about the content. Although this presented potentially a huge barrier to adoption - the need for every content publisher to mark up their content in this new markup format - there was a value proposition that successfully drove millions of content publishers to do just that:

"If you put your content into this format, it can be:
a) rendered on every desktop in the world, and
b) referenced and linked to/from any other content in the world, and
c) searched and indexed by any content search engine in the world."

*****

Bingo! The result is history. The greatest transformation of global information infrastructure ever.

The core concept of the Dataweb is to do the same thing for machine-readable data that the Web did for human-readable content. In fact, we can express this as a literal word-for-word transposition of the above value proposition:

***Value Proposition for the Dataweb***

With the Dataweb, we want to create a single data interchange engine (i-broker) for all data without knowing anything directly about the data. Besides the data control markup, the data interchange engine doesn't need to know anything about the data.
Although this presents potentially a huge barrier to adoption - the need for every data publisher to mark up their data in this new markup format - there is a value proposition that can successfully drive millions of data publishers to do just that:

"If you put your data into this format, it can be:
a) interchanged with every system in the world, and
b) referenced and linked to/from any other data in the world, and
c) searched and indexed by any database search engine in the world."

*****

To me, this perfectly describes the goal of XDI: a common data interchange format (represented by a single common XML schema) together with a common data interchange service for adding, modifying, deleting, and processing XDI documents.

DATAWEB DICTIONARIES

What's more, when we're operating at the level of machine-readable data vs. human-readable content, I believe there is another major element of the Dataweb value proposition that is missing (in a direct way) from the Web value proposition: Dataweb dictionaries.

Again, this is probably best described via analogy to the Web. Arguably the single most valuable aspect of the Web is the ability to locate desired content almost instantly, using search engines such as Google. However, this only works because of a simple fact: human languages inherently consist of shared dictionaries of concepts ("keywords") with which the search engines can create their indexes. It is only due to our common knowledge of these dictionaries (the copies we all carry around in our own heads) that search engines can do their magic. Otherwise they wouldn't know how to index and we wouldn't know what to enter as search criteria.

When it comes to the Dataweb, and we move from the sphere of human-readable content to machine-readable data, this problem is magnified immensely.
The biggest single problem with sharing machine-readable data across systems is that there are no humans in the loop to do the "fuzzy matching" that humans are so good at (and that search engines like Google can help so much with). In order to actually share data across systems, machines need to be able to do *exact bit-for-bit matching*. No ambiguity.

The problem gets even worse when we consider that today there does not exist anything close to a universal data dictionary from which such matching could be done. In other words, it's not like the Web, where all the dictionaries (common vocabularies of human language) already existed, and we just needed to find a common way to represent them. With the Dataweb, the dictionaries don't even exist yet. In fact, the closest things to those dictionaries are the existing XML schemas or RDF vocabularies that have been created in order to establish common semantics for data interchange.

So I would argue that, just as it became a fundamental design goal of XML to make XML schemas expressible in XML itself (thus leading to the W3C XML Schema specification), it must be a fundamental design goal of XDI to make XDI dictionaries expressible in XDI itself. Unlike XML, which had DTDs to turn to, XDI implementations will have no practical way of interoperating without XDI dictionaries. XDI dictionaries are the only way to get the direct bit-for-bit data matching necessary for true interoperability.

THE NEED FOR AN XDI DATA OBJECT MODEL

As discussed above, the Web solved the problem of content interoperability by adopting a single markup format, HTML, which any rendering engine (browser) could display. This common format, which later led to the development of XML, also led to a common object model for parsing and manipulating "document objects". This was the Document Object Model (DOM).
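
For anyone who wants the DOM reference made concrete, here is a minimal sketch of what "parsing and manipulating document objects" looks like, using Python's standard xml.dom.minidom module. The sample markup and the two helper function names are invented purely for illustration; they are not taken from any XDI schema proposal.

```python
from xml.dom.minidom import parseString

# Invented sample markup (an assumption for illustration only).
SAMPLE = "<card><name>Alice Example</name><email>alice@example.com</email></card>"


def child_element_names(xml_text):
    """Parse the text into a DOM tree and list the root's child element names."""
    root = parseString(xml_text).documentElement
    return [n.tagName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]


def set_element_text(xml_text, tag, new_text):
    """Replace the text of the first <tag> element and return the serialized root."""
    doc = parseString(xml_text)
    element = doc.getElementsByTagName(tag)[0]
    # The element's first child is its text node; overwrite the character data.
    element.firstChild.data = new_text
    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(child_element_names(SAMPLE))
    print(set_element_text(SAMPLE, "email", "alice@example.org"))
```

The point of the sketch is only that the same generic tree operations work on any well-formed document, without the code knowing what the elements mean.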
It follows that if data-oriented systems are to adopt a common model for data interchange, and if this model is to be based on a common XML data format, this format must reflect a common logical data object model, or LDOM.

To be universal, the LDOM must be very simple and capable of expressing fundamental relationships between data elements the same way XML expresses fundamental relationships between content elements. In the work over the past six months, we have been looking at XDI schema proposals that boiled this down to just two types of relationships: 1) hierarchical relationships, and 2) peer-to-peer, or "web", relationships.

The other key requirement of an LDOM is that every data element be uniquely addressable (just as it is in a database). This is why the schema proposals so far require that every resource be addressable via at least one XRI.

A successful LDOM, then, would be representable in a single XML schema that, while capable of carrying existing XML data as a "payload", would inherently require markup of some metadata into this new format, just as HTML was capable of carrying existing text and graphics but required at least some markup in HTML format.

That, in a nutshell, is what I believe we should be driving for with the XDI schema.

***EOM***