topicmaps-comment message

Subject: RDF/Topic Maps: late/lazy reification vs. early/preemptive reification(was: Re: [topicmaps-comment] RE: OASIS vs W3C)
From: "Steven R. Newcomb" <srn@coolheads.com>
To: larsga@garshol.priv.no, sam_hunting@yahoo.com, cmsmcq@acm.org
Date: Thu, 27 Sep 2001 17:56:33 -0500

For me, at least, the shortest, most compelling and
cogent demonstration of a certain critical difference
between Topic Maps and RDF was Michael
Sperberg-McQueen's wrap-up keynote at the Extreme
Markup Languages Conference (www.extrememarkup.com)
last August.

N.B.: This note is about *what I learned* from
      Michael's presentation, and it does not
      necessarily reflect Michael's views, or even
      constitute an accurate account of Michael's
      presentation.  It's merely what I remember about
      it.  (I love Michael's wrap-ups at the Extreme
      conferences.  It's a good thing he traditionally
      speaks last, because he's a hard act to follow.)

Michael brought colored ribbons and other paraphernalia
to the podium, in order to illustrate his words.

"Tom buttered the bread," was the statement Michael
wanted to represent.  There being no volunteers in the
audience named "Tom", Michael appointed our Conference
Chair, Tommie Usdin, to represent the "Tom" node.  Syd
Bauman, as I recall, was appointed to represent the
"bread" node.  A blue ribbon between Tommie and Syd
represented the arc representing the statement that Tom
buttered the bread.

  [Use a monospace font, such as courier, if you want
  to see the ASCII art as it was intended to be seen.]

     Tommie -----------> Syd
     ("Tom")   (blue     ("the bread")
               ribbon)

So far, so good.  "Now," Michael said, "What if I want
to say that Tom buttered the bread with a knife?  In
order to attach the knife to this statement, I need a
node for the knife, and I also need a node to
represent the buttering itself.  (There must be some
sort of a 'buttering event' going on here.)"  After
everyone had finished laughing over our internal
visualizations of "a buttering event", Kate Hamilton
was appointed to be the node that represented the
"buttering event".  A differently colored ribbon was
used to connect Tommie to Kate, and Kate to Syd.  Now
there was a triangle of ribbons, because Tommie was
still *also* buttering Syd by virtue of the original blue
ribbon.

                Kate ("the Buttering")
                 /\
                /  \
               /    \
              /      \
             /        \
            /          \
           /            \
     Tommie -----------> Syd
     ("Tom")   (blue     ("the bread")
               ribbon)


Now, with Kate in existence, it was possible
to use yet another ribbon color to connect the knife to
Kate (the "buttering event").

knife --------- Kate ("the Buttering")
                 /\
                /  \
               /    \
              /      \
             /        \
            /          \
           /            \
     Tommie -----------> Syd
     ("Tom")   (blue     ("the bread")
               ribbon)

So now Kate was holding one end of each of three
ribbons: one to Tom, one to the bread, and one to the
knife.  Michael then proposed to further modify the
statement: "Tom buttered the bread with a knife *on
Friday*".  Yet another volunteer became "Friday", and
yet another ribbon was given to Kate, the other end of
which was "Friday".  

knife --------- Kate ("the Buttering")
              /  /\
Friday ------+  /  \
               /    \
              /      \
             /        \
            /          \
           /            \
     Tommie -----------> Syd
     ("Tom")   (blue     ("the bread")
               ribbon)

It was clear that Michael could have gone on to attach
any number of things to Kate; the "buttering event" had
a limitless capacity to be related to other things.
Indeed, by the end of Michael's wrap-up keynote, Kate
was already holding one end of several ribbons,
including the two ribbons needed to connect Tommie
(Tom) to Syd (the bread).

It was also clear that, once "the buttering event"
existed as a distinct node, it was no trouble at all to
say anything about that event.  However, *before* Kate
was appointed to be that node, there was no way to say
anything about the buttering event.
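
As a rough sketch of this point, the ribbons can be modeled in plain Python as (source, label, target) triples. The arc labels ("agent", "patient", "instrument", "time") and the node names are invented for this illustration; they are not taken from Michael's talk or from any RDF or Topic Maps tool.

```python
# Each ribbon is a (source, label, target) triple. All names here are
# illustrative assumptions, not part of any RDF or Topic Maps API.
direct = {("Tom", "buttered", "the bread")}

# With only the direct arc, the knife has nothing to attach to: the
# buttering is a label on an arc, not a node that can end another arc.
nodes_direct = {s for s, _, _ in direct} | {o for _, _, o in direct}

# Once the buttering is a node (Kate), further facts are just more arcs.
with_event = {
    ("the buttering", "agent", "Tom"),
    ("the buttering", "patient", "the bread"),
    ("the buttering", "instrument", "knife"),
    ("the buttering", "time", "Friday"),
}
nodes_event = {s for s, _, _ in with_event} | {o for _, _, o in with_event}

print("the buttering" in nodes_direct)  # False
print("the buttering" in nodes_event)   # True
```

In the second graph, adding the knife or Friday never touches the arcs that were already there.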

After the "buttering event" node represented by Kate
was brought into existence, the combination of itself
with its arcs to Tommie (Tom) and to Syd (the bread)
was sufficient to represent the fact that "Tom buttered
the bread".  Therefore, once the "buttering event"
existed, there was no further need for the original
blue ribbon connecting Tommie (Tom) and Syd (the
bread).  The blue ribbon was redundant, and it
unnecessarily complicated the graph of ribbons and
nodes.  The blue ribbon should go away, right?

knife --------- Kate ("the Buttering")
              /  /\
Friday ------+  /  \
               /    \
              /      \
             /        \
            /          \
           /            \
     Tommie              Syd
     ("Tom")             ("the bread")

In Topic Maps, there is no way to say "Tom buttered the
bread" without creating an explicit "buttering event"
-- a "buttering association" between Tom and the bread.
Instead of making a direct connection between Tom and
the bread, Topic Maps forces us to create a "buttering
event" node, and to connect "Tom" and "the bread" to
that node.  The advantage here is that we can always
say something new about anything that already exists,
because even the "verbs" in Topic Maps (such as "to
butter") are necessarily already "noun-ified" (such as
"the buttering") and are ready to be addressed as the
ends of additional arcs.  This has significant
advantages: it simplifies the process of amalgamating
facts and opinions when you can't know in advance which
things anyone will want to express a new fact or
opinion about.  If someone wants to say something about
"Tom"'s buttering of "the bread", there is guaranteed
to be something to which those remarks can be attached.
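
A minimal sketch of that guarantee, assuming a much-simplified stand-in for a Topic Maps association (the `Association` class, its fields and the role names are hypothetical, and this is not the actual Topic Maps data model or its interchange syntax): the relationship is an object from the start, so later remarks have something to attach to.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    """A relationship that is itself addressable (hypothetical, simplified)."""
    type: str
    roles: dict = field(default_factory=dict)  # role name -> role player

    def add_role(self, role, player):
        # New information is only ever added; existing roles are untouched.
        self.roles[role] = player


# "Tom buttered the bread" cannot be stated here without creating the
# buttering itself.
buttering = Association("buttering", {"butterer": "Tom", "buttered": "the bread"})

# Later remarks attach to the same object.
buttering.add_role("instrument", "knife")
buttering.add_role("time", "Friday")

print(sorted(buttering.roles))
```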

In RDF, we are not forced to create a "buttering event"
node in order to say "Tom buttered the bread".  We can
simply connect "Tom" to "the bread" directly.  This has
significant advantages if it can be accurately assumed
that nobody will need to say something about the
buttering: 

* There are many fewer nodes and arcs to worry about.

* Perhaps more significantly, verbs remain verbs.  Many
  people, especially computer jockeys who have not been
  steeped in the traditions of markup languages,
  application-independent information interchange and
  self-describing documents, are more comfortable with
  verbs (processes) than with nouns.  This is not a bad
  thing.  It is only the simple truth that, if you're
  focusing on implementing the application of butter to
  bread, it would only be distracting and annoying to
  try to provide for unanticipatable commentaries and
  constraints on specific "butterings".

RDF provides a process, called "reification", whereby
an arc can be alternatively represented as a node when
it is discovered that someone wants to say something
about it.  ("Reification" literally means
"thing-ification" or "noun-ification" -- transformation
into a thing.  The term "reification" is derived from
the Latin noun "res" (pronounced like "race"), which
means "thing".)  When Michael used Kate Hamilton (the
"buttering event") to be the surrogate of the arc
represented by the blue ribbon, he was reifying the
blue ribbon.  The arc became a node (and two new arcs).
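
For concreteness, here is a minimal sketch of an arc being turned into a node, with triples held as plain Python tuples. The property names rdf:type, rdf:subject, rdf:predicate and rdf:object and the class rdf:Statement come from RDF's reification vocabulary; the `reify` helper, the `ex:` names and the node identifier are assumptions made up for this example. Strictly speaking, that vocabulary describes the statement rather than the buttering event, and it yields four arcs rather than two; the sketch glosses over that distinction in the same way the ribbon demonstration does.

```python
def reify(triple, node_id):
    """Return triples that describe `triple` as a node named `node_id`."""
    s, p, o = triple
    return {
        (node_id, "rdf:type", "rdf:Statement"),
        (node_id, "rdf:subject", s),
        (node_id, "rdf:predicate", p),
        (node_id, "rdf:object", o),
    }


# The original direct arc (the blue ribbon); ex: names are hypothetical.
blue_ribbon = ("ex:Tom", "ex:buttered", "ex:theBread")

# Reify only when someone needs to talk about the arc.
graph = {blue_ribbon} | reify(blue_ribbon, "ex:kate")

# Now there is a node to which a new remark can be attached.
graph.add(("ex:kate", "ex:instrument", "ex:knife"))

print(len(graph))
```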

In RDF, reification involves changing the graph that
results from processing interchangeable RDF statements.
In Topic Maps, however, everything is already reified.
No existing arcs need be changed when new information
comes along.  New arcs and nodes are added, and these
additions are the only changes that are required.  This
comparative changelessness can be extremely important.
If you find something in a graph, and you make a record
of the arcs you traversed in order to find it, you may
want to be able to use that same set of arcs to find
the same thing at some future date.  If some of those
arcs disappear, you may not be able to retrace your
steps.  If, on the other hand, the process of
reification does *not* cause the arcs whose functions
have been duplicated to disappear, then we have a
situation in which a considerable amount of redundant
information is contributing to our infoglut problem.
Either way, a policy of "late reification" (or maybe we
should call it "lazy reification") causes problems for
the usefulness of continuously-amalgamated knowledge.
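
The two horns of this dilemma can be sketched as follows. This is an illustration only: the `reify_event` and `path_holds` helpers and the arc labels are hypothetical, and a "recorded path" is simply a list of arcs that someone traversed before the reification took place.

```python
def reify_event(graph, triple, event, drop_original):
    """Return a new graph in which `triple` is also expressed via `event`."""
    s, p, o = triple
    new = set(graph) | {
        (event, "type", p),
        (event, "agent", s),
        (event, "patient", o),
    }
    if drop_original:
        new.discard(triple)  # the blue ribbon goes away
    return new


def path_holds(graph, path):
    """True if every recorded arc is still present in the graph."""
    return all(arc in graph for arc in path)


blue = ("Tom", "buttered", "the bread")
graph = {blue}
recorded_path = [blue]  # arcs traversed before anyone reified anything

dropped = reify_event(graph, blue, "the buttering", drop_original=True)
kept = reify_event(graph, blue, "the buttering", drop_original=False)

print(path_holds(dropped, recorded_path))  # False: the steps cannot be retraced
print(path_holds(kept, recorded_path))     # True, but the old arc is now redundant
print(len(dropped), len(kept))             # 3 4
```

With early reification neither case arises, because the event node and its arcs exist from the start and are never replaced.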

Does this mean that I'm pro-Topic Maps and anti-RDF?
No, not at all!  These two paradigms have great need
for each other.

* RDF needs Topic Maps in order to make scalable
  management of knowledge emanating from disparate
  sources simple, practical and predictable.
  Enlightened self-interest dictates that the RDF camp
  consider Topic Maps as an important and basic RDF
  application.

* Topic Maps needs RDF in order to have a popular,
  widely-accepted basis upon which to describe exactly
  what a topic map means, in a fashion that will be
  immediately processable by a significant number of
  existing and well-funded tools.  The PMTM4 model is
  an example of a model of the meaning of Topic Maps
  that can easily be translated into RDF -- once, and
  for all topic maps.

  If the PMTM4 model is adopted for this purpose, the
  corresponding RDF arcs will never need to be reified,
  even the very first time someone needs to make an
  assertion about a "buttering".

In the past, I myself have considered RDF as the
competitor of Topic Maps.  Happily, I was wrong -- at
least in fundamental technical terms.  Indeed, I now
believe that if there were no RDF, the Topic Maps camp
would have to invent something like it in order to make
the Topic Maps paradigm predictably comprehensible by the
programmers who are pioneering the development of the
Internet.

There are other interesting comparisons to be made
between RDF and Topic Maps, but ever since Michael's
demonstration of the difference between early and late
(preemptive vs. lazy) reification, I have been meaning
to document both the difference and the demonstration.
Thanks for reading it.

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA