topicmaps-comment message

Subject: [topicmaps-comment] Re: RDF/Topic Maps: late/lazy reification vs. early/preemptive reification

From: Piotr Kaminski <piotr@ideanest.com>
To: "Steven R. Newcomb" <srn@coolheads.com>, larsga@garshol.priv.no, sam_hunting@yahoo.com, cmsmcq@acm.org
Date: Fri, 28 Sep 2001 17:32:49 -0700

Steve:

Thanks for a clear and accurate description of reification in TM and RDF.
I agree with every technical point you've made.  However, I think your
analysis missed an important point:  reification mechanisms differ not
only in terms of early vs. late, but also in terms of the depth of
reification permitted.

Let me explain with the help of an example. [1]  Suppose we wish to model
a network of clubs and the membership of each.  The natural representation
in RDF [2] is to have a number of club and person resources, and attach
them appropriately with member-property arcs.  In TM, each club and person
would be a subject, and we'd have one membership association per club,
with the club playing the "organization" role, and the people playing the
"member" role (one player per member).
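
To make the two representations concrete, here is a minimal Python sketch.
It assumes plain tuples and dicts rather than any real RDF or Topic Maps
API; the club names, person names, and helper functions are all
hypothetical.

```python
# Hypothetical data: tuples and dicts stand in for real RDF / TM structures.

# RDF style: one (subject, property, object) statement per member.
rdf_statements = {
    ("chess_club", "member", "alice"),
    ("chess_club", "member", "bob"),
    ("yacht_club", "member", "alice"),
}

# TM style: one membership association per club, listing (role, player) pairs.
tm_associations = [
    {"type": "membership",
     "roles": [("organization", "chess_club"),
               ("member", "alice"), ("member", "bob")]},
    {"type": "membership",
     "roles": [("organization", "yacht_club"), ("member", "alice")]},
]

def rdf_members(club):
    """Members of `club` according to the RDF-style statements."""
    return sorted(o for (s, p, o) in rdf_statements
                  if s == club and p == "member")

def tm_members(club):
    """Members of `club` according to the TM-style associations."""
    members = []
    for assoc in tm_associations:
        if ("organization", club) in assoc["roles"]:
            members += [player for (role, player) in assoc["roles"]
                        if role == "member"]
    return sorted(members)

print(rdf_members("chess_club"))
print(tm_members("chess_club"))
```

Both lookups return the same members; the difference is only in how the
membership facts are grouped.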

So far, so good.  However, the clubs are very exclusive, and membership
needs to be sponsored.  We, of course, wish to record the sponsor(s) of
each member of each club.  Sponsorship is not a property of the club
itself, since it has many members each with different sponsors.  It is
also not a property of the people themselves, since each is a member of
multiple clubs, probably with different sponsors.

In RDF, a natural approach would be to reify each "member" statement, then
attach the statement resource to the sponsors with sponsor-property arcs.
This would unambiguously communicate the sponsors of each person's
membership in each club.
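
A rough sketch of that reification in Python, assuming triples are plain
tuples in a set. The rdf:type, rdf:subject, rdf:predicate and rdf:object
property names come from the RDF reification vocabulary; the node
identifiers, the "sponsor" property, and the helper functions are
hypothetical.

```python
def reify(triples, statement, node_id):
    """Add the four reification triples describing `statement` to `triples`."""
    s, p, o = statement
    triples.update({
        (node_id, "rdf:type", "rdf:Statement"),
        (node_id, "rdf:subject", s),
        (node_id, "rdf:predicate", p),
        (node_id, "rdf:object", o),
    })
    return node_id

# Hypothetical data: alice belongs to two clubs, with a different sponsor in each.
triples = {
    ("chess_club", "member", "alice"),
    ("yacht_club", "member", "alice"),
}
m1 = reify(triples, ("chess_club", "member", "alice"), "_:m1")
m2 = reify(triples, ("yacht_club", "member", "alice"), "_:m2")

# The sponsor arcs hang off the statement resources, not the club or person.
triples.add((m1, "sponsor", "bob"))
triples.add((m2, "sponsor", "carol"))

def sponsors(club, person):
    """Sponsors of `person`'s membership in `club`."""
    nodes = {n for (n, p, o) in triples if p == "rdf:subject" and o == club}
    nodes &= {n for (n, p, o) in triples if p == "rdf:predicate" and o == "member"}
    nodes &= {n for (n, p, o) in triples if p == "rdf:object" and o == person}
    return sorted(o for (n, p, o) in triples if n in nodes and p == "sponsor")

print(sponsors("chess_club", "alice"))
print(sponsors("yacht_club", "alice"))
```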

In TM, the situation is far trickier.  People play roles in a membership
association.  We'd now like to make statements about each "playing".
Unfortunately, TM doesn't preemptively reify this relationship, and does
not provide us with any special mechanism to do so. [3]

What can we do?  We can break apart the original membership association
into a number of binary membership associations, one per
member-organization pair.  I would argue that this is equivalent to
performing a (late) manual reification in RDF; we're splitting up an
existing association so that each "playing" relationship is represented by
a separate (preemptively) reified association.  We now have the same
problems:  either we delete the original association, or we end up with
duplicated representations of the same underlying information.  These
issues were well explored in the original note, so I won't dwell on them
further here.
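
The splitting step can be sketched as follows, again assuming dicts as a
stand-in for topic map associations. The "id" field, the id format, and the
sponsor lookup table are hypothetical conveniences, not part of any TM
syntax.

```python
# Hypothetical n-ary membership association: one organization, many members.
original = {
    "id": "chess_membership",
    "type": "membership",
    "roles": [("organization", "chess_club"),
              ("member", "alice"), ("member", "bob")],
}

def split_membership(assoc):
    """Return one binary membership association per member of `assoc`."""
    org = next(p for (role, p) in assoc["roles"] if role == "organization")
    return [
        {"id": f"{org}/{p}", "type": "membership",
         "roles": [("organization", org), ("member", p)]}
        for (role, p) in assoc["roles"] if role == "member"
    ]

binary = split_membership(original)

# Each binary association is now something we can make statements about.
# Here "sponsorship" is just a lookup table keyed by association id.
sponsors = {"chess_club/alice": ["carol"], "chess_club/bob": ["alice"]}

for assoc in binary:
    print(assoc["id"], "sponsored by", sponsors[assoc["id"]])

# If `original` is kept alongside `binary`, every membership is recorded twice.
```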

Let me now generalize from this slightly contrived example.  While TM
preemptively reifies all associations, it doesn't preemptively reify all
primitive relationships implicit in the model (e.g. the "playing"
relationships).  As long as we only want to make statements about the
first level of reified relationships (associations), everything's fine and
very convenient -- much more so than in RDF.  However, if we ever need to
make statements about the relationships that make up an association, TM
provides no mechanisms to do this transparently.

Moreover, even if TMs were to automatically reify the "playing"
relationships into associations, those associations would have their own
players, and hence their own "playing" relationships, ad infinitum.  It
may well be reasonable to limit the level to which reification can be
applied, invoking the "80% rule".  However, it must then be acknowledged
that the modeling formalism is not universal, and will fail to
satisfactorily model some (hopefully small) fraction of real-world
situations.
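
The regress can be illustrated with a small Python sketch. It assumes, purely
for illustration, that each "playing" would be reified as an association with
two role players of its own (the roles "container" and "occupant" are
invented names); nothing here reflects how TM actually behaves.

```python
def reify_playings(playings):
    """Reify each (association, role, player) playing as a new association.

    Returns the playings that the new associations themselves introduce.
    """
    new_playings = []
    for (assoc, role, player) in playings:
        reified = ("playing", assoc, role, player)  # id of the new association
        new_playings.append((reified, "container", assoc))
        new_playings.append((reified, "occupant", player))
    return new_playings

# Hypothetical starting point: one membership association with three playings.
level = [
    ("m1", "organization", "chess_club"),
    ("m1", "member", "alice"),
    ("m1", "member", "bob"),
]

counts = []
for depth in range(4):
    counts.append(len(level))
    level = reify_playings(level)

print(counts)  # [3, 6, 12, 24]
```

Under these assumptions every level of reification leaves a fresh, larger set
of unreified playings behind it, so the process never bottoms out.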

Note that this problem doesn't exist in RDF, since we can keep reifying
the statements created in a manual reification.  In other words, RDF
doesn't impose an arbitrary limit on the depth of reification.
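
A minimal sketch of repeated reification, assuming the same tuple-based
triples as a stand-in for an RDF store. The reification property names are
from the RDF vocabulary; the "sponsor", "approved_by" and "recorded_on"
properties and the node ids are hypothetical.

```python
import itertools

_ids = itertools.count(1)

def reify(triples, statement):
    """Reify `statement` and return the new statement resource's id."""
    node = f"_:s{next(_ids)}"
    s, p, o = statement
    triples.update({
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    })
    return node

triples = {("chess_club", "member", "alice")}

# Level 1: a statement about the membership statement.
membership = reify(triples, ("chess_club", "member", "alice"))
sponsorship = (membership, "sponsor", "bob")
triples.add(sponsorship)

# Level 2: a statement about the sponsorship statement.
sponsorship_node = reify(triples, sponsorship)
approval = (sponsorship_node, "approved_by", "carol")
triples.add(approval)

# Level 3: a statement about the approval statement, and so on as needed.
approval_node = reify(triples, approval)
triples.add((approval_node, "recorded_on", "2001-09-28"))

print(len(triples))
```

Each level uses exactly the same mechanism as the one before, which is why no
depth limit arises.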

In summary, I think it's important to consider the "allowed depth" axis
when talking about reification mechanisms.  This axis is orthogonal to the
"late/early" axis, but has equally serious consequences for the resulting
models.  Right now, RDF is "late, arbitrary depth"; TM is "early, fixed
depth".  I don't think there are any benefits to "late, fixed depth" (other
than extreme simplicity), and in my own work I'm exploring a metamodel
with an "early, arbitrary depth" reification mechanism.

        -- P.

[1] Unfortunately, I can't come up with a sensible variation on the
"buttering" example, fun as it is.
[2] Leaving aside the option of using containers, which wouldn't greatly
affect the following analysis other than adding an extra level of
indirection.
[3] As far as I know -- my argument rests on this point, so please correct
me if I'm wrong.

--
  Piotr Kaminski <piotr@ideanest.com>  http://www.ideanest.com/
  "It's the heart afraid of breaking that never learns to dance."

Follow-Ups:
- Re: [topicmaps-comment] Re: RDF/Topic Maps: late/lazy reification vs. early/preemptive reification
  - From: "Steven R. Newcomb" <srn@coolheads.com>
- Re: [topicmaps-comment] Re: RDF/Topic Maps: late/lazy reification vs. early/preemptive reification
  - From: "Thomas B. Passin" <tpassin@home.com>

References:
- Re: [topicmaps-comment] RE: OASIS vs W3C
  - From: Tony.Coates@reuters.com
- Re: [topicmaps-comment] RE: OASIS vs W3C
  - From: "Thomas B. Passin" <tpassin@home.com>
- Re: [topicmaps-comment] RE: OASIS vs W3C
  - From: Lars Marius Garshol <larsga@garshol.priv.no>
- RDF/Topic Maps: late/lazy reification vs. early/preemptive reification (was: Re: [topicmaps-comment] RE: OASIS vs W3C)
  - From: "Steven R. Newcomb" <srn@coolheads.com>