xri message

Subject: [xri] [Glossary] Definition of "Resource" and "Attribute" (long)
From: Drummond Reed <Drummond.Reed@onename.com>
To: "XRI List (E-mail)" <xri@lists.oasis-open.org>
Date: Wed, 26 Feb 2003 23:21:58 -0800
[Ed. Note: One of the action items coming out of last week's F2F was the
completion of the XRI TC Glossary. Marc Le Maitre and I were assigned as
glossary editors. This is the first of several glossary-related threads -
we'll mark these threads with "[Glossary]" in the subject line for filtering
purposes.]

One open issue we did not have time to resolve at the F2F was the definition
of the term "attribute". After several hours thinking about this since the
F2F (some of us technical glossary wonks are truly twisted in this respect
;-), and after reviewing the URI spec (RFC 2396), I realized the source of
this problem was the definition we had agreed on for the term "resource".
This message: a) explains in some depth why I believe we need to clarify
this definition and b) proposes a new definition for "attribute" consistent
with this definition of "resource".

REVISITING OUR DEFINITION OF RESOURCE

At the F2F, we agreed to adopt the definition of "resource" supplied by the
URI spec. To wit:

         "A resource can be anything that has identity.  Familiar examples
include an electronic document, an image, a service (e.g., "today's weather
report for Los Angeles"), and a collection of other resources.  Not all
resources are network "retrievable"; e.g., human beings, corporations, and
bound books in a library can also be considered resources.
	The resource is the conceptual mapping to an entity or set of
entities, not necessarily the entity which corresponds to that mapping at
any particular instance in time.  Thus, a resource can remain constant even
when its content---the entities to which it currently corresponds---changes
over time, provided that the conceptual mapping is not changed in the
process."

So at the F2F we agreed that a resource was simply "anything that has
identity". By this definition, data itself could be considered a resource
because you must be able to identify it or you can't use it.

By this definition, then, resources are infinitely recursive - a resource
can contain a resource can contain a resource right down to the smallest
identifiable unit of data - a single bit.

If you define a resource this broadly, then what we typically refer to as an
"attribute" - data whose purpose it is to describe another resource (a
person's hair color, a rock's weight, a book's page count) - must itself be
considered a resource.

If that's the case, we need another term that defines "that kind of resource
which is not an attribute". In other words, a term for the thing an
attribute describes - the person, the rock, or the book.

The classic computer science definition of this entity would be an "object".
(Immediately I shudder to think of having to include a definition of
"object" in our glossary.) But this introduces the question, "What is the
precise difference between an object and an attribute?" For example, an
object that stands alone in one context (say, a TelephoneNumber object that
contains the attributes CountryCode, AreaCode, Number, Extension) could be
an attribute in the context of another object (say, a Telephone). In fact,
object-oriented methodologies typically classify attributes into two types:
simple (attributes which do not contain other attributes) and complex
(attributes which themselves contain attributes). The latter is often
referred to as a data object. So are all attributes objects? All objects
attributes?
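
To make the dual role concrete, here is a minimal Python sketch (not a
proposal for our syntax). The class and field names follow the
TelephoneNumber/Telephone example above; the sample values are made up purely
for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TelephoneNumber:
    # Simple attributes: plain values that contain no attributes themselves.
    country_code: str
    area_code: str
    number: str
    extension: Optional[str] = None


@dataclass
class Telephone:
    # Complex attribute: a TelephoneNumber held in the context of a Telephone.
    phone_number: TelephoneNumber


# Hypothetical sample data. The same object stands alone in one context...
standalone = TelephoneNumber("1", "206", "5550100")
# ...and serves as an attribute in the context of another object.
desk_phone = Telephone(phone_number=standalone)
```

Nothing in the TelephoneNumber class itself says whether it is "an object" or
"an attribute"; only the context in which it is used does.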

This grates against common sense because there is a class of data that is
clearly the "endpoint" or "primitive" describing other objects - data values
such as a person's hair color, a rock's weight, a book's page count. This
class of data is widely referred to as an "attribute". At the same time
there is another class of data (or data containers) that are the "entity
described by one or more attributes" - people, rocks, books. These are
widely referred to as "objects". And while an object can be a complex
attribute, it cannot be a simple attribute.

This all boils down to 3 levels (if this is starting to sound like
metaphysics, it's real close): 

#1) Objects that exist independently of any other object, in a global
context (call them independent objects).

#2) Objects that describe other objects, in a specific context (call them
complex attributes), and

#3) Pure attributes which only describe an object in a single context (call
them simple attributes).

Now the question is: are all three resources? With our original definition
of a resource as "anything that has identity", the answer might be yes. But
the more intuitive answer seems to be that "resource" only includes #1 -
objects that exist independently of any other object.

Why? My first argument would be that #2 and #3 don't actually fit the
definition of "anything that has identity" because an attribute - either
complex or simple - exists *only in the context of the object it describes*.
Therefore it doesn't have its own "identity". It exists only as part of the
identity of the object it describes.

Take, for example, a rock that weighs 3 pounds. This rock has identity, if
nothing other than the fact that it is the rock that weighs 3 pounds (out of
a pile of two rocks that weigh 3 and 5 pounds, respectively). But can you
say that "3 pounds" all by itself has identity? The NUMBERS and WORDS have
identity, but the actual value "3 pounds" is not ABOUT anything unless it is
put in the context of the rock it describes.

If you follow this to the logical extreme, it overturns our original
assumption that all data has identity. Rather, only one class of data - that
which comprises an independent object - has identity. The other class of
data - that which only describes an object (an attribute, either complex or
simple) - does not have identity, because you literally can't "identify" it
outside of the context of the object it describes.

The only wrinkle in this definition is the middle case - level #2 - where an
object like a phone number exists as a complex attribute in the context of a
specific resource like a telephone but may also exist independently in a
global context and therefore be also considered a resource. In this case the
same object may, in one context, be an attribute, and in another context, a
resource. But that's okay, because our definition of resource would still
only cover the case where it was an object that exists independently of any
other object, and our definition of attribute would cover the case in which
it was relative to another resource.

Speaking of the term "relative", this analysis matches that of the URI spec
itself. Following is the definition of "fragment" in the spec:

"4.1. Fragment Identifier

	"When a URI reference is used to perform a retrieval action on the
identified resource, the optional fragment identifier, separated from the
URI by a crosshatch ("#") character, consists of additional reference
information to be interpreted by the user agent after the retrieval action
has been successfully completed.  As such, it is not part of a URI, but is
often used in conjunction with a URI.
	"The semantics of a fragment identifier is a property of the data
resulting from a retrieval action, regardless of the type of URI used in the
reference..."

By this it is clear that Berners-Lee et al. meant that a "resource"
optionally contained additional data that was *not part of the URI*, i.e.,
was not the resource itself, but could be identified only in the context of
that resource. This makes sense because a fragment identifier, such as
"#Chapter-3", cannot be resolved outside of the context of the current
resource. To be literal, "#Chapter-3" is an identifier, but only a relative
identifier for an attribute that does not "have identity" by itself.
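
The relative nature of a fragment identifier can be seen with Python's
standard urllib.parse module. This is only an illustrative sketch; the
example.com URIs are placeholders, not real resources.

```python
from urllib.parse import urldefrag, urljoin

# Split a URI reference into the URI proper and its fragment identifier.
base, fragment = urldefrag("http://example.com/moby-dick#Chapter-3")

# A bare fragment reference carries no URI of its own.
bare = urldefrag("#Chapter-3")

# The same fragment only gains meaning relative to a specific resource.
in_moby = urljoin("http://example.com/moby-dick", "#Chapter-3")
in_other = urljoin("http://example.com/other-book", "#Chapter-3")

print(base, fragment)
print(repr(bare.url), bare.fragment)
print(in_moby)
print(in_other)
```

The bare reference yields an empty URI part, and the same "#Chapter-3"
resolves to a different place depending on which resource supplies the
context.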

PROPOSED NEW DEFINITIONS OF RESOURCE AND ATTRIBUTE

As a result of this highly pedantic analysis (did I mention that precisely 7
angels can dance on the head of a pin?), following is the recommended
refinement of our definition of "resource" and the corresponding definition
of "attribute":

RESOURCE: As defined in RFC 2396 (URIs): "anything that has identity". To be
specific, a resource is anything that can be identified independently of
another resource, i.e., in a global context, vs. only in the context of a
specific resource. For example, a car is a resource because it has identity
that exists independent of any other resource. However, the color of a car is
not a resource because it cannot exist independently of the car itself.
Similarly, a document is a resource but "paragraph 3" of the document is not
a resource because it cannot exist independently of the document. Both the
color of a car and paragraph 3 of a document are attributes of a resource.
See ATTRIBUTE. [Continue with the rest of the current definition of
resource, which explains the 3 key types of resources from the standpoint of
identifiers: non-network resources, network resources, and resource
representations.]

ATTRIBUTE: Data associated with a resource that can be identified only in
the context of that resource. For example, "the color of my car" is an
attribute of the resource "car". "Paragraph 3 of Moby Dick" is an attribute
of the resource "Moby Dick". Note that the *concepts* of "color" or
"Paragraph 3" are themselves resources, because the definitions of these
concepts exist in a global context. However, instances of those concepts must
describe a specific resource, and thus can only be attributes.

Note that an attribute may contain other attributes. This is called a
complex attribute, vs. one that doesn't contain other attributes, which is
called a simple attribute. For example, the resource "Telephone" may have
the complex attribute "Phone Number" which in turn contains the simple
attribute "Area Code". An attribute in a specific context may also be a
resource in a global context. For example, "Phone Number" is a complex
attribute of the resource "Telephone" but it may also be a resource that
exists independently of that telephone (for example, it may be reassigned to
a different telephone).
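
As a rough illustration of these two definitions, the Python sketch below
models a global context as a dictionary of resources and identifies
attributes only by a path starting from the resource that carries them. The
identifiers ("telephone-1", "phone-number-1", "telephone-2"), the function
names, and the sample values are all hypothetical; this is one possible
reading of the definitions, not a proposed XRI syntax.

```python
# Global context: identifier -> data for that resource.
resources = {}


def register(resource_id, attributes):
    """Give something identity in the global context (make it a resource)."""
    resources[resource_id] = attributes


def resolve(resource_id, *path):
    """Identify an attribute only in the context of the resource carrying it."""
    value = resources[resource_id]
    for name in path:
        value = value[name]
    return value


def is_complex(value):
    """A complex attribute contains other attributes; a simple one does not."""
    return isinstance(value, dict)


# "Phone Number" as a complex attribute of the resource "Telephone".
register("telephone-1",
         {"phone_number": {"area_code": "206", "number": "5550100"}})

# The same phone number may also be a resource in a global context...
register("phone-number-1", resolve("telephone-1", "phone_number"))

# ...and so can be reassigned to a different telephone.
register("telephone-2", {"phone_number": resolve("phone-number-1")})
```

Note that "area_code" can be reached only through resolve() with a resource
identifier in front of it, which is the sense in which a simple attribute has
no identity of its own.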

*****

As a final note, the reason for the length of this message is that the
fundamental difference between a resource and an attribute is reflected
directly in several of our core requirements, and I expect ultimately in our
syntax. So in the long run I believe a precise definition will save us lots
of time. As a friend of mine is fond of quoting Alfred North Whitehead,
"How many arguments could have been avoided if only the participants had
bothered to define their terms."

Feedback gladly solicited.

=Drummond
Follow-Ups:
- RE: [xri] [Glossary] Definition of "Resource" and "Attribute" (long)
  - From: Bernard Vatant <bernard.vatant@mondeca.com>