uima message
Subject: UIMA Abstract Interface Styles - Constant IDs or Non-constant IDs
- From: Adam Lally <alally@us.ibm.com>
- To: <uima@lists.oasis-open.org>, <carl.madson@sri.com>, <j.tsujii@manchester.ac.uk>, <Sophia.ananiadou@manchester.ac.uk>
- Date: Thu, 17 May 2007 09:54:20 -0400
Hi everyone,
It seems to me that there are three
basic styles of interface we could have to a UIMA Analytic Service:
1) CAS-in, delta-out
The service doesn't respond with a CAS,
but instead a set of instructions for updating the input CAS. So
if an input CAS contained two Person objects with xmi:ids 1 and 2,
such as:
<xmi:XMI>
  <example:Person xmi:id="1" firstName="Joe"/>
  <example:Person xmi:id="2" firstName="Jane"/>
</xmi:XMI>
The service might respond with an instruction
(syntax TBD) saying: "Update the object with xmi:id 1 by setting firstName
= 'Bob'".
2) CAS-in, CAS-out, with constant IDs
The service responds with an entire
CAS. If objects in both the input and output CASes have the same
xmi:id, they are considered to be the same object. Objects in the
output CAS that have new IDs are considered new objects. So in the
above example the service would respond with a CAS containing something
like:
<xmi:XMI>
  <example:Person xmi:id="1" firstName="Bob"/>
  <example:Person xmi:id="2" firstName="Jane"/>
</xmi:XMI>
3) CAS-in, CAS-out, without constant IDs
The service responds with an entire
CAS, but the xmi:ids in this CAS bear no relation to the ids in the input
CAS, so in the simple example the following would be an acceptable response:
<xmi:XMI>
  <example:Person xmi:id="1" firstName="Jane"/>
  <example:Person xmi:id="2" firstName="Bob"/>
</xmi:XMI>
I believe that 1 and 2 are functionally
equivalent. In case #1 you could apply the deltas to the original
input CAS and end up with the same CAS that the service would have
returned in case #2. Likewise, it is easy to compare the input and
output CASes from case #2 and recover the delta. However, in case
#3 there is no unique delta (there could have been one modification,
or two, or perhaps a delete and a create).
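To make the equivalence of 1 and 2 concrete, here is a minimal Python sketch. It assumes a CAS can be modeled as a dict mapping each xmi:id to a dict of features. The function names and the tuple form of the operations are hypothetical, since the delta syntax is still TBD.

```python
# Hypothetical model: a CAS is a dict of xmi:id -> {feature name: value}.
# The operation tuples below are illustrative only; the real delta syntax is TBD.

def compute_delta(cas_in, cas_out):
    """Recover the operations relating two CASes that share constant IDs."""
    ops = []
    for oid, feats in cas_out.items():
        if oid not in cas_in:
            # An ID that only appears in the output is a new object.
            ops.append(("create", oid, dict(feats)))
            continue
        old = cas_in[oid]
        for name, value in feats.items():
            if name not in old or old[name] != value:
                ops.append(("set", oid, name, value))
        for name in old:
            if name not in feats:
                ops.append(("unset", oid, name))
    for oid in cas_in:
        if oid not in cas_out:
            ops.append(("delete", oid))
    return ops


def apply_delta(cas_in, ops):
    """Apply a delta to a copy of the input CAS and return the result."""
    cas = {oid: dict(feats) for oid, feats in cas_in.items()}
    for op in ops:
        kind, oid = op[0], op[1]
        if kind == "create":
            cas[oid] = dict(op[2])
        elif kind == "set":
            cas[oid][op[2]] = op[3]
        elif kind == "unset":
            del cas[oid][op[2]]
        elif kind == "delete":
            del cas[oid]
    return cas


cas_in = {"1": {"firstName": "Joe"}, "2": {"firstName": "Jane"}}
style2_out = {"1": {"firstName": "Bob"}, "2": {"firstName": "Jane"}}

delta = compute_delta(cas_in, style2_out)
print(delta)  # [('set', '1', 'firstName', 'Bob')]
print(apply_delta(cas_in, delta) == style2_out)  # True
```

Going from the style 2 response to the delta and back again loses nothing, which is the sense in which the two styles carry the same information.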
I realize that in this very simple example
it may not seem to matter whether you can compute a unique delta. However,
with more realistic, large CASes, it becomes much more difficult to
recover what operations a particular service performed on the CAS
if you have neither deltas nor constant IDs.
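As a rough illustration of what is left when the IDs carry no meaning, the sketch below compares two CASes by content alone. It uses the same hypothetical dict model (xmi:id mapped to a dict of features), and content_diff is an illustrative name, not an existing UIMA function.

```python
from collections import Counter

# Hypothetical model: a CAS is a dict of xmi:id -> {feature name: value}.

def content_diff(cas_in, cas_out):
    """Compare two CASes by object content only, ignoring the xmi:id values.

    Returns (vanished, appeared): contents present only in the input,
    and contents present only in the output.
    """
    def content_key(feats):
        return tuple(sorted(feats.items()))

    before = Counter(content_key(f) for f in cas_in.values())
    after = Counter(content_key(f) for f in cas_out.values())
    vanished = sorted((before - after).elements())
    appeared = sorted((after - before).elements())
    return vanished, appeared


cas_in = {"1": {"firstName": "Joe"}, "2": {"firstName": "Jane"}}
style3_out = {"1": {"firstName": "Jane"}, "2": {"firstName": "Bob"}}

vanished, appeared = content_diff(cas_in, style3_out)
print(vanished)  # [(('firstName', 'Joe'),)]
print(appeared)  # [(('firstName', 'Bob'),)]
```

The comparison can report that a "Joe" object is gone and a "Bob" object is present, but it cannot say whether Joe was renamed to Bob or Joe was deleted and Bob created. With many similar objects in a large CAS, the number of candidate explanations grows quickly.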
There are at least two use cases where
it is useful to know what operations a service performed on a CAS. One
is parallel processing, where we'd like to invoke multiple services on identical
copies of a CAS and then merge the results. Another is debugging.
It is very useful to be able to compare the CAS before and after
a service call, and discover what that service has changed in the CAS.
This is significantly easier to do with deltas or constant IDs than
with non-constant IDs.
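For the parallel processing case, a merge could look something like the sketch below. It assumes each service's changes are available as a delta against the same input CAS, either returned directly or recovered via constant IDs, and it only handles feature updates. The tuple format, the function names, and the lastName feature are all hypothetical.

```python
# Hypothetical model: a CAS is a dict of xmi:id -> {feature name: value},
# and a delta is a list of ("set", xmi_id, feature, value) tuples.

def merge_deltas(*deltas):
    """Merge per-service deltas that were computed against the same input CAS.

    Raises ValueError if two services set the same feature of the same
    object to different values, or if an operation is not a "set".
    """
    merged = {}
    for delta in deltas:
        for kind, oid, feature, value in delta:
            if kind != "set":
                raise ValueError(f"unsupported operation: {kind}")
            slot = (oid, feature)
            if slot in merged and merged[slot] != value:
                raise ValueError(f"conflicting updates to {slot}")
            merged[slot] = value
    return [("set", oid, feature, value)
            for (oid, feature), value in merged.items()]


def apply_sets(cas_in, delta):
    """Apply "set" operations to a copy of the input CAS."""
    cas = {oid: dict(feats) for oid, feats in cas_in.items()}
    for _kind, oid, feature, value in delta:
        cas[oid][feature] = value
    return cas


cas_in = {"1": {"firstName": "Joe"}, "2": {"firstName": "Jane"}}
delta_a = [("set", "1", "firstName", "Bob")]  # from service A
delta_b = [("set", "2", "lastName", "Doe")]   # from service B

merged_cas = apply_sets(cas_in, merge_deltas(delta_a, delta_b))
print(merged_cas)
# {'1': {'firstName': 'Bob'}, '2': {'firstName': 'Jane', 'lastName': 'Doe'}}
```

Creates and deletes are left out on purpose. Two services creating objects in parallel could pick the same new xmi:id, so a real merge would need some policy for allocating IDs.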
-Adam
P.S. I investigated the possible
analogy of a CAS as an "object database", and I found this in
the manual for the Versant object database (http://www.versant.com/developer/resources/objectdatabase/documentation/database_fund_man.pdf):
"One of the strongest concepts
in object technology is object identity, because it makes possible such
features as persistent references to other objects and the ability to migrate
objects among distributed databases without having to change code that
accesses the objects. Versant assigns each persistent object a unique
identifier called its logical object identifier or loid. Logical
object identifiers are composed of two parts: a database identifier and
an object identifier."
So one thing this brings up is that
a "CAS ID" may be important for us to have (analogous to a database
identifier). But my main point is about that first phrase - one of
the strongest concepts in object technology is object identity. I
think if UIMA is at all considered "object technology" then we
should have a clear notion of object identity.
_____________________________
Adam Lally
Advisory Software Engineer
UIMA Framework Lead Developer
IBM T.J. Watson Research Center
Hawthorne, NY, 10532
Tel: 914-784-7706, T/L: 863-7706
alally@us.ibm.com