I think I prefer the conservative approach, the one you describe first
(points a, b, c), for two very different reasons.
First, on the technical side, I think that the pure functional model is more
complex to manage in a production system. If you do load balancing, logging,
or debugging, you need to be able to track CASes. Also, a source can generate
several CASes that are functionally identical but still distinct, because
they are part of a sequence and simply carry duplicated content. Identifying
the object by its content is not what we want in this case, and we will end
up adding explicit IDs to CASes.
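To make the identity point concrete, here is a minimal sketch in Python. The `Cas` class, its `sofa` field, and the `cas_id` counter are hypothetical stand-ins invented for illustration, not part of any UIMA API. It only shows that two CASes with identical content collapse into one under content-based identity, while an explicit ID keeps them apart.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical ID generator; a real system would choose its own scheme.
_next_id = itertools.count(1)


@dataclass(frozen=True)
class Cas:
    """Hypothetical stand-in for a CAS: some content plus an explicit ID."""
    sofa: str
    # compare=False keeps the ID out of == and hash(), so equality is content-only.
    cas_id: int = field(default_factory=lambda: next(_next_id), compare=False)


# A source emits the same content twice as part of a sequence.
first = Cas("heartbeat")
second = Cas("heartbeat")

# Content-based identity cannot tell the two apart ...
by_content = {first, second}

# ... while an explicit ID keeps both trackable for logging or debugging.
by_id = {cas.cas_id: cas for cas in (first, second)}

print(len(by_content), len(by_id))
```

Under content-based identity the second CAS is lost as a duplicate, which is the tracking problem described above.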
The second reason is more political: the future success of UIMA
standardization is strongly linked to the availability of a reference
implementation. So from a practical point of view, we should not walk away
from the existing basis or make big philosophical changes to it. I would give
higher priority to having a UIMA 1.0 proposal and a reference implementation
in sync, not too far from what we have today, than to a more ambitious
proposal that may require more time and effort to get everything in sync
within the community.
Pascal
From: David Ferrucci [mailto:ferrucci@us.ibm.com]
Sent: Thursday, May 10, 2007 4:04 PM
To: Thilo W Goetz
Cc: Adam Lally; carl.madson@sri.com; j.tsujii@manchester.ac.uk;
Sophia.ananiadou@manchester.ac.uk; uima@lists.oasis-open.org
Subject: Re: [uima] "Assignment Style" vs. "Functional Style"
OK -- I guess it should be no surprise that an abstract debate about the
merits of procedural, object and functional programming can go on for a long
time.
So I will try to bring this to a concrete decision for the UIMA TC.
Do we want TWO kinds of abstract interfaces such that (this is pretty much
what we have proposed to date):
one
a. is explicit that an input CAS has been updated and
b. the updated CAS represents the same (input) artifact
c. the SOFAs have not been changed in the CAS (it is representing the same
logical artifact)
and the second
a. is explicit that a new CAS (a derivative and different artifact) has been
created from the input CAS
b. the SOFAs are potentially different and represent a different artifact
If we do NOT want this, then I see at least three possible alternatives:
1. We have an explicit "functional model" -- all interfaces are assumed to
return a new CAS which may or may not represent the original artifact. If
this determination is required it will be left to the application.
2. We have an explicit "assignment model" where all interfaces are assumed to
update the input CAS.
3. We have an agnostic model -- where we are not explicit about what is
happening at an abstract level, and simply make lower-level commitments like
-- XMI:ids will not be maintained across service interfaces (whatever that
may mean or not mean to an application developer).
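As a rough illustration of the two interface kinds discussed above, the following Python sketch contrasts an "assignment style" call with a "functional style" call. The `Cas` class and both function names are assumptions made up for this example. They do not correspond to the UIMA reference implementation, and the annotation tuples are placeholders.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Cas:
    """Hypothetical, heavily simplified CAS: one SOFA string plus annotations."""
    sofa: str
    annotations: list = field(default_factory=list)


def annotate_in_place(cas: Cas) -> None:
    """Assignment style: update the input CAS; the SOFA is left unchanged."""
    cas.annotations.append(("Token", 0, len(cas.sofa)))


def derive_new_cas(cas: Cas) -> Cas:
    """Functional style: return a new CAS and leave the input untouched."""
    result = copy.deepcopy(cas)
    result.sofa = cas.sofa.upper()  # the derived artifact may have a different SOFA
    result.annotations.append(("Derived", 0, len(result.sofa)))
    return result


original = Cas("hello")
annotate_in_place(original)          # same object, same SOFA, one more annotation
derived = derive_new_cas(original)   # distinct object, possibly different SOFA

print(original.sofa, len(original.annotations))
print(derived.sofa, len(derived.annotations))
```

In the first call the caller can keep referring to the same CAS afterwards. In the second, deciding whether the result still represents the original artifact is left to the caller, as alternative 1 suggests.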
------------------------------------------------------------------------
David A. Ferrucci, PhD
Senior Manager, Semantic Analysis & Integration
Chief Architect, UIMA
IBM T.J. Watson Research Center
19 Skyline Drive, Hawthorne, NY 10532
Tel: 914-784-7847, 8/863-7847
ferrucci@us.ibm.com
------------------------------------------------------------------------
http://www.ibm.com/research/uima
Thilo W Goetz <TGOETZ@de.ibm.com>
05/10/2007 09:01 AM
To: Adam Lally/Watson/IBM@IBMUS
cc: carl.madson@sri.com, j.tsujii@manchester.ac.uk,
Sophia.ananiadou@manchester.ac.uk,
"uima@lists.oasis-open.org" <uima@lists.oasis-open.org>
Subject: Re: [uima] "Assignment Style" vs. "Functional Style"
See inline comments below.
Mit freundlichen Gruessen / Best regards
Thilo Goetz
OmniFind & UIMA development
Information Management Division
IBM Germany
+49-7031-16-1758
IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Herbert Kircher
Registered office: Böblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294
Adam Lally <alally@us.ibm.com> wrote on 05/09/2007 22:43:50:
> ...
> On the "non-enforceable" point, a service that doesn't reuse any
> input IDs would be considered to have deleted everything in the CAS
> and created a bunch of new stuff. In itself that doesn't make it
> non-compliant with UIMA, but it would need to declare this behavior
> in its metadata. A service which declares that it "modifies
> instances of type Foo" would not be allowed to change the xmi:id's
> on those instances or it would be considered to not comply with its
> own behavioral metadata.
I expect the behavioral metadata to see just as much use as our current
input/output type declarations. For most people it will be too complicated
to figure out, and they'll just use the setting that allows them to do
whatever they like.
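For reference, the rule Adam describes (a service declaring that it "modifies instances of type Foo" must keep the xmi:ids of those instances) could be checked roughly as in the Python sketch below. The function name and the flattened `xmi:id -> type name` dictionaries are hypothetical simplifications for illustration. The sketch also assumes that "modifies" does not permit deleting instances.

```python
def complies_with_modifies(declared_type: str, before: dict, after: dict) -> bool:
    """Return True if every input instance of declared_type keeps its xmi:id.

    `before` and `after` map xmi:id -> type name. This is a hypothetical
    flattened view of a CAS, not a real XMI structure.
    """
    ids_in = {xmi_id for xmi_id, type_name in before.items() if type_name == declared_type}
    ids_out = {xmi_id for xmi_id, type_name in after.items() if type_name == declared_type}
    # Every Foo that came in must still be present under the same ID.
    return ids_in <= ids_out


cas_in = {1: "Foo", 2: "Bar"}

# A service that keeps IDs and adds a new instance complies.
kept_ids = {1: "Foo", 2: "Bar", 3: "Baz"}

# A service that renumbers everything looks like "delete all, create new".
renumbered = {10: "Foo", 11: "Bar"}

print(complies_with_modifies("Foo", cas_in, kept_ids))
print(complies_with_modifies("Foo", cas_in, renumbered))
```

A service that renumbers its output would, under this reading, have to declare the delete-and-create behavior in its metadata instead.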
>
> However, perhaps this allows the service that cannot guarantee a
> procedural behavior to still be UIMA-compliant - it just has to
> declare its behavior appropriately. (Another possibility is that
> such a service is a CAS Multiplier. CAS Multipliers are expected to
> create completely new CASes and so might be a natural fit for this
> kind of service.)
>
> Also I think if we say that there is no notion that an annotation is
> an object, then the TC needs to go back and revisit the earlier
> sections of the whitepaper which explicitly say that the CAS is an
> object graph, and revisit our decisions to use OMG standards which
> are fundamentally object-based.
Indeed. The object analogy is fine as long as it's useful. If and when we
take it too far, we shoot ourselves in the foot. If we can use OMG tools and
standards because they do what we need, that's great. If they make us change
the way we think about CAS data, we need to consider carefully if the benefit
is worth it. You know my opinion. See my "multipleReferencesAllowed" pet
peeve.
>
> Regards,
> -Adam
> _____________________________
> Adam Lally
> Advisory Software Engineer
> UIMA Framework Lead Developer
> IBM T.J. Watson Research Center
> Hawthorne, NY, 10532
> Tel: 914-784-7706, T/L: 863-7706
> alally@us.ibm.com
>