RE: [uima] "Assignment Style" vs. "Functional Style"

I think I prefer the conservative approach that you describe as option one (points a, b, and c), for two very different reasons:

First, on the technical side, I think that the pure functional model is more complex to manage in a production system. If you do load balancing, logging, or debugging, you need to be able to track CASes. Also, a source can generate several CASes that are identical from a functional point of view but are nonetheless distinct, because they are part of a sequence and simply carry duplicated content. Identifying an object by its content is not what we want in this case, and we would end up adding explicit IDs to CASes.
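
A minimal sketch of this point, using hypothetical `ContentCas` and `TrackedCas` records that are not part of any UIMA API: when identity is derived from content alone, two items of a sequence that happen to carry the same text collapse into one, whereas an explicit ID keeps them apart.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID generator; a real system would use its own scheme.
_next_id = count(1)


@dataclass(frozen=True)
class ContentCas:
    """Identity is the content itself (pure functional view)."""
    text: str


@dataclass(frozen=True)
class TrackedCas:
    """Same content, plus an explicit ID assigned at creation."""
    text: str
    cas_id: int = field(default_factory=lambda: next(_next_id))


# Two consecutive items from one source, with duplicated content.
a, b = ContentCas("ping"), ContentCas("ping")
x, y = TrackedCas("ping"), TrackedCas("ping")

print(a == b, len({a, b}))  # content-only: indistinguishable
print(x == y, len({x, y}))  # explicit ID: still two distinct items
```

With the content-only record, a log or a load balancer has no way to tell the two items apart; the explicit ID restores that ability at the cost of carrying state outside the content.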

The second reason is more political: the future success of the UIMA standardization effort is strongly linked to the availability of a reference implementation. So, from a practical point of view, we should not walk away from the existing basis or make big philosophical changes to it. I would give higher priority to having a UIMA 1.0 proposal and a reference implementation in sync, not too far from what we have today, than to a more ambitious proposal that may require more time and effort to get everything in sync within the community.

Pascal

From: David Ferrucci [mailto:ferrucci@us.ibm.com]
Sent: Thursday, May 10, 2007 4:04 PM
To: Thilo W Goetz
Cc: Adam Lally; carl.madson@sri.com; j.tsujii@manchester.ac.uk; Sophia.ananiadou@manchester.ac.uk; uima@lists.oasis-open.org
Subject: Re: [uima] "Assignment Style" vs. "Functional Style"

ok -- I guess it should be no surprise that an abstract debate about the merits of procedural, object-oriented, and functional programming can go on for a long time.

So I will try to bring this to a concrete decision for the UIMA TC.

Do we want TWO kinds of abstract interfaces such that (this is pretty much what we have proposed to date):

one

a. is explicit that an input CAS has been updated and
b. the updated CAS represents the same (input) artifact
c. the SOFAs have not been changed in the CAS (it is representing the same logical artifact)

and the second

a. is explicit that a new CAS (a derivative and different artifact) has been created from the input CAS
b. the SOFAs are potentially different and representing a different artifact

If we do NOT want this, then I see at least three possible alternatives:

1. We have an explicit "functional model" -- all interfaces are assumed to return a new CAS which may or may not represent the original artifact. If this determination is required it will be left to the application.

2. We have an explicit "assignment model" where all interfaces are assumed to update the input CAS.

3. We have an agnostic model -- where we are not explicit about what is happening at an abstract level, and simply make lower-level commitments like -- XMI:ids will not be maintained across service interfaces (whatever that may mean or not mean to an application developer).
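
The difference between the "assignment" and "functional" styles can be sketched as follows. This is only an illustration under stated assumptions: the `Cas` class and both function names are hypothetical and do not correspond to any UIMA interface.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Cas:
    """Hypothetical stand-in for a CAS: one SOFA plus a list of annotations."""
    sofa: str
    annotations: list = field(default_factory=list)


def annotate_in_place(cas: Cas, label: str) -> None:
    """Assignment style: the input CAS is updated; same artifact, same SOFA."""
    cas.annotations.append(label)


def annotate_functional(cas: Cas, label: str) -> Cas:
    """Functional style: the input is left untouched and a new CAS is returned."""
    return replace(cas, annotations=cas.annotations + [label])


original = Cas("some document text")
annotate_in_place(original, "Token")

derived = annotate_functional(original, "Sentence")
print(original.annotations)  # input unchanged by the functional call
print(derived.annotations)   # new CAS carries the extra annotation
print(derived is original)   # whether it is "the same artifact" is left to the caller
```

In the functional variant nothing in the return value says whether `derived` still represents the original artifact, which is the determination that alternative 1 leaves to the application.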

------------------------------------------------------------------------
David A. Ferrucci, PhD
Senior Manager, Semantic Analysis & Integration
Chief Architect, UIMA
IBM T.J. Watson Research Center
19 Skyline Drive, Hawthorne, NY 10532
Tel: 914-784-7847, 8/863-7847
ferrucci@us.ibm.com
------------------------------------------------------------------------
http://www.ibm.com/research/uima

Thilo W Goetz <TGOETZ@de.ibm.com>

05/10/2007 09:01 AM

To	Adam Lally/Watson/IBM@IBMUS
cc	carl.madson@sri.com, j.tsujii@manchester.ac.uk, Sophia.ananiadou@manchester.ac.uk, "uima@lists.oasis-open.org" <uima@lists.oasis-open.org>
Subject	Re: [uima] "Assignment Style" vs. "Functional Style"

See inline comments below.

Mit freundlichen Gruessen / Best regards

Thilo Goetz
OmniFind & UIMA development
Information Management Division
IBM Germany
+49-7031-16-1758

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

Adam Lally <alally@us.ibm.com> wrote on 05/09/2007 22:43:50:

> ...
> On the "non-enforceable" point, a service that doesn't reuse any
> input IDs would be considered to have deleted everything in the CAS
> and created a bunch of new stuff. In itself that doesn't make it
> non-compliant with UIMA, but it would need to declare this behavior
> in its metadata. A service which declares that it "modifies
> instances of type Foo" would not be allowed to change the xmi:id's
> on those instances or it would be considered to not comply with its
> own behavioral metadata.

I expect the behavioral metadata to see just as much use as our current input/output
type declarations. For most people it will be too complicated to figure out, and
they'll just use the setting that allows them to do whatever they like.
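
For reference, the rule Adam describes (a service declaring that it "modifies instances of type Foo" must not change the xmi:id's on those instances) could be checked roughly as below. This is a sketch under assumptions: the function name and the dictionary layout mapping a type name to its xmi:id values are hypothetical, not an existing UIMA facility.

```python
def dropped_ids(input_ids_by_type, output_ids_by_type, modified_type):
    """Return the xmi:id values of `modified_type` present in the input
    but missing from the output. A non-empty result means the service
    did not keep the IDs it declared it would only modify."""
    before = set(input_ids_by_type.get(modified_type, ()))
    after = set(output_ids_by_type.get(modified_type, ()))
    return before - after


# A service that keeps the IDs of the Foo instances it modifies.
print(dropped_ids({"Foo": [1, 2]}, {"Foo": [1, 2]}, "Foo"))

# A service that reuses no input IDs: by the rule above, this counts as
# deleting every Foo and creating new ones.
print(dropped_ids({"Foo": [1, 2]}, {"Foo": [7, 8]}, "Foo"))
```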

>
> However, perhaps this allows the service that cannot guarantee a
> procedural behavior to still be UIMA-compliant - it just has to
> declare its behavior appropriately. (Another possibility is that
> such a service is a CAS Multiplier. CAS Multipliers are expected to
> create completely new CASes and so might be a natural fit for this
> kind of service.)
>
> Also I think if we say that there is no notion that an annotation is
> an object, then the TC needs to go back and revisit the earlier
> sections of the whitepaper which explicitly say that the CAS is an
> object graph, and revisit our decisions to use OMG standards which
> are fundamentally object-based.

Indeed. The object analogy is fine as long as it's useful. If and when
we take it too far, we shoot ourselves in the foot. If we can use OMG tools
and standards because they do what we need, that's great. If they make
us change the way we think about CAS data, we need to consider carefully
if the benefit is worth it. You know my opinion. See my "multipleReferencesAllowed"
pet peeve.

>
> Regards,
> -Adam
> _____________________________
> Adam Lally
> Advisory Software Engineer
> UIMA Framework Lead Developer
> IBM T.J. Watson Research Center
> Hawthorne, NY, 10532
> Tel: 914-784-7706, T/L: 863-7706
> alally@us.ibm.com
>
