It looks like the group is the unit of work that takes a business
(trans)action from one stable state to another on success, or leaves the
business (trans)action in the previous (or initial) stable state on failure.
It is concerned only with backward recovery; there is no forward recovery in
the group, just as with an ACID transaction. Although it is implicit that the
operations in the group might be ACID transactions and/or other business
calculations, it is not clear whether
(1) the operations share data
and the structure of the group is visible to operations in and out of the
group,
I'm fairly sure the
operations have to be identified as to which group they are part of, so in
that sense the structure of the group is visible. (they may carry other
identification, but at least the group)
Given
that, whether the data is shared is essentially decided by the
holder of the data (the participant side). Because the participant knows
that the operations in the group will all get the same decision, it is
inherently safe to share: multiple operations that will all-or-none
confirm can build on each other's results, subject to the (local) constraint
that they are implemented such that they do not risk buried updates. (This is
essentially tight- or loose-coupling as described rather approximately in
XA.)
Are you assuming all group
members are on the same machine/domain?
No, definitely
not. However, data-sharing is only imaginable between group members that
do get performed on the same machine/domain (since the data they share is in
only "one place"). From the perspective of the participant that is wondering
whether it should let two operations share data, it doesn't matter who
else has operations that are part of the same group. (It will be affected
by those other participants via the voting and unanimity mechanisms of
the group, but that's different.)
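
A minimal Python sketch of that participant-side decision, under
assumptions not in the thread: the class Record, its methods, and the
string group ids are all hypothetical, and a single in-memory value stands
in for the participant's data. Operations carrying the same group id see
each other's pending writes; anything else sees only the confirmed value.

```python
class Record:
    """Hypothetical data item held by one participant."""

    def __init__(self, value):
        self.committed = value    # last confirmed value
        self.pending = None       # provisional value written inside a group
        self.owner_group = None   # group currently holding provisional work

    def read(self, group_id):
        # Members of the owning group build on each other's results;
        # everyone else sees only the confirmed state.
        if self.owner_group == group_id:
            return self.pending
        return self.committed

    def write(self, group_id, value):
        # Local constraint: refuse work from a different group while
        # provisional work is outstanding, so no update gets buried.
        if self.owner_group not in (None, group_id):
            raise RuntimeError("record is held by another group")
        self.owner_group = group_id
        self.pending = value

    def confirm(self, group_id):
        if self.owner_group == group_id:
            self.committed = self.pending
            self.owner_group = None
            self.pending = None

    def cancel(self, group_id):
        if self.owner_group == group_id:
            self.owner_group = None
            self.pending = None
```

Note that the record never needs to know which other participants hold
operations for the same group; it only compares group ids locally.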
... snippage ...
3) It has a reverse
operation, i.e. a compensation that it registered with the group so that in
case of backward recovery it may be applied. (The compensation/reverse-op
may be a no-op.)
Yes - we had
perceived the compensation operation as being delegated to the
participant side, triggered on receipt of a (BTP) cancel message. This
then makes compensation a particular way of doing rollback (or
database-style rollback a particular way of doing compensation, with the
virtue of no collateral
damage).
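
As a rough illustration of compensation delegated to the participant
side, here is a Python sketch. Everything in it is an assumption made for
the example: the Participant class, the debit operation, and the
confirm/cancel method names are hypothetical and are not BTP message
definitions.

```python
from typing import Callable, Dict, List


class Participant:
    """Hypothetical participant that records one compensation per operation."""

    def __init__(self) -> None:
        self.balance = 100
        # Compensations registered so far, keyed by group id.
        self._compensations: Dict[str, List[Callable[[], None]]] = {}

    def debit(self, group_id: str, amount: int) -> None:
        self.balance -= amount
        # Register the reverse operation; for read-only work this
        # could simply be a no-op.
        self._compensations.setdefault(group_id, []).append(
            lambda: self._credit(amount)
        )

    def _credit(self, amount: int) -> None:
        self.balance += amount

    def confirm(self, group_id: str) -> None:
        # Success: the compensations are no longer needed.
        self._compensations.pop(group_id, None)

    def cancel(self, group_id: str) -> None:
        # Backward recovery: apply compensations in reverse order.
        for undo in reversed(self._compensations.pop(group_id, [])):
            undo()
```

A participant backed by a database could implement cancel as a plain
rollback instead; the caller sending the cancel would not see the
difference.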
So you're
assuming a model whereby failure has to cause some kind of compensation? Or
can we leave that to the application, i.e., a failure generates a "failure"
message, and how that is interpreted at the group/participant level is up to
the model (or participant)? Interestingly, strictly speaking I can do
something similar in a transaction system: a participant receives a rollback
message from the TM and I would hope it does something to undo the work that
was done within the scope of the transaction to guarantee ACID properties.
However, it doesn't actually need to, i.e., I could build an "extended"
transaction model on top of a traditional transaction system by overloading
what it means to get a commit/rollback message.
I think "failure" in this environment doesn't
necessarily imply just transient process/communication failure, as it usually
does in classic TP systems - where there is an expected long duration, one
would imagine the components would persist their active-phase state
in various ways. Given that, I think failure does imply cancellation - if
the participant implements reversal/cancellation by compensation, then it
does.
Yes,
you can overload the commit/rollback - or you can regard the general case
of commit/rollback as being "finally do/finally undo", and the traditional
transaction system (or rather database) mechanisms are a particular way of
implementing that. (e.g. reinstating the before image is a kind of
"compensation" action, for which the locking makes sure that it will work,
without knowing exactly what the application was up to)
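
To make the "finally do/finally undo" reading concrete, a small Python
sketch follows. The Outcome interface and both implementations are
hypothetical names invented for this example, and a plain dict stands in
for the data store; locking is left out, which is exactly what the
before-image variant would need in practice.

```python
from abc import ABC, abstractmethod


class Outcome(ABC):
    """Hypothetical general form of commit/rollback."""

    @abstractmethod
    def finally_do(self) -> None: ...

    @abstractmethod
    def finally_undo(self) -> None: ...


class BeforeImage(Outcome):
    """Database-style: remember the prior state and reinstate it on undo."""

    def __init__(self, store: dict, key: str, new_value) -> None:
        self.store, self.key = store, key
        self.had_key = key in store
        self.before = store.get(key)
        store[key] = new_value

    def finally_do(self) -> None:
        pass  # nothing to do; the new value is already in place

    def finally_undo(self) -> None:
        # Works without knowing what the application did, but only
        # safely if nothing else changed the key in the meantime.
        if self.had_key:
            self.store[self.key] = self.before
        else:
            self.store.pop(self.key, None)


class Compensation(Outcome):
    """Application-style: apply a balancing operation on undo."""

    def __init__(self, store: dict, key: str, delta: int) -> None:
        self.store, self.key, self.delta = store, key, delta
        store[key] = store.get(key, 0) + delta

    def finally_do(self) -> None:
        pass

    def finally_undo(self) -> None:
        # Reverses only this operation's effect, so other updates survive.
        self.store[self.key] -= self.delta
```

Both classes answer the same two messages; the difference is in what
each needs (isolation versus application knowledge) for undo to be
correct.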
2) only allow reversible
operations
This is effectively assumed in what we've said, I
think. If a business process must involve non-reversible operations, then
it must, and it will have to handle the inappropriate performance of them
in some way. But that doesn't seem tractable to protocolising, in the
sense of an inter-party, inter-system protocol. Coping with the
irreversible may involve "compensation" in a general sense, but there will
normally be nothing to communicate that directly links the original and
balancing operations.
Just because an operation is compensatable
doesn't mean that it should be compensated. It may depend upon factors at
runtime. By the time a failure occurs, it may be entirely inappropriate to
compensate. See, for example, the share purchase/selling example in the HP
proposal.
That
kind of means the compensation algorithm is time-dependent - or what was a
reversible operation (within the protocol) has now become
irreversible.
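
A small Python sketch of an operation that starts out reversible and
becomes irreversible with time. This is not taken from the HP proposal;
the SharePurchase class, the reversible_for window, and the returned
strings are all hypothetical, and the clock is passed in explicitly to
keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class SharePurchase:
    """Hypothetical operation whose compensation is only valid for a while."""

    shares: int
    bought_at: float        # time of purchase, in seconds
    reversible_for: float   # assumed window during which reversal is allowed
    reversed: bool = False

    def cancel(self, now: float) -> str:
        if self.reversed:
            return "compensated"      # cancel is idempotent
        if now - self.bought_at <= self.reversible_for:
            self.reversed = True      # still within the protocol's reach
            return "compensated"
        # Too late: the operation has become irreversible as far as
        # the protocol is concerned.
        return "irreversible"
```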
Peter