
bt-models message


Subject: two-phase protocol

Following on from the current position that all participants within a BT must offer a two-phase interface, let's assume the methods are prepareToComplete, complete and undo.
I'd suggest something like this rather than prepare/commit/rollback, which are overloaded and could imply an implementation. As we mentioned yesterday, we may want the prepareToComplete method to be parameterised, at least with a time.
Return values? OK, Fail, Ignore from prepareToComplete (cf. VoteCommit, VoteRollback, VoteReadOnly in the OTS). Since heuristics are inevitable, complete and undo will potentially need to return some indication that one has occurred. However, I think that the underlying notion of a contract violation is a more useful basis on which to build heuristics, since contract violation may well come into other aspects of a BT, e.g., security, audit trails, etc. A heuristic is simply a violation of the strict two-phase protocol. So, we could have a base error/exception of ContractViolation, subtyped by potential heuristic outcomes.
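To make the shape of this concrete, here is a minimal sketch of the interface and error hierarchy described above. It is illustrative only: the snake_case method names are a Python rendering of prepareToComplete/complete/undo, the timeout_seconds parameter is one guess at the time parameterisation, and the two heuristic subtypes (HeuristicUndo, HeuristicComplete) are hypothetical names, not something agreed on the list.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Vote(Enum):
    """Possible replies to prepareToComplete."""
    OK = "ok"          # cf. VoteCommit in the OTS
    FAIL = "fail"      # cf. VoteRollback
    IGNORE = "ignore"  # cf. VoteReadOnly


class ContractViolation(Exception):
    """Base error for any breach of the agreed contract."""


class HeuristicUndo(ContractViolation):
    """Hypothetical subtype: the participant undid after voting OK."""


class HeuristicComplete(ContractViolation):
    """Hypothetical subtype: the participant completed although told to undo."""


class Participant(ABC):
    """Two-phase interface every participant would offer (assumed shape)."""

    @abstractmethod
    def prepare_to_complete(self, timeout_seconds: float) -> Vote:
        """Phase one: vote on whether the work can be completed."""

    @abstractmethod
    def complete(self) -> None:
        """Phase two: make the work final; may raise ContractViolation."""

    @abstractmethod
    def undo(self) -> None:
        """Phase two: discard the work; may raise ContractViolation."""
```

Because heuristics are subtypes, a conductor that only cares whether the contract held can catch ContractViolation, while one that needs the detail can catch the specific subtype.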
However, what if there is only a single participant in the cohesion when it completes? Do we allow one-phase optimisations? Possibly, and then we'd need a onePhaseComplete. Rather this than allow complete to be called without a preceding prepareToComplete.
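A rough sketch of how a conductor might choose between the two paths follows. The RecordingParticipant class and the terminate function are hypothetical, votes are plain strings to keep the example short, and it assumes that a participant voting Fail has already undone itself and that one voting Ignore drops out of the second phase.

```python
class RecordingParticipant:
    """Hypothetical participant that just records which calls it received."""

    def __init__(self, vote="OK"):
        self.vote = vote
        self.calls = []

    def prepare_to_complete(self):
        self.calls.append("prepareToComplete")
        return self.vote

    def complete(self):
        self.calls.append("complete")

    def undo(self):
        self.calls.append("undo")

    def one_phase_complete(self):
        self.calls.append("onePhaseComplete")


def terminate(participants):
    """Complete a cohesion, using the one-phase path for a lone participant."""
    if len(participants) == 1:
        # complete is never called without a preceding prepareToComplete;
        # the optimisation gets its own explicit method instead.
        participants[0].one_phase_complete()
        return "completed"

    votes = [(p, p.prepare_to_complete()) for p in participants]
    if any(vote == "FAIL" for _, vote in votes):
        for p, vote in votes:
            if vote == "OK":  # FAIL voters are assumed to have undone already
                p.undo()
        return "undone"

    for p, vote in votes:
        if vote == "OK":      # IGNORE voters take no part in phase two
            p.complete()
    return "completed"
```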
Then there's the issue of recovery. What does a participant do to ensure that it can be driven in the event of a coordinator failure? What does a coordinator do to ensure it can drive participants in the event it fails and in the event a participant fails?
Is there an equivalent of a presumed abort protocol for business transactions, i.e., as long as a participant has not received a prepareToComplete it can unilaterally undo and know that that will be the outcome of the cohesion? I think this has a place in the loosely coupled BT world, and would like to know what others think.
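One way to picture the presumed-abort idea is as a small state machine on the participant side. This is a sketch under assumptions: the class name, the state names and the string votes are all made up for illustration.

```python
class PresumedUndoParticipant:
    """Hypothetical participant: active -> prepared -> completed | undone."""

    def __init__(self):
        self.state = "active"

    def prepare_to_complete(self):
        # A participant that has already undone simply votes FAIL, which is
        # how the conductor learns of the unilateral decision.
        if self.state != "active":
            return "FAIL"
        self.state = "prepared"
        return "OK"

    def unilateral_undo(self):
        """Undo without being told to; only allowed before prepareToComplete."""
        if self.state == "active":
            self.state = "undone"
            return True
        return False  # once prepared, the outcome belongs to the conductor
```

The point of the sketch is the asymmetry: before prepareToComplete the participant may decide alone, and afterwards it may not.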
Given the above, the coordinator (conductor in last night's teleconference) must at least:
(i) remember (i.e., make persistent) the participants
(ii) remember any associated leases (when a participant enlists it may specify a time period within which it must be told to prepareToComplete or it will unilaterally undo)
(iii) remember the termination decision in the case of complete. As in a transaction system, once the completion protocol begins the conductor can remove any participant who responded OK to the commit request, eventually removing the entry entirely from durable storage.
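The three conductor duties above could be sketched as a small durable log. Everything here is an assumption made for illustration: the ConductorLog name, the JSON file layout, and the choice of one file per cohesion are not part of any proposal so far.

```python
import json
import os


class ConductorLog:
    """Hypothetical durable record for one cohesion, stored as a JSON file."""

    def __init__(self, path):
        self.path = path
        self.record = {"participants": {}, "decision": None}

    def _flush(self):
        # Write to a temporary file and rename, so a crash never leaves a
        # half-written record behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.record, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def enlist(self, participant_id, lease_seconds=None):
        """(i) and (ii): remember the participant and any lease it asked for."""
        self.record["participants"][participant_id] = {
            "lease_seconds": lease_seconds
        }
        self._flush()

    def decide_complete(self):
        """(iii): remember the decision before telling anyone to complete."""
        self.record["decision"] = "complete"
        self._flush()

    def acknowledged(self, participant_id):
        """Forget a participant that has confirmed; drop the log when empty."""
        del self.record["participants"][participant_id]
        if self.record["participants"]:
            self._flush()
        elif os.path.exists(self.path):
            os.remove(self.path)
```

No undo decision is logged here, in keeping with a presumed-abort style: the absence of a record is taken to mean undo.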
The participant needs to (durably) remember:
(i) that it was asked to prepareToComplete and hasn't received a complete/undo message.
(ii) any lease it may have set for itself so that it can then unilaterally undo
(iii) shadow state changes/locks associated with the current cohesion id.
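As a sketch of why the participant needs those three items, here is what a restarted participant might do with one such record. The field names (prepared, lease_expires_at, shadow_state) and the three return values are hypothetical, and times are plain numbers on whatever clock the participant uses.

```python
def recover(record, now):
    """Decide what a restarted participant should do with one cohesion record.

    record is a dict with the assumed keys "cohesion_id", "prepared",
    "lease_expires_at" (or None) and "shadow_state".
    """
    if record["prepared"]:
        # In doubt: it voted and has heard neither complete nor undo, so it
        # must keep its shadow state and locks until the conductor calls.
        return "await-outcome"
    lease = record.get("lease_expires_at")
    if lease is not None and now >= lease:
        # Never asked to prepare within the lease: free to undo on its own.
        return "undo"
    return "continue"
```

For example, a record with prepared set to False and a lease that expired at time 100 yields "undo" when recovery runs at time 150.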
Some outstanding issues I haven't had a chance to think through yet:
(a) what if the participant goes away after complete but before the response gets back to the conductor (e.g., the conductor crashes)? The conductor will keep trying to complete the participant but no knowledge of it may exist at the web services site.
(b) how long should a conductor keep trying to complete a cohesion for, especially if no lease times have been specified? Should there be a default (vendor conductor specific) timeout value, after which no cohesion completion (or undo) will be attempted and someone else will have to figure things out?
(c) what happens if the undo operation of a prepared participant continues to fail?
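For (b) and (c), one possible shape is a bounded retry loop that hands the problem to someone else once a default timeout passes. This is only a sketch of the option being asked about, not an answer to it: the function name, the default values, the use of ConnectionError to signal a failed delivery, and the "needs-manual-resolution" result are all assumptions.

```python
import time


def drive_to_outcome(send_outcome, give_up_after=60.0, retry_interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Keep delivering complete (or undo) until it succeeds or time runs out.

    send_outcome is a callable that raises ConnectionError when the
    participant cannot be reached. clock and sleep are injectable so the
    loop can be exercised without real waiting.
    """
    deadline = clock() + give_up_after
    while True:
        try:
            send_outcome()
            return "delivered"
        except ConnectionError:
            if clock() + retry_interval > deadline:
                # Default timeout reached: stop and leave it to an operator.
                return "needs-manual-resolution"
            sleep(retry_interval)
```

A lease supplied by the participant could replace give_up_after where one exists, leaving the vendor default only for participants that specified none.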
All the best,
Dr. Mark Little (mark@arjuna.com)
Transactions Architect, HP Arjuna Labs
Phone +44 191 2064538
Fax   +44 191 2064203
