Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors
I'm sorry, but I don't get it.

1. Replay is never sent from the Coordinator to the Participant.

2. If the Coordinator never receives Prepared, it resends Prepare. If it never gets Committed back, it resends Commit (your scenario). In each case it does so as often as it wants until it gets the corresponding *ed message (Prepared or Committed) back.

3. If the Participant fails and recovers, it knows that it may not have sent Prepared (it could fail between the log write and the message send), and must communicate the semantic "prepared". A message exists that carries exactly that semantic: Prepared. If the Participant tries to send Prepared (before or after crash recovery) and the message send fails to its knowledge (one interpretation of comms time out), it resends Prepared. If the Participant never receives Commit or Rollback (another interpretation of comms time out), it again resends Prepared. In other words, the Participant sends and resends Prepared until it gets Commit or Rollback, across all failures and for all time.

4. The OTS replay_completion is not a precedent. OTS uses RPCs, not one-way messages. This makes retry behaviour more difficult to model. But if we set that aside, we see that OTS does exactly the opposite of AT: it does not tolerate communications failure if the prepared semantic fails to get through, and it does not cause premature abort after a recoverable failure in the prepared state. In my view, both OTS and AT are wrong: there is no reason to treat comms failure and crash recovery differently, either in the mechanism of retrying or in the effect on transaction outcome.

In OTS we say Vote vote = resource.prepare(), and the Vote enumeration tells us whether it's prepared, readonly or rollback. The operation is not idempotent -- a communications failure that prevents the vote returning will cause transaction abort. I think this is wrong and arbitrary, i.e. it is a bad precedent and should not be copied.
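To make the contrast concrete, here is a minimal sketch of the two behaviours under a lost vote. It is an illustration only, not taken from either specification: LossyChannel, rpc_style_prepare, one_way_prepared and the max_attempts bound are hypothetical names invented for the example, and it assumes a single participant, so that a delivered Prepared is enough for the coordinator to decide Commit.

```python
from enum import Enum


class Outcome(Enum):
    COMMIT = "Commit"
    ROLLBACK = "Rollback"


class LossyChannel:
    """Hypothetical channel that loses the first `drops` messages sent on it."""

    def __init__(self, drops):
        self.drops = drops
        self.delivered = []

    def send(self, message):
        # Returns True only if the message reached the coordinator.
        # In the sketch this stands in for "an outcome eventually came back".
        if self.drops > 0:
            self.drops -= 1
            return False
        self.delivered.append(message)
        return True


def rpc_style_prepare(channel):
    """RPC-style vote: the vote is the return value of a single call.

    If that one reply is lost, the coordinator never sees the vote and
    the transaction is aborted.
    """
    vote_arrived = channel.send("Prepared")
    return Outcome.COMMIT if vote_arrived else Outcome.ROLLBACK


def one_way_prepared(channel, max_attempts=10):
    """One-way style: resend Prepared until an outcome is known.

    The bound exists only so the sketch terminates; the rule argued for
    in this message is to keep resending for as long as it takes.
    """
    for attempt in range(1, max_attempts + 1):
        if channel.send("Prepared"):
            return Outcome.COMMIT, attempt
    return None, max_attempts  # still in doubt, still prepared
```

With two lost messages, rpc_style_prepare(LossyChannel(2)) ends in Rollback, while one_way_prepared(LossyChannel(2)) reaches Commit on the third send. The same loop would serve after crash recovery, since the recovered participant only needs to know that it is prepared and has no outcome yet.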
Correctly, AT does not copy this non-idempotent behaviour, and tolerates the failure (comms time out = resend Prepared).

If the participant fails in OTS, then it can't tell when it failed (did it ever return from the prepare operation, i.e. send back the Vote?). So it has to send a message to say "I am prepared" (replay_completion), and it will receive a status. It may also get a replay of commit or rollback, as these operations can be duplicated (they are idempotent).

replay_completion is defined as being "a hint to the coordinator" that the prepared participant has never received commit or rollback. As a hint it cannot affect the state or the behaviour of the coordinator, other than to stimulate a replay of commit or rollback, speeding things up. Its semantic is: "I am prepared". (The additional semantic "And once I failed" is irrelevant.) Correctly, in OTS replaying the prepared semantic never causes transaction abort, as it wrongly can in AT.

The only reason for the existence of replay_completion as a distinct operation is that you can't return the response/return value of an RPC twice. If OTS had modelled this using one-way messages, it would have ended up with two interfaces (simplified, and forgetting my IDL syntax, and changing the real names to save looking them up):

    interface coordinator {
        void vote (in Vote); // Vote is an enum: Commit = Prepared, Readonly, Rollback
    }

    interface resource {
        void prepare();
        void commit();
        void rollback();
    }

Our failure scenario would then logically be:

    C invokes resource.prepare()
    P invokes coordinator.vote (Vote.Commit)
    P fails
    P invokes coordinator.vote (Vote.Commit)

In AT this appears as:

    C sends Prepare
    P sends Prepared
    P fails
    P resends Prepared

The separate message replay_completion is an artefact of RPC, not of the requirements of the transaction protocol. The correct behaviour for AT is to resend Prepared in the face of comms failures, and after crash recovery.

Alastair

Mark Little wrote: