Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors
From: "Peter Furniss" <peter.furniss@erebor.co.uk>
To: "Mark Little" <mark.little@jboss.com>,"Alastair Green" <alastair.green@choreology.com>
Date: Fri, 12 May 2006 12:12:39 +0100

The question, as I see it, is that the "semantics of Replay", which Mark
said in an earlier message he wanted to leverage, seem to be only that
it sabotages viable transactions in some circumstances (i.e. if received
before the coordinator logs). The only other semantics, as
Alastair is raising, seem to relate to the connection-oriented and RPC
worlds that WS-AT has left behind.


So,
	a) are there other "semantics of Replay" that are relevant to
the send-all-messages-on-outbound-connections pattern of WS-AT ?

	b) why is it desirable to abort transactions when all the
parties are able to commit and have informed the coordinator that they
are able to commit ?



Peter
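
[Editorial sketch] The concern in (b) can be made concrete with a toy
coordinator. This is a simplified model of the behaviour as described in
this thread, not a transcription of the WS-AT state tables. The class,
state and method names are hypothetical, and the "log write" is only a
state change.

```python
from enum import Enum


class State(Enum):
    PREPARING = "preparing"    # still collecting votes, nothing logged
    COMMITTING = "committing"  # commit decision logged (assumed durable)
    ABORTING = "aborting"


class Coordinator:
    """Toy coordinator: models only Prepared and Replay from participants."""

    def __init__(self, participants):
        self.state = State.PREPARING
        self.votes = {p: False for p in participants}
        self.sent = []  # (participant, message) pairs, in send order

    def _broadcast(self, message):
        for p in self.votes:
            self.sent.append((p, message))

    def on_prepared(self, p):
        if self.state is State.PREPARING:
            self.votes[p] = True  # a duplicate Prepared changes nothing
            if all(self.votes.values()):
                self.state = State.COMMITTING  # stands in for the log write
                self._broadcast("Commit")
        elif self.state is State.COMMITTING:
            self.sent.append((p, "Commit"))
        else:
            self.sent.append((p, "Rollback"))

    def on_replay(self, p):
        if self.state is State.COMMITTING:
            self.sent.append((p, "Commit"))  # after the log: outcome is resent
        elif self.state is State.PREPARING:
            self.state = State.ABORTING      # before the log: transaction aborts
            self._broadcast("Rollback")
        else:
            self.sent.append((p, "Rollback"))
```

In this model a Replay that arrives while the coordinator is still
collecting votes aborts the transaction even if every participant could
have committed, whereas a resent Prepared in the same state is harmless.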

-----Original Message-----
From: Mark Little [mailto:mark.little@jboss.com] 
Sent: 12 May 2006 11:09
To: Alastair Green
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates
protocol errors



Alastair Green wrote:
> I'm sorry, but I don't get it.
>
> 1. Replay is never sent from the Coordinator to the Participant.

I never said that, did I?
>
> 2. If the Coordinator never receives Prepared, it resends Prepare. If 
> it never gets Committed back, it resends Commit (your scenario). In 
> each case it does so as often as it wants until it gets the *ed back.

Sure. But that's top-down (coordinator driven recovery). I thought what
we were discussing was bottom-up (participant driven recovery). Can you
confirm that is your reading of the original issue too?

>
> 3. If the Participant fails and recovers, it knows that it may not 
> have sent Prepared (it could fail between the log write and the 
> message send), and must communicate the semantic "prepared". A message
> exists that carries exactly that semantic: Prepared.

Or, it could send Replay ;-)?

>
> If the Participant tries to send Prepared (before or after crash 
> recovery) and the message send fails to its knowledge (one 
> interpretation of comms time out), it resends Prepared.

Sure. No argument there: if it knows the Prepared failed to be 
delivered, then it can obviously resend for an implementation-defined 
(potentially infinite) time. It could then periodically keep retrying. 
Or, it could send Replay later.

>
> If the Participant never receives Commit or Rollback (another 
> interpretation of comms time out), it again resends Prepared.

Or Replay.

>
> In other words, the Participant sends and resends Prepared until it 
> gets Commit or Rollback, across all failures and for all time.
>
> 4. The OTS replay_completion is not a precedent. OTS uses RPCs, not 
> one-way messages. This makes retry behaviour more difficult to model. 
> But if we strip that aside, we see that OTS does exactly the opposite 
> of AT: it does not tolerate communications failure if the prepared 
> semantic fails to get through, and it does not cause premature abort 
> after a recoverable failure in the prepared state. In my view, both 
> OTS and AT are wrong: *there is no reason to treat comms failure and 
> crash recovery differently, either in mechanism of retrying or in 
> effect on transaction outcome.*
>
> In OTS we say Vote vote = resource.prepare(), and the Vote enumeration
> tells us whether it's prepared, readonly or rollback. The operation is
> not idempotent -- a communications failure that prevents the vote 
> returning will cause transaction abort. I think this is wrong and 
> arbitrary, i.e. it is a bad precedent and should not be copied. 
> Correctly, AT does not copy this feature, and tolerates this failure 
> (comms time out = resend Prepared).

I think you're definitely misinterpreting my reference to 
replay_completion: I'm talking only about the bottom-up recovery 
scenario, which is exactly the same scenario this issue describes.
>
> If the participant fails in OTS then it can't tell when it failed (did
> it ever return from the prepare operation, i.e. send back the Vote?) 
> So, it has to send a message to say: "I am prepared" 
> (replay_completion), and it will receive a status. It may also get a 
> replay of commit or rollback, as these operations can be duplicated 
> (they are idempotent).
>
> replay_completion is defined as being "a hint to the coordinator" that
> the prepared participant has never received commit or rollback. As a 
> hint it cannot affect the state or the behaviour of the coordinator, 
> other than to stimulate a replay of commit or rollback, speeding 
> things up. Its semantic is: "I am prepared". (The additional semantic 
> "And once I failed" is irrelevant.) Correctly, in OTS replaying the 
> prepared semantic never causes transaction abort, as it wrongly can in
> AT.

I think you're mixing issues, which can only lead to confusion. Let's 
keep this strictly at the issue in hand. It'll make it easier for 
everyone else to follow.
>
> The only reason for the existence of replay_completion as a distinct 
> operation is because you can't return the response/return value of an 
> RPC twice.
>
> If OTS had modelled this using one ways, it would have ended up with 
> two interfaces (simplified, and forgetting my IDL syntax, and changing
> the real names to save looking them up):
>
> interface coordinator
> {
>     // Vote is an enum: Commit = Prepared, Readonly, Rollback
>     void vote (in Vote);
> }
>
> interface resource
> {
>     void prepare();
>     void commit();
>     void rollback();
> }
>
> Our failure scenario would then logically be:
>
> C invokes resource.prepare()
> P invokes coordinator.vote (Vote.Commit)
> P fails
> P invokes coordinator.vote (Vote.Commit)
>
> In AT this appears as
>
> C sends Prepare
> P sends Prepared
> P fails
> P resends Prepared
>
> The separate message replay_completion is an artefact of RPC, not of 
> the requirements of the transaction protocol.
>
> The correct behaviour for AT is to resend Prepared in the face of 
> comms failures, and after crash recovery.

I disagree. The correct behaviour is to send Replay.

Mark.
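
[Editorial sketch] The two positions above differ only in which message
the recovering participant sends. The retry loop itself, "send and
resend until Commit or Rollback arrives", might look like the following.
The function name, the `send`/`receive` callables and the `max_attempts`
bound are hypothetical, added so the example terminates; the thread
itself speaks of retrying "for all time".

```python
def resend_until_outcome(send, receive, message="Prepared", max_attempts=10):
    """Resend `message` until the coordinator answers Commit or Rollback.

    `send` takes the message name; `receive` returns the next reply,
    or None to model a comms time out. Both are supplied by the caller.
    Returns (outcome, attempts), with outcome None if no answer arrived.
    """
    for attempt in range(1, max_attempts + 1):
        send(message)
        reply = receive()
        if reply in ("Commit", "Rollback"):
            return reply, attempt
    return None, max_attempts


# Usage: two time outs, then the coordinator's Commit gets through.
sent = []
replies = iter([None, None, "Commit"])
result = resend_until_outcome(sent.append, lambda: next(replies))
```

Passing message="Replay" instead models the alternative argued for in
the reply above. The loop is the same and only the message sent to the
coordinator differs.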

>
> Alastair
>
> Mark Little wrote:
>>
>>
>> Alastair Green wrote:
>>> Hi Mark,
>>>
>>> Just one point:
>>>
>>> Mark Little wrote:
>>>> Since it crashed in Prepared Success state we should be able to 
>>>> assume that the participant obeyed the rules and made its decision 
>>>> to be able to commit durable. Hence, this Replay message should be 
>>>> interpreted as a), though the semantic of "have recovered" 
>>>> shouldn't exclude the fact that the failure may have been in the 
>>>> network and not the participant service itself (for instance).
>>>>
>>> One might think so, but in fact when the Participant experiences a 
>>> comms time out it Resends Prepared (PV state table).
>>>
>>> Which begs the question: if that works for comm failures, why do we 
>>> do something different for process failures which are recovered?
>>
>> Different type of failure. I interpret the resend of Prepared on 
>> comms failure to be in the case where the sender (the participant) 
>> knows that the original Prepared wasn't delivered. My original 
>> statement above referring to comms failures is more: there was a 
>> network partition after Prepared was successfully delivered and this 
>> partition has been healed. In the meantime, the coordinator 
>> committed, couldn't contact the participant because of the network 
>> partition and so must go into some form of recovery mode. From the 
>> coordinator's perspective, there is no way for it to distinguish 
>> between a network partition and the failure of the machine on which 
>> the participant resides. From the participant's perspective, there is 
>> a difference, though the resolution is the same: it initiates a 
>> Replay message.
>>
>> I just wanted to make sure our definition of failure didn't preclude 
>> partitions.
>>>
>>> The implication of the two events for the Coordinator, as you point 
>>> out, should be identical (we are ensuring that the Prepared Success 
>>> state is communicated to the Coordinator).
>>
>> But these are different scenarios. As a slight (related) aside: the 
>> OTS works fine with replay_completion on the RecoveryCoordinator, so 
>> there is precedent for Replay.
>>
>> Mark.
>>
>>>
>>> Alastair
>>