ws-tx message

Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors
From: Mark Little <mark.little@jboss.com>
To: Peter Furniss <peter.furniss@erebor.co.uk>
Date: Fri, 12 May 2006 12:36:03 +0100
Just to reiterate something I said in the first response to Alastair's 
email "to the TC" earlier in the week: I don't think it is desirable to 
abort transactions in situation b) below. It's interesting to note that 
it is "only" the state table that indicates rollback will occur in that 
situation: all other text in the specification is silent (and in fact 
tends to push the reader in the opposite direction of the state table).

Mark.


Peter Furniss wrote:
> The question as I see it is that the "semantics of Replay", which Mark
> said he wanted to leverage in an earlier message, seem to be only that
> it sabotages viable transactions in some circumstances (i.e. if received
> before the coordinator logs). The only other semantics, as
> Alastair is raising, seem to relate to connection-oriented and rpc
> worlds that WS-AT has left behind. 
>
>
> So,
> 	a) are there other "semantics of Replay" that are relevant to
> the send-all-messages-on-outbound-connections pattern of WS-AT ?
>
> 	b) why is it desirable to abort transactions when all the
> parties are able to commit and have informed the coordinator that they
> are able to commit ?
>
>
>
> Peter
>
> -----Original Message-----
> From: Mark Little [mailto:mark.little@jboss.com] 
> Sent: 12 May 2006 11:09
> To: Alastair Green
> Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
> Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates
> protocol errors
>
>
>
> Alastair Green wrote:
>   
>> I'm sorry, but I don't get it.
>>
>> 1. Replay is never sent from the Coordinator to the Participant.
>>     
>
> I never said that, did I?
>   
>> 2. If the Coordinator never receives Prepared, it resends Prepare. If 
>> it never gets Committed back, it resends Commit (your scenario). In 
>> each case it does so as often as it wants until it gets the *ed back.
>>     
>
> Sure. But that's top-down (coordinator driven recovery). I thought what
> we were discussing was bottom-up (participant driven recovery). Can you
> confirm that is your reading of the original issue too?
>
>   
>> 3. If the Participant fails and recovers, it knows that it may not 
>> have sent Prepared (it could fail between the log write and the 
>> message send), and must communicate the semantic "prepared". A message
>> exists that carries exactly that semantic: Prepared.
>>     
>
> Or, it could send Replay ;-)?
>
>   
>> If the Participant tries to send Prepared (before or after crash 
>> recovery) and the message send fails to its knowledge (one 
>> interpretation of comms time out), it resends Prepared.
>>     
>
> Sure. No argument there: if it knows the Prepared failed to be 
> delivered, then it can obviously resend for an implementation-defined 
> (potentially infinite) time. It could then periodically keep retrying. 
> Or, it could send Replay later.
>
>   
>> If the Participant never receives Commit or Rollback (another 
>> interpretation of comms time out), it again resends Prepared.
>>     
>
> Or Replay.
>
>   
>> In other words, the Participant sends and resends Prepared until it 
>> gets Commit or Rollback, across all failures and for all time.
>>
>> 4. The OTS replay_completion is not a precedent. OTS uses RPCs, not 
>> one-way messages. This makes retry behaviour more difficult to model. 
>> But if we strip that aside, we see that OTS does exactly the opposite 
>> of AT: it does not tolerate communications failure if the prepared 
>> semantic fails to get through, and it does not cause premature abort 
>> after a recoverable failure in the prepared state. In my view, both 
>> OTS and AT are wrong: *there is no reason to treat comms failure and 
>> crash recovery differently, either in mechanism of retrying or in 
>> effect on transaction outcome.*
>>
>> In OTS we say Vote vote = resource.prepare(), and the Vote enumeration
>> tells us whether it's prepared, readonly or rollback. The operation is
>> not idempotent -- a communications failure that prevents the vote 
>> returning will cause transaction abort. I think this is wrong and 
>> arbitrary, i.e. it is a bad precedent and should not be copied. 
>> Correctly, AT does not copy this feature, and tolerates this failure 
>> (comms time out = resend Prepared).
>>     
>
> I think you're definitely misinterpreting my reference to 
> replay_completion: I'm talking only about the bottom-up recovery 
> scenario, which is exactly the same scenario this issue describes.
>   
>> If the participant fails in OTS then it can't tell when it failed (did
>> it ever return from the prepare operation, i.e. send back the Vote?) 
>> So, it has to send a message to say: "I am prepared" 
>> (replay_completion), and it will receive a status. It may also get a 
>> replay of commit or rollback, as these operations can be duplicated 
>> (they are idempotent).
>>
>> replay_completion is defined as being "a hint to the coordinator" that
>> the prepared participant has never received commit or rollback. As a 
>> hint it cannot affect the state or the behaviour of the coordinator, 
>> other than to stimulate a replay of commit or rollback, speeding 
>> things up. Its semantic is: "I am prepared". (The additional semantic 
>> "And once I failed" is irrelevant.) Correctly, in OTS replaying the 
>> prepared semantic never causes transaction abort, as it wrongly can in
>> AT.
>
> I think you're mixing issues, which can only lead to confusion. Let's 
> keep this strictly at the issue in hand. It'll make it easier for 
> everyone else to follow.
>   
>> The only reason for the existence of replay_completion as a distinct 
>> operation is because you can't return the response/return value of an 
>> RPC twice.
>>
>> If OTS had modelled this using one ways, it would have ended up with 
>> two interfaces (simplified, and forgetting my IDL syntax, and changing
>> the real names to save looking them up):
>>
>> interface coordinator
>> {
>>     // Vote is an enum: Commit = Prepared, Readonly, Rollback
>>     void vote (in Vote);
>> }
>>
>> interface resource
>> {
>>     void prepare();
>>     void commit();
>>     void rollback();
>> }
>>
>> Our failure scenario would then logically be:
>>
>> C invokes resource.prepare()
>> P invokes coordinator.vote (Vote.Commit)
>> P fails
>> P invokes coordinator.vote (Vote.Commit)
>>
>> In AT this appears as
>>
>> C sends Prepare
>> P sends Prepared
>> P fails
>> P resends Prepared
>>
>>  The separate message replay_completion is an artefact of RPC, not of 
>> the requirements of the transaction protocol.
>>
>> The correct behaviour for AT is to resend Prepared in the face of 
>> comms failures, and after crash recovery.
>>     
>
> I disagree. The correct behaviour is to send Replay.
>
> Mark.
>
>   
>> Alastair
>>
>> Mark Little wrote:
>>     
>>> Alastair Green wrote:
>>>       
>>>> Hi Mark,
>>>>
>>>> Just one point:
>>>>
>>>> Mark Little wrote:
>>>>         
>>>>> Since it crashed in Prepared Success state we should be able to 
>>>>> assume that the participant obeyed the rules and made its decision 
>>>>> to be able to commit durable. Hence, this Replay message should be 
>>>>> interpreted as a), though the semantic of "have recovered" 
>>>>> shouldn't exclude the fact that the failure may have been in the 
>>>>> network and not the participant service itself (for instance).
>>>>>
>>>>>           
>>>> One might think so, but in fact when the Participant experiences a 
>>>> comms time out it Resends Prepared (PV state table).
>>>>
>>>> Which begs the question: if that works for comm failures, why do we 
>>>> do something different for process failures which are recovered?
>>>>         
>>> Different type of failure. I interpret the resend of Prepared on 
>>> comms failure to be in the case where the sender (the participant) 
>>> knows that the original Prepared wasn't delivered. My original 
>>> statement above referring to comms failures is more: there was a 
>>> network partition after Prepared was successfully delivered and this 
>>> partition has been healed. In the meantime, the coordinator 
>>> committed, couldn't contact the participant because of the network 
>>> partition and so must go into some form of recovery mode. From the 
>>> coordinator's perspective, there is no way for it to distinguish 
>>> between a network partition and the failure of the machine on which 
>>> the participant resides. From the participant's perspective, there is 
>>> a difference, though the resolution is the same: it initiates a 
>>> Replay message.
>>>
>>> I just wanted to make sure our definition of failure didn't preclude 
>>> partitions.
>>>       
>>>> The implication of the two events for the Coordinator, as you point 
>>>> out, should be identical (we are ensuring that the Prepared Success 
>>>> state is communicated to the Coordinator).
>>>>         
>>> But these are different scenarios. As a slight (related) aside: the 
>>> OTS works fine with replay_completion on the RecoveryCoordinator, so 
>>> there is precedent for Replay.
>>>
>>> Mark.
>>>
>>>       
>>>> Alastair
>>>>         
>
>
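
The disagreement in this thread can be made concrete with a small sketch.
It is a toy model only: the class ToyCoordinator, the flag
replay_aborts_before_decision, the state names and the helper outcome() are
all hypothetical and are not taken from the WS-AT specification or from any
implementation. With the flag set, a Replay that arrives before the commit
decision is logged aborts the transaction, which is the behaviour the thread
attributes to the state table. With the flag cleared, Replay is handled
exactly like a resent Prepared, which is the behaviour argued for in the
quoted messages.

```python
from enum import Enum


class State(Enum):
    PREPARING = "preparing"    # Prepare sent, still collecting votes
    COMMITTING = "committing"  # commit decision logged
    ABORTING = "aborting"      # rollback decided


class ToyCoordinator:
    """Hypothetical one-transaction coordinator; not the WS-AT state table."""

    def __init__(self, participants, replay_aborts_before_decision):
        self.participants = list(participants)
        self.pending = set(participants)  # participants that have not voted yet
        self.state = State.PREPARING
        self.replay_aborts = replay_aborts_before_decision
        self.sent = []  # (message, participant) pairs in send order

    def on_prepared(self, p):
        if self.state is State.PREPARING:
            self.pending.discard(p)  # a duplicate vote changes nothing
            if not self.pending:
                self.state = State.COMMITTING  # stands in for the log write
                self.sent += [("Commit", q) for q in self.participants]
            return
        self._resend_outcome(p)

    def on_replay(self, p):
        if self.state is State.PREPARING and self.replay_aborts:
            # Reading attributed to the state table in this thread.
            self.state = State.ABORTING
            self.sent += [("Rollback", q) for q in self.participants]
        else:
            # Alternative reading: Replay only means "I am prepared".
            self.on_prepared(p)

    def _resend_outcome(self, p):
        message = "Commit" if self.state is State.COMMITTING else "Rollback"
        self.sent.append((message, p))


def outcome(replay_aborts_before_decision):
    c = ToyCoordinator(["P1", "P2"], replay_aborts_before_decision)
    c.on_prepared("P1")  # P1 votes, then crashes and recovers
    c.on_replay("P1")    # arrives before P2 has voted
    c.on_prepared("P2")  # P2 is also able to commit
    return c.state
```

Calling outcome(True) and outcome(False) runs the same message sequence
under the two readings. Once the decision is logged, both readings behave
the same way in this model: a late Prepared or Replay just triggers a resend
of the outcome.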
References:
- RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors
  - From: "Peter Furniss" <peter.furniss@erebor.co.uk>