OASIS Mailing List Archives

 



ws-tx message



Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors




Alastair Green wrote:
> I'm sorry, but I don't get it.
>
> 1. Replay is never sent from the Coordinator to the Participant.

I never said that, did I?
>
> 2. If the Coordinator never receives Prepared, it resends Prepare. If 
> it never gets Committed back, it resends Commit (your scenario). In 
> each case it does so as often as it wants until it gets the *ed back.

Sure. But that's top-down (coordinator driven recovery). I thought what 
we were discussing was bottom-up (participant driven recovery). Can you 
confirm that is your reading of the original issue too?

>
> 3. If the Participant fails and recovers, it knows that it may not 
> have sent Prepared (it could fail between the log write and the 
> message send), and must communicate the semantic "prepared". A message 
> exists that carries exactly that semantic: Prepared.

Or, it could send Replay ;-)?

>
> If the Participant tries to send Prepared (before or after crash 
> recovery) and the message send fails to its knowledge (one 
> interpretation of comms time out), it resends Prepared.

Sure. No argument there: if it knows the Prepared failed to be 
delivered, then it can obviously resend for an implementation-defined 
(potentially infinite) time. It could then periodically keep retrying. 
Or, it could send Replay later.

>
> If the Participant never receives Commit or Rollback (another 
> interpretation of comms time out), it again resends Prepared.

Or Replay.
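
The participant-side retry discussed in points 2 and 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: `resend_until_outcome`, `send`, `receive` and `max_attempts` are hypothetical names, not part of WS-AT, and `receive()` returning `None` stands in for either interpretation of a comms time-out. The `message` parameter covers both positions in this thread (resend Prepared, or send Replay).

```python
import itertools


def resend_until_outcome(send, receive, message="Prepared", max_attempts=None):
    """Send `message` repeatedly until the coordinator answers Commit or Rollback.

    `send(msg)` and `receive()` are hypothetical transport callbacks.
    `receive()` returns "Commit", "Rollback", or None on a comms time-out.
    With max_attempts=None the loop retries without bound.
    """
    if max_attempts is None:
        attempts = itertools.count(1)
    else:
        attempts = range(1, max_attempts + 1)
    for attempt in attempts:
        send(message)
        outcome = receive()
        if outcome in ("Commit", "Rollback"):
            return outcome, attempt
    # Gave up without learning the outcome; the participant stays prepared.
    return None, max_attempts


# Usage: the first two replies are lost, the third carries the outcome.
sent = []
replies = iter([None, None, "Commit"])
result = resend_until_outcome(sent.append, lambda: next(replies))
```

Passing `message="Replay"` gives the alternative argued for in the replies; the loop itself is the same either way, which is the part both sides seem to agree on.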

>
> In other words, the Participant sends and resends Prepared until it 
> gets Commit or Rollback, across all failures and for all time.
>
> 4. The OTS replay_completion is not a precedent. OTS uses RPCs, not 
> one-way messages. This makes retry behaviour more difficult to model. 
> But if we strip that aside, we see that OTS does exactly the opposite 
> of AT: it does not tolerate communications failure if the prepared 
> semantic fails to get through, and it does not cause premature abort 
> after a recoverable failure in the prepared state. In my view, both 
> OTS and AT are wrong: *there is no reason to treat comms failure and 
> crash recovery differently, either in mechanism of retrying or in 
> effect on transaction outcome.*
>
> In OTS we say Vote vote = resource.prepare(), and the Vote enumeration 
> tells us whether it's prepared, readonly or rollback. The operation is 
> not idempotent -- a communications failure that prevents the vote 
> returning will cause transaction abort. I think this is wrong and 
> arbitrary, i.e. it is a bad precedent and should not be copied. 
> Correctly, AT does not copy this feature, and tolerates this failure 
> (comms time out = resend Prepared).

I think you're definitely misinterpreting my reference to 
replay_completion: I'm talking only about the bottom-up recovery 
scenario, which is exactly the same scenario this issue describes.
>
> If the participant fails in OTS then it can't tell when it failed (did 
> it ever return from the prepare operation, i.e. send back the Vote?) 
> So, it has to send a message to say: "I am prepared" 
> (replay_completion), and it will receive a status. It may also get a 
> replay of commit or rollback, as these operations can be duplicated 
> (they are idempotent).
>
> replay_completion is defined as being "a hint to the coordinator" that 
> the prepared participant has never received commit or rollback. As a 
> hint it cannot affect the state or the behaviour of the coordinator, 
> other than to stimulate a replay of commit or rollback, speeding 
> things up. Its semantic is: "I am prepared". (The additional semantic 
> "And once I failed" is irrelevant.) Correctly, in OTS replaying the 
> prepared semantic never causes transaction abort, as it wrongly can in AT.

I think you're mixing issues, which can only lead to confusion. Let's 
keep this strictly at the issue in hand. It'll make it easier for 
everyone else to follow.
>
> The only reason for the existence of replay_completion as a distinct 
> operation is because you can't return the response/return value of an 
> RPC twice.
>
> If OTS had modelled this using one ways, it would have ended up with 
> two interfaces (simplified, and forgetting my IDL syntax, and changing 
> the real names to save looking them up):
>
> interface coordinator
> {
>     // Vote is an enum: Commit = Prepared, Readonly, Rollback
>     void vote (in Vote v);
> }
>
> interface resource
> {
>     void prepare();
>     void commit();
>     void rollback();
> }
>
> Our failure scenario would then logically be:
>
> C invokes resource.prepare()
> P invokes coordinator.vote (Vote.Commit)
> P fails
> P invokes coordinator.vote (Vote.Commit)
>
> In AT this appears as
>
> C sends Prepare
> P sends Prepared
> P fails
> P resends Prepared
>
> The separate message replay_completion is an artefact of RPC, not of 
> the requirements of the transaction protocol.
>
> The correct behaviour for AT is to resend Prepared in the face of 
> comms failures, and after crash recovery.

I disagree. The correct behaviour is to send Replay.
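
For readers following the thread, the quoted failure sequence (C sends Prepare, P sends Prepared, P fails, P contacts C again) can be sketched as a toy simulation. This is not the WS-AT state table: `ToyCoordinator` and `run_scenario` are hypothetical names, and the model simply assumes that the coordinator treats both Prepared and Replay idempotently and re-sends a known outcome. Whether the real AT tables behave that way for Replay is exactly what this issue is about.

```python
class ToyCoordinator:
    """Toy model of a coordinator's view of one participant (an assumption,
    not the WS-AT state tables)."""

    def __init__(self):
        self.prepared = False  # has the "prepared" semantic arrived?
        self.outcome = None    # "Commit" or "Rollback" once decided
        self.outbox = []       # messages sent to the participant

    def send_prepare(self):
        self.outbox.append("Prepare")

    def decide(self, outcome):
        self.outcome = outcome
        self.outbox.append(outcome)

    def on_participant_message(self, msg):
        # Assumption: Prepared and Replay are both handled idempotently
        # and never change the transaction outcome.
        if msg not in ("Prepared", "Replay"):
            raise ValueError(f"unexpected message: {msg}")
        self.prepared = True
        if self.outcome is not None:
            self.outbox.append(self.outcome)  # re-send the known outcome


def run_scenario(recovery_message):
    c = ToyCoordinator()
    c.send_prepare()                            # C sends Prepare
    c.on_participant_message("Prepared")        # P sends Prepared
    c.decide("Commit")                          # C decides; P fails before seeing it
    c.on_participant_message(recovery_message)  # P recovers and contacts C
    return c.outbox
```

Under that assumption, `run_scenario("Prepared")` and `run_scenario("Replay")` produce the same message sequence from the coordinator, so the disagreement is about which message the participant should use, not about what the coordinator should do in response.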

Mark.

>
> Alastair
>
> Mark Little wrote:
>>
>>
>> Alastair Green wrote:
>>> Hi Mark,
>>>
>>> Just one point:
>>>
>>> Mark Little wrote:
>>>> Since it crashed in Prepared Success state we should be able to 
>>>> assume that the participant obeyed the rules and made its decision 
>>>> to be able to commit durable. Hence, this Replay message should be 
>>>> interpreted as a), though the semantic of "have recovered" 
>>>> shouldn't exclude the fact that the failure may have been in the 
>>>> network and not the participant service itself (for instance).
>>>>
>>> One might think so, but in fact when the Participant experiences a 
>>> comms time out it Resends Prepared (PV state table).
>>>
>>> Which begs the question: if that works for comm failures, why do we 
>>> do something different for process failures which are recovered?
>>
>> Different type of failure. I interpret the resend of Prepared on 
>> comms failure to be in the case where the sender (the participant) 
>> knows that the original Prepared wasn't delivered. My original 
>> statement above referring to comms failures is more: there was a 
>> network partition after Prepared was successfully delivered and this 
>> partition has been healed. In the meantime, the coordinator 
>> committed, couldn't contact the participant because of the network 
>> partition and so must go into some form of recovery mode. From the 
>> coordinator's perspective, there is no way for it to distinguish 
>> between a network partition and the failure of the machine on which 
>> the participant resides. From the participant's perspective, there is 
>> a difference, though the resolution is the same: it initiates a 
>> Replay message.
>>
>> I just wanted to make sure our definition of failure didn't preclude 
>> partitions.
>>>
>>> The implication of the two events for the Coordinator, as you point 
>>> out, should be identical (we are ensuring that the Prepared Success 
>>> state is communicated to the Coordinator).
>>
>> But these are different scenarios. As a slight (related) aside: the 
>> OTS works fine with replay_completion on the RecoveryCoordinator, so 
>> there is precedent for Replay.
>>
>> Mark.
>>
>>>
>>> Alastair
>>

