OASIS Mailing List Archives

 



ws-tx message



Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors


Alastair,

When a participant that has successfully prepared sends a Replay (after
a crash), it genuinely wants to know what the outcome is, so that it can
complete the in-doubt transaction during its recovery.

-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com] 
Sent: Wednesday, May 10, 2006 4:15 AM
To: Mark Little
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors

Ram, Mark --

I've got quite a few thoughts on this, but I want to check with the TC
on a couple of premises, in case I am misunderstanding some unwritten
piece of design intent.

1. The text at l.221 of the spec defines the Replay message thus:

"Upon receipt of this notification, the coordinator may assume the 
participant has suffered a recoverable failure. It should resend the 
last appropriate protocol notification."

Does a Replay message for a Participant that crashed in the Prepared
Success state and then recovered carry the semantic:

a) "Have recovered, am in good state to proceed, i.e. am still
prepared", or
b) "Have recovered, was prepared, but am now aborting", or
c) "Have recovered, and may be prepared successfully, or may be 
aborting", or
d) some other semantic, that I haven't thought of?

2. Is a Participant which crashed in the Prepared Success state, has
recovered from a failure and is still prepared (i.e. is in the same
state as it was prior to crash recovery) allowed to re-send Prepared? Or
better, can its decision to do so damage the consistency of the
transaction outcome, or slow down arriving at the outcome decision?

Alastair


Mark Little wrote:
>
>
> Peter Furniss wrote:
>> I think it is likely the state table is being misinterpreted. I'm not
>> sure by who :-)
>>
>> If you treat the state as referring to just one participant, you either
>> get some very convoluted definitions of the internal events (c.f. issue
>> 048 - but more convoluted than the ones proposed there) or you violate
>> atomicity.
>>   
>
> We already agreed prior to the last f2f (in telecons) and at the last
> f2f (during the meeting) that the state table is not referring to just
> one participant.
>
>> Receiving a 'Prepared' message doesn't move the state to
>> PreparedSuccess - that's done by "Commit Decision", and until then
>> 'Replay' would cause an abort. You could define "Commit Decision" as
>> meaning "receipt of ok vote for just this one participant", and take
>> the state for this participant to PreparedSuccess. But the only way to
>> leave PreparedSuccess is from "WriteDone" or "WriteFailed". Since an
>> 'Aborted' from another participant should certainly cause this
>> participant to be rolled back, that 'Aborted' will have to trigger
>> "WriteFailed", which is not an obvious interpretation.
>>
>>
>> But I think this issue, with 053 (eliminate Replay) is more about
>> whether Replay need ever force an abort. We may be looking at a
>> carry-over from connection-centric protocols, where it made sense to
>> force an abort if the connection broke before commit-time. In those
>> worlds (more or less all transaction protocols that weren't using xml
>> and/or web-services, I think), receipt of a recovery message before
>> the connection was observed to break could only mean the connection
>> break was about to happen. But with WS-AT (especially because we have
>> said all messages go on the underlying request) there is no connection
>> to be monitored anyway. The coordinator hasn't noticed that the
>> participant was out of communication for a while, and now the
>> participant says it is ready for the commit. Why *require* the
>> coordinator to abort?
>>   
>
> Agreed.
>
>> Of course that's not to say the coordinator cannot *choose* to abort
>> by implementation option if replay is received (or any other
>> circumstance that leads the coordinator to suspect a failure
>> somewhere). It can always do that if it hasn't progressed too far - it
>> would appear in the tables as a User Rollback or Write Failed.
>
> Yes, I'd like to see this as an implementation specific choice.
>
> Mark.
>
>>
>> Peter
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 06 May 2006 02:09
>> To: ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors
>>
>> Section 10 (AT specification) states "These tables present the view of
>> a coordinator or participant with respect to a single partner".  Thus,
>> the coordinator states correspond to interactions with a single
>> participant.
>>
>> The receipt of a participant vote "PreparedSuccess" triggers the
>> coordinator state to "PreparedSuccess" with respect to that
particular
>> participant, even though the coordinator may not have completed the
>> prepare phase for the rest of the participants.
>>
>> Is it possible that the state table is being misinterpreted?
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: Thursday, April 06, 2006 10:50 AM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol
>> errors
>>
>> This is identified as WS-TX issue 052.
>>
>> Please ensure follow-ups have a subject line starting "Issue 052 -
>> WS-AT: Replay message generates protocol errors".
>>
>> -----Original Message-----
>> From: Alastair Green [mailto:alastair.green@choreology.com]
>> Sent: Wednesday, April 05, 2006 5:07 PM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] New Issue: WS-AT: Replay message generates protocol
>> errors
>>
>> Issue name -- WS-AT: Replay message generates protocol errors
>>
>> PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
>> THE ISSUE IS ASSIGNED A NUMBER.
>>
>> The issues coordinators will notify the list when that has occurred.
>>
>> Target document and draft:
>>
>> Protocol:  WS-AT
>>
>> Artifact:  spec
>>
>> Draft:
>>
>> WS-AT CD 0.1 uploaded
>>
>> Link to the document referenced:
>>
>>
>> http://www.oasis-open.org/apps/org/workgroup/ws-tx/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
>>
>> Section and PDF line number:
>>
>> Coordinator View State Table, after l. 503
>>
>>
>> Issue type:
>>
>> Design
>>
>>
>> Related issues:
>>
>> New issue: WS-AT: Eliminate Replay message.
>> New issue: WS-AT: Is logging mandatory?
>>
>>
>> Issue Description:
>>
>> Replay reactions defined in current CV state table will cause
>> unnecessary transaction aborts.
>>  
>>
>> Issue Details:
>>
>> The cells in row (Inbound Messages) Replay, columns (States) Active 
>> and Preparing read:
>>
>> Active: Send Rollback --> Aborting
>> Preparing: Send Rollback --> Aborting
>>
>> Replay message means: "play it again Sam", not "demolish the piano".
>>
>> Case A. If the last thing they sent was Prepared, and it got through 
>> (we're Preparing and we've recorded their vote), and they've 
>> recovered, and they're waiting for a Commit or a Rollback, then we 
>> need to Ignore the Replay (just like if they send it when we've done 
>> our own housekeeping, and moved to Prepared Success).
>>
>> Case B. If the message didn't get through, and we've processed User
>> Commit, then we could be in the Preparing state, but have no record of
>> their vote. In that case we'd have to replay Prepare to tell them:
>> send us your vote again.
>>
>> Case C. If the last thing we received was Register, and we haven't
>> processed User Commit, then we're still Active and they won't have
>> logged. Replay won't happen on crash recovery (no log record to
>> recover off), but it could be used to say to the coordinator "Are you
>> still there? Should I crap out?" (i.e., because of impatience). We
>> can't stop them using Replay in that fashion. Our only sensible
>> response would have to be: silence (we don't have a blank ack to a
>> ping), i.e. to Ignore. There is no harm in them doing this, even
>> though it is pointless. You could argue that this should be a N/A but
>> that seems heavy-handed.
>>
>>
>> Proposed Resolution:
>>
>> As the state tables do not differentiate between Preparing/no vote 
>> recorded and Preparing/vote recorded, it seems easiest to always 
>> resend Prepare in the Preparing state. Therefore:
>>
>> Replace the current text in the cells in row (Inbound Messages) 
>> Replay, columns (States) Active and Preparing with:
>>
>> Active: Ignore --> Active
>> Preparing: Resend Prepare --> Preparing
>>
>>
>>   
>
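
Editorial illustration (not part of the original thread): the two Replay
cells discussed in the quoted issue can be written down as a small lookup
table. This is only a sketch in Python, assuming nothing beyond the cell
contents quoted above. The names State, CURRENT_CD01, PROPOSED and
on_replay are hypothetical and do not come from the WS-AT specification,
and only the Active and Preparing columns of the Replay row are modelled.

```python
from enum import Enum


class State(Enum):
    """Coordinator-view states mentioned in the issue (hypothetical subset)."""
    ACTIVE = "Active"
    PREPARING = "Preparing"
    ABORTING = "Aborting"


# Replay row as quoted from the CD 0.1 coordinator-view table:
# state -> (action, next state).
CURRENT_CD01 = {
    State.ACTIVE: ("Send Rollback", State.ABORTING),
    State.PREPARING: ("Send Rollback", State.ABORTING),
}

# Replay row as given in the issue's proposed resolution.
PROPOSED = {
    State.ACTIVE: ("Ignore", State.ACTIVE),
    State.PREPARING: ("Resend Prepare", State.PREPARING),
}


def on_replay(table, state):
    """Return (action, next_state) for a Replay received in `state`."""
    return table[state]


if __name__ == "__main__":
    # Print both versions of each cell side by side.
    for state in (State.ACTIVE, State.PREPARING):
        for name, table in (("current", CURRENT_CD01), ("proposed", PROPOSED)):
            action, nxt = on_replay(table, state)
            print(f"{name}: {state.value} + Replay: {action} --> {nxt.value}")
```

Under the quoted CD 0.1 cells, a Replay in either state drives the
coordinator to Aborting. Under the proposed cells the state is unchanged,
and a Replay received in Preparing causes Prepare to be sent again.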

