Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors

Ram,

Absolutely. So if Replay is a pure synonym for a resent Prepared (i.e. carries no additional semantic relevant to the outcome, e.g. does not imply that P is aborting) then there is no reason for C to abort the transaction on receiving the message. Put another way, there is no reason for Replay to induce a different behaviour from a resent Prepared (which would not cause abortion).

If Replay/Preparing is corrected to become identical to Prepared/Preparing, then the unnecessary abort problem goes away (this issue).

If the two rows become identical, then there is no need to have a separate Replay message (it is redundant) -- the related issue.

Alastair

Ram Jeyaraman wrote:

Alastair,

A participant that has successfully prepared when it sends out a replay (after a crash) genuinely wants to know what the outcome is, so it can complete the in-doubt transaction during its recovery.

-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: Wednesday, May 10, 2006 4:15 AM
To: Mark Little
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors

Ram, Mark --

I've got quite a few thoughts on this, but I want to check with the TC on a couple of premises, in case I am misunderstanding some unwritten piece of design intent.

1. The text at l.221 of the spec defines the Replay message thus:

"Upon receipt of this notification, the coordinator may assume the participant has suffered a recoverable failure. It should resend the last appropriate protocol notification."

Does a Replay message for a Participant that crashed in the Prepared Success state and then recovered carry the semantic:

a) "Have recovered, am in good state to proceed, i.e. am still prepared", or
b) "Have recovered, was prepared, but am now aborting", or
c) "Have recovered, and may be prepared successfully, or may be aborting", or
d) some other semantic, that I haven't thought of?

2. Is a Participant which crashed in the Prepared Success state, has recovered from a failure, and is still prepared (i.e. is in the same state as it was prior to crash recovery) allowed to re-send Prepared? Or better, can its decision to do so damage the consistency of the transaction outcome, or slow down arriving at the outcome decision?

Alastair

Mark Little wrote:

Peter Furniss wrote:

I think it is likely the state table is being misinterpreted. I'm not sure by who :-)

If you treat the state as referring to just one participant, you either get some very convoluted definitions of the internal events (c.f. issue 048 - but more convoluted than the ones proposed there) or you violate atomicity.

We already agreed prior to the last f2f (in telecons) and at the last f2f (during the meeting) that the state table is not referring to just one participant.

Receiving a 'Prepared' message doesn't move the state to PreparedSuccess - that's done by "Commit Decision", and until then 'Replay' would cause an abort.

You could define "Commit Decision" as meaning "receipt of ok vote for just this one participant", and take the state for this participant to PreparedSuccess. But the only way to leave PreparedSuccess is from "WriteDone" or "WriteFailed". Since an 'Aborted' from another participant should certainly cause this participant to be rolled back, that 'Aborted' will have to trigger "WriteFailed", which is not an obvious interpretation.

But I think this issue, with 053 (eliminate Replay) is more about whether Replay need ever force an abort. We may be looking at a carry-over from connection-centric protocols, where it made sense to force an abort if the connection broke before commit-time. In those worlds (more or less all transaction protocols that weren't using xml and/or web-services, I think), receipt of a recovery message before the connection was observed to break could only mean the connection break was about to happen.
But with WS-AT (especially because we have said all messages go on the underlying request) there is no connection to be monitored anyway. The coordinator hasn't noticed that participant was out of communication for a while, and now the participant says it is ready for the commit. Why *require* the coordinator to abort?

Agreed.

Of course that's not to say the coordinator cannot *choose* to abort by implementation option if replay is received (or any other circumstance that leads the coordinator to suspect a failure somewhere). It can always do that if it hasn't progressed too far - it would appear in the tables as a User Rollback or Write Failed.

Yes, I'd like to see this as an implementation specific choice.

Mark.

Peter

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: 06 May 2006 02:09
To: ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors

Section 10 (AT specification) states "These tables present the view of a coordinator or participant with respect to a single partner". Thus, the coordinator states correspond to interactions with a single participant.

The receipt of a participant vote "PreparedSuccess" triggers the coordinator state to "PreparedSuccess" with respect to that particular participant, even though the coordinator may not have completed the prepare phase for the rest of the participants.

Is it possible that the state table is likely being misinterpreted?

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: Thursday, April 06, 2006 10:50 AM
To: ws-tx@lists.oasis-open.org
Subject: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors

This is identified as WS-TX issue 052.

Please ensure follow-ups have a subject line starting "Issue 052 - WS-AT: Replay message generates protocol errors".

-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: Wednesday, April 05, 2006 5:07 PM
To: ws-tx@lists.oasis-open.org
Subject: [ws-tx] New Issue: WS-AT: Replay message generates protocol errors

Issue name -- WS-AT: Replay message generates protocol errors

PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL THE ISSUE IS ASSIGNED A NUMBER. The issues coordinators will notify the list when that has occurred.

Target document and draft:

Protocol: WS-AT
Artifact: spec
Draft: WS-AT CD 0.1 uploaded
Link to the document referenced: http://www.oasis-open.org/apps/org/workgroup/ws-tx/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
Section and PDF line number: Coordinator View State Table, after l. 503

Issue type: Design

Related issues:
New issue: WS-AT: Eliminate Replay message.
New issue: WS-AT: Is logging mandatory?

Issue Description: Replay reactions defined in current CV state table will cause unnecessary transaction aborts.

Issue Details:

The cells in row (Inbound Messages) Replay, columns (States) Active and Preparing read:

Active: Send Rollback --> Aborting
Preparing: Send Rollback --> Aborting

Replay message means: "play it again Sam", not "demolish the piano".

Case A. If the last thing they sent was Prepared, and it got through (we're Preparing and we've recorded their vote), and they've recovered, and they're waiting for a Commit or a Rollback, then we need to Ignore the Replay (just like if they send it when we've done our own housekeeping, and moved to Prepared Success).

Case B. If the message didn't get through, and we've processed User Commit then we could be in the Preparing state, but have no record of their vote. In that case we'd have to replay Prepare to indicate to them, send us your vote again.

Case C. If the last thing we received was Register, and we haven't processed User Commit, then we're still Active and they won't have logged.
Replay won't happen on crash recovery (no log record to recover off), but it could be used to say to the coordinator "Are you still there? Should I crap out?" (i.e., because of impatience). We can't stop them using Replay in that fashion. Our only sensible response would have to be: silence (we don't have a blank ack to a ping), i.e. to Ignore. There is no harm in them doing this, even though it is pointless. You could argue that this should be a N/A but that seems heavy-handed.

Proposed Resolution:

As the state tables do not differentiate between Preparing/no vote recorded and Preparing/vote recorded, it seems easiest to always resend Prepare in the Preparing state.

Therefore: Replace the current text in the cells in row (Inbound Messages) Replay, columns (States) Active and Preparing with:

Active: Ignore --> Active
Preparing: Resend Prepare --> Preparing
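
The difference between the current and proposed Replay cells can be sketched as two small lookup tables. This is only an illustrative model, not code from the specification or from any implementation: the names State, CURRENT_REPLAY_ROW, PROPOSED_REPLAY_ROW and on_replay are hypothetical, and only the Active and Preparing columns quoted above are modelled.

```python
from enum import Enum


class State(Enum):
    """Coordinator-view states mentioned in the Replay cells above."""
    ACTIVE = "Active"
    PREPARING = "Preparing"
    ABORTING = "Aborting"


# Each entry maps a coordinator state to (action, next state) on inbound Replay.
# Cells as quoted from the current CV state table.
CURRENT_REPLAY_ROW = {
    State.ACTIVE: ("Send Rollback", State.ABORTING),
    State.PREPARING: ("Send Rollback", State.ABORTING),
}

# Cells as given in the proposed resolution.
PROPOSED_REPLAY_ROW = {
    State.ACTIVE: ("Ignore", State.ACTIVE),
    State.PREPARING: ("Resend Prepare", State.PREPARING),
}


def on_replay(row, state):
    """Return (action, next_state) for an inbound Replay in the given state."""
    return row[state]


if __name__ == "__main__":
    for state in (State.ACTIVE, State.PREPARING):
        old_action, old_next = on_replay(CURRENT_REPLAY_ROW, state)
        new_action, new_next = on_replay(PROPOSED_REPLAY_ROW, state)
        print(f"{state.value}: current = {old_action} --> {old_next.value}; "
              f"proposed = {new_action} --> {new_next.value}")
```

Under the proposed row a Replay never changes the coordinator's state, which is the point of the resolution: the transaction is no longer forced into Aborting merely because a participant asked for the last message again.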