Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants
Ram,

I don't find this argument convincing. You say: "The superior clearly holds a responsibility in the 2PC protocol to tell the participant about the outcome". If you rephrased that to say: "to tell the participant about the outcome at least once" then that would be true. As phrased, in some recovery scenarios it cannot be true: the knowledge of the outcome has crept into P-side records, and has departed from C's memory.

When C receives Prepared/Replay in the None state it does not know what the outcome was. It has literally forgotten. It cannot communicate the outcome to P. It can only communicate the semantic: "I don't know the outcome". This semantic may be sent in circumstances where the outcome was Commit, and not just when it was Rollback. This flows from the lazy log delete in the PV tables. (This is an important extra element which changes, but strengthens, Peter's arguments in an earlier post.)

The message that represents that semantic should therefore not be called Rollback, it should be called UnknownOutcome, whether volatile or durable. P may be able to deduce the outcome, given that information, and knowledge of the state of the persistent work that it controls. It will be able to guarantee that persistent data resources are correctly modified, even if it is unaware of the outcome.

Let's take the durable case. C can be in state None because the transaction committed or because it rolled back. C gets into None by virtue of receiving Aborted or Committed. Both of those messages are sent by P before the P Forget is initiated, let alone completed. P may send the message and then fail, after sending either Aborted or Committed (see my discussion of potential bug under aegis of Issue 048). It may therefore recover and replay Prepared/send Replay in either case. The knowledge of the final outcome may rest with P, or a combination of P and C, but never with C alone. Sometimes it is never known.
Recall that the PV state machine interacts with an unspecified entity to Initiate Commit Decision, or to Initiate Rollback (I've called that entity the Application Entity, or AE). AE could be a client to a presumed abort DBMS. If the client has forgotten the transaction then the P can assume that the rollback or commit was successfully processed, prior to P failure. This could be established with a query if the client supports enquiry of known transactions. In this case there is no need for P to talk to C, and P, like C, has no idea what the outcome actually was.

If the client cannot be queried, then the DBMS client is either finished, or prepared. P cannot know the outcome, because the transaction may not yet have completed (no Commit or Rollback yet received), so it has to replay Prepared (or send Replay).

There are three answers that C can give (that actually reflect its knowledge): unknown, commit, rollback. C does not know that unknown equals rollback. P asks C what to do, and C says: "I don't know". At that point P attempts rollback, because that is always safe if the outcome is unknown. If the transaction completed then the client will say "transaction unknown" (it either committed or rolled back, and is unknown); or it will respond OK to the rollback. This is why it is reasonable that the action on receipt of Unknown Transaction is Initiate Rollback. This does not make it reasonable for C to send Rollback. So, in this case P may become aware of the outcome, but only because of knowledge lodged in the AE (the database).

What happens if the AE is not a presumed-abort database? If it is the top half of an interposed coordinator then we can assume that the state of the AE (the sub-coordinator) is known to it.
If the decision being processed was Commit, then the message Committed will not travel to the superior coordinator until the sub-coordinator has logged the Commit decision (that is the work of the AE in this case -- assuming that we follow the log-sub-coordinator-commit rule implied by the state tables). If P recovers then it can enquire of the sub-coordinator if it has started commit processing, and will only contact the C if that is not true. Once again, the state of the AE determines the action of P. In this case P does know that "unknown transaction" means that the transaction has rolled back.

If P is working with other types of resource, then we can assume that either they behave similarly to presumed-abort database clients, or that they have logged the decision they are working on as part of their AE work, like the sub-coordinator.

C in the None state, far from knowing the outcome of the transaction, is in fact unable to know the outcome of the transaction even if contacted by P with a Prepared/Replay. P may contact C in that state, both if the outcome was committed and if it was aborted. P may deduce the outcome, but that is not guaranteed (or important). What is guaranteed is that the AE always gets the correct outcome, irrespective of failures.

Alastair

Ram Jeyaraman wrote:

Mark,

As I pointed out earlier, a superior coordinator holds a responsibility to provide correct information about the outcome. Stating "I don't know" and letting the participant make a unilateral conclusion about the outcome is not a desirable solution. This leads to varied interpretations and may cause inconsistent information to be reported to higher layers.

The fact that a participant is requesting replay means that the participant does not know what to do. Replying 'I don't know' does not really help the participant. The superior clearly holds a responsibility in the 2PC protocol to tell the participant about the outcome.
It is very important not to take away that simple, yet powerful, assumption.

-----Original Message-----
From: Mark Little [mailto:mark.little@jboss.com]
Sent: Saturday, May 13, 2006 1:57 AM
To: Ram Jeyaraman
Cc: Peter Furniss; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

As you can see from previous emails, I think we shouldn't mandate that EPRs for Volatile or Durable participants are different, but push the intelligence back to the participant, which already knows what type it is (and hence how to interpret the message). This does not preclude implementations from having different EPRs per protocol, but it is not required. Yes, we'll need to introduce an UnknownTransaction fault, but I think we need that anyway.

Mark.

Ram Jeyaraman wrote:

Peter,

I am sure you wouldn't disagree: A superior coordinator holds a responsibility to provide correct information about the outcome. Stating "I don't know" and letting the subordinate make a unilateral conclusion about the outcome is not a desirable situation. This gives room for varied interpretations of "I don't know", which is not desirable. As transaction middleware providers we should attempt to provide the best information possible, and it is critical that we build that assumption into the protocol.

For durable participants, I strongly support Option A, which you had proposed earlier. In the case of volatile participants, I agree that a more meaningful fault, such as "UnknownTransaction" or perhaps "InDoubt", is better than "InvalidState".
Thank you.

------------------------------------------------------------------------
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Wednesday, May 10, 2006 11:52 PM
*To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

Ram,

It isn't a question of a more meaningful message, but of which side is required to distinguish what the now-broken relationship was:

a) the side that now knows nothing of the relationship and therefore has to juggle its own addressing at implementation time so it can infer the answer

b) the side that does know of the relationship and needs to behave differently

Peter

------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 11 May 2006 03:34
*To:* Peter Furniss; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

Peter,

It is desirable to send a participant a more meaningful message. To that end, option (a) you have described below seems quite reasonable.

------------------------------------------------------------------------
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Saturday, May 06, 2006 4:02 AM
*To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

But there is no need for a coordinator in None to distinguish.

In any state other than None, the coordinator of course knows which protocol is in use, because it has an established relationship (and a state-referencing endpoint) from the Register. So considerations there (i.e. the other state transitions) are irrelevant to this matter.

In state None, the coordinator has no such relationship.
It doesn't even really have a state-referencing endpoint - there's just some catcher for incoming messages that target endpoints that don't exist (e.g. the despatcher triggers its else/default actions). It doesn't care. It isn't going to change state. It isn't going to do anything about the message except reply to it.

The only requirement on our protocol is that the participant (which obviously does know what it registered for, and does care) is not misled. And, as explained in the issue, a participant in Prepared that is told the coordinator is in state None does different things depending on whether it was durable or volatile.

There are two solutions:

a) we require that there be something in, associated with, enveloping or otherwise marking the Prepared message such that the coordinator can determine what kind of participant sent it, and send back different signals. We don't need to mandate a particular means, but there must be something on the wire that distinguishes a Prepared from volatile from a Prepared from durable

b) a coordinator in state None sends back a reply to Prepared from any kind of participant, the reply distinguishably telling the participant the coordinator is in None, and the participant interprets this appropriately dependent on what kind it is.

Replying "UnknownTransaction", regardless of protocol, simplifies the protocol.
Coordinator implementations don't need to do anything special.

Peter

------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 06 May 2006 03:07
*To:* ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

This issue poses a valid question: How to handle the receipt of a 'Prepared' message while the coordinator state machine is in 'None' state.

But in that question is the answer: The state transitions imply that an implementation should distinguish between volatile and durable participants; for example, as outlined in this issue description. That is, an implementation is expected to achieve this in an implementation specific way. Mandating a specific mechanism is unnecessary since this can be achieved via a suitable mechanism that works best for an implementation.

Further, we certainly want implementations to distinguish between volatile and durable participants, so that messages from durable participants can be meaningfully handled.

------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* Tuesday, March 28, 2006 10:22 AM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

This is identified as WS-TX issue 039.
Please ensure follow-ups have a subject line starting "Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants".

------------------------------------------------------------------------
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Monday, March 27, 2006 1:27 PM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] WS-AT: Coordinator should not distinguish protocol of orphaned participants

Issue name -- WS-AT: Coordinator should not distinguish protocol of orphaned participants

PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL THE ISSUE IS ASSIGNED A NUMBER. The issues coordinators will notify the list when that has occurred.

Target document and draft:
Protocol: WS-AT Volatile 2PC
Artifact: spec
Draft: AT spec cd 1
Link to the document referenced: http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
Section and PDF line number: section 10, lines 503/505 tables rows Prepared, Replay, column None

Issue type: Design

Related issues: New issue: WS-AT: Invalid state inappropriate response to orphaned volatile participants

Issue Description:
The requirement that a coordinator with no current knowledge of a participant can determine which protocol it previously registered with from an incoming Prepared or Replay message imposes unnecessary requirements on implementations. The desired functionality can be achieved with a common response to all such messages from "orphans", and letting the participant interpret that response.
Since a returned fault does not show in the state tables, there is currently no behaviour specified for the volatile participant that sent the Replay or Prepared (this will become a separate issue if the main one is rejected, but it gets subsumed if this is fixed).

Issue Details

Background:
The None state for WS-AT means that there is no record of the transaction - either because the transaction completed (rolled back or committed) or the coordinator crashed (and thus implicitly rolled back). Among correct implementations, Prepared and Replay messages received in the None state can only come from participants that were registered prior to the completion or crash, but did not receive a Commit or Rollback message.

For a durable participant, this cannot happen if the transaction committed - the coordinator cannot forget the transaction until it has received Committed from the durable participant. Thus if the Prepared or Replay are received from a durable participant in the None state, it can be inferred with certainty that the transaction rolled back (by intent or crash).

For volatile participants however, it is legitimate for the coordinator to delete its commit log record after receiving Committed from all durable participants, without waiting for any response from volatile participants. If there is a crash at this point (or just some lost messages), Prepared or Replay can be received from a volatile participant and find the coordinator in state "None".

The issue:
The specification currently requires that the coordinator distinguish between the two kinds of registration. This in turn requires that there is something about the Prepared or Replay message that allows the coordinator to determine the protocol of the forgotten registration (ipso facto, it has no specific, local information). Since there is nothing in the Prepared or Replay messages as such, this can only be something carried in or implied from the addressing for the coordinator.
Although this can of course be done, and for some implementations may be entirely natural (e.g. if they have twinned-but-distinct multi-lateral coordinators for volatile and durable), for others it will be unnatural. There seems no reason for imposing this peculiarity (other perhaps than the unstated aesthetic that the coordinator has the driving seat in this protocol).

If the coordinator gave the same response to Prepared/Replay from either protocol, it would be up to the participant to determine how it should treat this response. Since the participant certainly knows how it registered, there would be no ambiguity.

(This issue has been distinguished from the question of using Invalid State in the volatile case - it would be even worse to use InvalidState for both protocols.)

There is currently no defined behaviour for the volatile participant that sends Replay or Prepared (since fault behaviour is not defined).

Proposed resolution

Define new message UnknownTransactionOutcome.

Amend state table for Coordinator to send the new message when Prepared or Replay are received in state None.

Amend state table for Participant with row for inbound "UnknownTransactionOutcome":
   this is invalid (shouldn't happen) in all states except PreparedSuccess
   in PreparedSuccess:
      if participant is Durable, action is Initiate Rollback and Forget, transition to Aborting
      if participant is Volatile, action is Forget, transition to new state UnknownOutcome

UnknownOutcome state has transition to None on All Forgotten event. All other Inbound Events are ignored, without transition.
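
The proposed resolution can be read as a small pair of state-table fragments. What follows is only a minimal sketch in Python, assuming a much simplified participant with four states. The message, state and action names are taken from the proposal; everything else (`Kind`, `PState`, `Participant`, `coordinator_reply_in_none` and the `actions` list) is a hypothetical name introduced for illustration and does not come from the WS-AT specification.

```python
from enum import Enum, auto


class Kind(Enum):
    """How the participant registered (known only to the participant)."""
    DURABLE = auto()
    VOLATILE = auto()


class PState(Enum):
    """Simplified subset of participant states used in this sketch."""
    PREPARED_SUCCESS = auto()
    ABORTING = auto()
    UNKNOWN_OUTCOME = auto()  # new state from the proposed resolution
    NONE = auto()


UNKNOWN_TRANSACTION_OUTCOME = "UnknownTransactionOutcome"


def coordinator_reply_in_none(message: str) -> str:
    """Coordinator in state None: same reply to any orphan, whatever its protocol."""
    if message in ("Prepared", "Replay"):
        return UNKNOWN_TRANSACTION_OUTCOME
    raise ValueError(f"sketch only covers Prepared/Replay, got {message!r}")


class Participant:
    def __init__(self, kind: Kind) -> None:
        self.kind = kind
        self.state = PState.PREPARED_SUCCESS
        self.actions: list[str] = []  # hypothetical record of actions taken

    def on_unknown_transaction_outcome(self) -> None:
        """Participant-side row for inbound UnknownTransactionOutcome."""
        if self.state is not PState.PREPARED_SUCCESS:
            # Invalid (shouldn't happen) in every other state.
            raise RuntimeError(f"invalid in state {self.state.name}")
        if self.kind is Kind.DURABLE:
            self.actions += ["Initiate Rollback", "Forget"]
            self.state = PState.ABORTING
        else:
            self.actions.append("Forget")
            self.state = PState.UNKNOWN_OUTCOME

    def on_all_forgotten(self) -> None:
        """UnknownOutcome moves to None on All Forgotten; ignored elsewhere here."""
        if self.state is PState.UNKNOWN_OUTCOME:
            self.state = PState.NONE


if __name__ == "__main__":
    for kind in Kind:
        p = Participant(kind)
        reply = coordinator_reply_in_none("Prepared")
        p.on_unknown_transaction_outcome()
        print(kind.name, reply, p.state.name, p.actions)
```

The point of the sketch is that `coordinator_reply_in_none` never looks at the participant's kind: the only branch on Durable versus Volatile sits on the participant side, which is the side that knows how it registered.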