OASIS Mailing List Archives

ws-tx message



Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants


Woa, hold on there. I haven't been able to read emails much this week, 
but scanning this one, the following sticks out like a sore thumb:

> Peter,
>
> Step 5 in your example below, the participant should log the fact that
> it had successfully committed upon arriving at the commit decision, and
>   

We're now saying that participants must remember the fact that they've 
committed? Where's this come from? I know of some participant 
implementations that do this in the real world, but they are usually the 
exception to the rule. Normally before sending the commit, the 
participant "simply" makes the "after image" the "current image" and 
deletes any log it may have held. It only need record information during 
commit (or rollback) if it makes a heuristic choice (which we've ruled 
out of scope for the TC). Let's step back from this one and think again.

There are other comments I could make with respect to your (Ram's) email 
below, but I don't have the time just yet. If no one else gets there in 
the next day or so, I'll try to pull something together.

Mark.

> only then send out a "Committed" message. Consequently, the participant
> that crashes after logging the commit decision, should not send out a
> "Replay" message.
>
> This is a bug in the 2PC PV state table:
>
> "Commit decision/Committing: Send Committed and Forget; Committing"
>
> should be
>
> "Commit decision/Committing: Record Commit; Committing"
> "Write Done/Committing: Send Committed and Forget; Committing".
>
> Thus, a coordinator that had indeed successfully committed the
> transaction and moved to 'None' state, should not, in general, receive a
> 'Prepared' message, since all participants must have been successfully
> informed of the outcome, in order for the transaction to commit.
>
> Alternatively, a coordinator may have aborted the transaction and moved
> to 'None' state because one of its participants voted to abort the
> transaction, or because one of the participants was unreachable. In the
> latter case, it is possible for the coordinator in the 'None' state to
> receive a 'Prepared' message from a previously unreachable participant.
>
> Hence, it is appropriate for a coordinator in "None" state to send
> "Rollback" message to a durable participant, upon receiving a "Prepared"
> or "Replay" message.
>
> -----Original Message-----
> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk] 
> Sent: Friday, May 19, 2006 5:52 AM
> To: Ram Jeyaraman; Mark Little
> Cc: ws-tx@lists.oasis-open.org
> Subject: RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not
> distinguish protocol of orphaned participants
>
> Ram,
>
> I'm afraid your first paragraph doesn't apply to volatile, for reasons
> that Max explained to me at the Jan 2005 interop and I summarised in the
> issue, nor even to durable, for reasons that Alastair pointed out
> yesterday, because the state tables mandate or permit "lazy log
> delete" in commitment.
>
>
> Take the case that applies to durable (since that applies to volatile as
> well anyway). We join the story half-way:
>
> 1 Participant logs prepared (Write Done/Preparing => send Prepared;
> PreparedSuccess)
>
> 2 Coordinator receives Prepared from this and all other participants
> (Commit Decision/Preparing => RecordOutcome; PreparedSuccess) 
>
> 3 Coordinator log write ok (WriteDone/PreparedSuccess => send Commit;
> Committing)
>
> 4 Participant receives Commit (Commit/PreparedSuccess => Initiate Commit
> Decision; Committing)
>
> 5 Participant has successfully applied commit to its resources
> (CommitDecision/Committing => send Committed and Forget; Committing)
>
> 6 Participant crashes.
>
> 7 Coordinator receives Committed (Committed/Committing => Forget;
> Committing) (ditto from all other participants)
>
> 8 Coordinator completes log removal (All Forgotten/Committing => ; None)
>
> 9 Participant recovers. Since the Forget action hadn't completed (log
> unchanged since 1), it will recover in PreparedSuccess. Sends Replay (or
> Prepared, makes no difference) to find out what to do.
>
> 10 Coordinator receives Replay|Prepared (Replay/None => ??? ; None)
> 	At this point, the truth is the transaction committed, but we
> have received Prepared in state None.  To those who fear this is unsafe,
> what happens next is:
>
> 11 Participant receives ??? ( ???/PreparedSuccess => Initiate Rollback
> and Forget; Aborting)
>
> 12 Participant's resources do nothing - they committed earlier AND HAVE
> RECOVERED AS COMMITTED.
>
> 13 Participant completes log removal (All Forgotten/Aborting => ; None)
>
>
> Lazy delete is the action at 5 - some other specs wouldn't allow it, but
> in fact it's safe if (and only if) the statement at 12 can be guaranteed
> as true.  But since it is allowed by this table, the Coordinator does
> not know the true outcome at 10, and cannot have the responsibility of
> telling the Participant. And the Participant doesn't determine the right
> answer at 11 either, but performs operations that are safe.
>
> Peter
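
Steps 1 to 13 of the scenario quoted above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Resource` and `Participant` classes and their method names are hypothetical, the resource is assumed to keep its contents across a crash (the condition stated at step 12), and the coordinator in None is assumed to reply with Rollback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical resource whose contents survive a crash (assumption)."""
    current: str
    after_image: Optional[str] = None

    def commit(self) -> None:
        # Make the "after image" the "current image".
        if self.after_image is not None:
            self.current = self.after_image
            self.after_image = None

    def rollback(self) -> None:
        # Discards a pending after image; a no-op once commit was applied.
        self.after_image = None


@dataclass
class Participant:
    resource: Resource
    log: Optional[str] = None  # durable log record, survives a crash
    state: str = "Active"      # in-memory state, lost in a crash

    def write_prepared(self) -> None:      # step 1
        self.log = "Prepared"
        self.state = "PreparedSuccess"

    def on_commit(self) -> None:           # steps 4-5
        self.state = "Committing"
        self.resource.commit()
        # Lazy log delete: Committed goes out, but the Forget has not
        # completed, so self.log still says "Prepared".

    def crash_and_recover(self) -> None:   # steps 6 and 9
        self.state = "PreparedSuccess" if self.log == "Prepared" else "None"

    def on_rollback(self) -> None:         # steps 11-13
        self.state = "Aborting"
        self.resource.rollback()
        self.log = None                    # All Forgotten
        self.state = "None"


def run_scenario() -> Participant:
    p = Participant(Resource(current="old", after_image="new"))
    p.write_prepared()
    p.on_commit()
    p.crash_and_recover()   # recovers in PreparedSuccess, sends Replay
    p.on_rollback()         # assumed reply from a coordinator in None
    return p


p = run_scenario()
print(p.resource.current, p.state)
```

In this sketch the committed value survives the late Rollback only because `rollback()` finds nothing pending, which is the guarantee step 12 depends on.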
>
>
> -----Original Message-----
> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com] 
> Sent: 19 May 2006 03:38
> To: Peter Furniss; Mark Little
> Cc: ws-tx@lists.oasis-open.org
> Subject: RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not
> distinguish protocol of orphaned participants
>
> Peter,
>
> Let us analyze this.
>
> If a coordinator had indeed successfully committed the transaction and
> moved to 'None' state, it should not, in general receive a 'Prepared'
> message, since all participants must have been successfully informed of
> the outcome, in order for the transaction to commit.
>
> On the other hand, a coordinator could have aborted the transaction and
> moved to 'None' state because one of its participants voted to abort
> the transaction, or because one of the participants misbehaved or was
> perhaps unreachable. In this latter case, it is still possible for the
> coordinator in the 'None' state to receive a 'Prepared' message from a
> previously unreachable participant. Sending a rollback is the right
> thing to do.
>
> -----Original Message-----
> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
> Sent: Thursday, May 18, 2006 1:59 AM
> To: Ram Jeyaraman; Mark Little
> Cc: ws-tx@lists.oasis-open.org
> Subject: RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not
> distinguish protocol of orphaned participants
>
> Ram,
>
> But the superior coordinator does NOT know the answer - that's the whole
> point. It has to deduce the answer from information supplied back to it
> by the participant. And more bizarrely still, the information ORIGINATED
> at the participant (when it registered) and then has to be encrypted in
> the EPR of the coordinator, so the coordinator can infer what the
> participant said it was when it re-addresses that (forgotten) EPR.
>
> The participant is NOT making a unilateral conclusion. The unilateral
> conclusion is made by the spec designers, and summarised in the
> background section on the issue. In brief:
>
> i) if a volatile participant is in PreparedSuccess and the coordinator
> is in None, it is not possible to determine whether the transaction
> committed or rolled back;
> ii) if a durable participant is in PreparedSuccess and the coordinator
> is in None, it follows logically that the transaction rolled back.
>
> That isn't affected by what exchanges take place at that point - it's an
> assertion, valid for conformant implementations in this situation, that
> applies independently of the message exchanges.  
> The issue is just where in the spec we instantiate that logic. 
>
> It isn't that the Participant doesn't know what to do and the
> Coordinator does. The Coordinator doesn't even know the transaction or
> the participant exist, or ever existed. When it finds, by the arrival of
> Prepared or Replay that the Participant does exist, it cannot tell the
> Participant what happened by its own knowledge of the transaction,
> because it doesn't have any.  The choice is whether, from its ignorance,
> it then embarks on the deduction of i) and ii); or whether it just
> replies that it has no information, and then the participant BY THE SAME
> IRON LOGIC performs the deduction.  Except that to make the distinction
> between i) and ii), the coordinator has to ensure it was re-informed via
> some private trickery in the EPR; whereas the Participant has the
> information immediately to hand at the time it needs to distinguish.
>
> I tried to think of an analogy, but the situation is so bizarre that I
> can't come up with one that would help.
>
> Peter
>
>
>
>
> -----Original Message-----
> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
> Sent: 18 May 2006 01:47
> To: Mark Little
> Cc: Peter Furniss; ws-tx@lists.oasis-open.org
> Subject: RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not
> distinguish protocol of orphaned participants
>
> Mark,
>
> As I pointed to earlier, a superior coordinator holds a responsibility
> to provide correct information about the outcome.  Stating "I don't
> know" and letting the participant make a unilateral conclusion about the
> outcome is not a desirable solution. This leads to varied
> interpretations and may cause inconsistent information to be reported to
> higher layers.
>
> The fact that a participant is requesting replay means that the
> participant does not know what to do. Replying 'I don't know' does not
> really help the participant. The superior clearly holds a responsibility
> in the 2PC protocol to tell the participant about the outcome. It is
> very important not to take away that simple, yet powerful, assumption.
>
> -----Original Message-----
> From: Mark Little [mailto:mark.little@jboss.com]
> Sent: Saturday, May 13, 2006 1:57 AM
> To: Ram Jeyaraman
> Cc: Peter Furniss; ws-tx@lists.oasis-open.org
> Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not
> distinguish protocol of orphaned participants
>
> As you can see from previous emails, I think we shouldn't mandate that
> EPRs for Volatile or Durable participants are different, but push the
> intelligence back to the participant, which already knows what type it
> is (and hence how to interpret the message). This does not preclude
> implementations from having different EPRs per protocol, but it is not
> required. Yes, we'll need to introduce an UnknownTransaction fault, but
> I think we need that anyway.
>
> Mark.
>
>
> Ram Jeyaraman wrote:
>   
>> Peter,
>>
>> I am sure you wouldn't disagree: A superior coordinator holds a 
>> responsibility to provide correct information about the outcome.
>> Stating "I don't know" and letting the subordinate make a unilateral 
>> conclusion about the outcome is not a desirable situation. This gives 
>> room for varied interpretations of "I don't know", which is not 
>> desirable. As transaction middleware providers we should attempt to 
>> provide the best information possible, and it is critical that we 
>> build that assumption into the protocol.
>>
>> For durable participants, I strongly support Option A, you had 
>> proposed earlier.
>>
>> In the case of volatile participants, I agree that a more meaningful 
>> fault, such as "UnknownTransaction" or perhaps "InDoubt", is better 
>> than "InvalidState".
>>
>> Thank you.
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> *Sent:* Wednesday, May 10, 2006 11:52 PM
>> *To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> *Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
>> distinguish protocol of orphaned participants
>>
>> Ram,
>>
>> It isn't a question of a more meaningful message, but of which side is
>> required to distinguish what the now-broken relationship was:
>>
>> a) the side that now knows nothing of the relationship and therefore 
>> has to juggle its own addressing at implementation time so it can 
>> infer the answer
>>
>> b) the side that does know of the relationship and needs to behave 
>> differently
>>
>> Peter
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> *Sent:* 11 May 2006 03:34
>> *To:* Peter Furniss; ws-tx@lists.oasis-open.org
>> *Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
>> distinguish protocol of orphaned participants
>>
>> Peter,
>>
>> It is desirable to send a participant a more meaningful message. To 
>> that end, option (a) you have described below seems quite reasonable.
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> *Sent:* Saturday, May 06, 2006 4:02 AM
>> *To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> *Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
>> distinguish protocol of orphaned participants
>>
>> But there is no need for a coordinator in None to distinguish.
>>
>> In any state other than None, the coordinator of course knows which 
>> protocol is in use, because it has an established relationship (and a 
>> state-referencing endpoint) from the Register. So considerations there
>> (i.e. the other state transitions) are irrelevant to this matter.
>>
>> In state None, the coordinator has no such relationship. It doesn't 
>> even really have a state-referencing endpoint - there's just some 
>> catcher for incoming messages that target endpoints that don't exist 
>> (e.g. the despatcher triggers its else/default actions). It doesn't 
>> care. It isn't going to change state. It isn't going to do anything 
>> about the message except reply to it.
>>
>> The only requirement on our protocol is that the participant (which 
>> obviously does know what it registered for, and does care) is not 
>> misled. And, as explained in the issue, a participant in Prepared 
>> that is told the coordinator is in state None does different things 
>> depending on whether it was durable or volatile.
>>
>> There are two solutions:
>>
>> a) we require that there be something in, associated with, enveloping 
>> or otherwise marking the Prepared message such that the coordinator 
>> can determine what kind of participant sent it, and send back 
>> different signals. We don't need to mandate a particular means, but 
>> there must be something on the wire that distinguishes a Prepared from
>> volatile with Prepared from durable
>>
>> b) a coordinator in state None sends back a reply to Prepared from any
>> kind of participant, the reply distinguishably telling the participant
>> the coordinator is in None, and the participant interprets this 
>> appropriately dependent on what kind it is.
>>
>> Replying "UnknownTransaction", regardless of protocol, simplifies the 
>> protocol. Coordinator implementations don't need to do anything
>> special.
>>
>> Peter
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> *Sent:* 06 May 2006 03:07
>> *To:* ws-tx@lists.oasis-open.org
>> *Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
>> distinguish protocol of orphaned participants
>>
>> This issue poses a valid question: How to handle the receipt of a 
>> 'Prepared' message while the coordinator state machine is in 'None'
>> state.
>>
>> But in that question is the answer: The state transitions imply that 
>> an implementation should distinguish between volatile and durable 
>> participants; for example, as outlined in this issue description. That
>> is, an implementation is expected to achieve this in an implementation
>> specific way.
>>
>> Mandating a specific mechanism is unnecessary since this can be 
>> achieved via a suitable mechanism that works best for an 
>> implementation. Further, we certainly want implementations to 
>> distinguish between volatile and durable participants, so that 
>> messages from durable participants can be meaningfully handled.
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> *Sent:* Tuesday, March 28, 2006 10:22 AM
>> *To:* ws-tx@lists.oasis-open.org
>> *Subject:* [ws-tx] Issue 039 - WS-AT: Coordinator should not 
>> distinguish protocol of orphaned participants
>>
>> This is identified as WS-TX issue 039.
>>
>> Please ensure follow-ups have a subject line starting "Issue 039 -
>> WS-AT: Coordinator should not distinguish protocol of orphaned 
>> participants".
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> *Sent:* Monday, March 27, 2006 1:27 PM
>> *To:* ws-tx@lists.oasis-open.org
>> *Subject:* [ws-tx] WS-AT: Coordinator should not distinguish protocol 
>> of orphaned participants
>>
>> Issue name -- WS-AT: Coordinator should not distinguish protocol of 
>> orphaned participants
>>
>> PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL 
>> THE ISSUE IS ASSIGNED A NUMBER.
>>
>> The issues coordinators will notify the list when that has occurred.
>>
>> Target document and draft:
>>
>> Protocol: WS-AT Volatile 2PC
>>
>> Artifact: spec
>>
>> Draft:
>>
>> AT spec cd 1
>>
>> Link to the document referenced:
>>
>> http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
>>
>> Section and PDF line number:
>>
>> section 10, lines 503/505, table rows Prepared, Replay, column None
>>
>>
>> Issue type:
>>
>> Design
>>
>>
>> Related issues:
>>
>> New issue: WS-AT: Invalid state inappropriate response to orphaned 
>> volatile participants
>>
>> Issue Description:
>>
>> The requirement that a coordinator with no current knowledge of a 
>> participant can determine which protocol it previously registered with
>> from an incoming Prepared or Replay message imposes unnecessary 
>> requirements on implementations. The desired functionality can be 
>> achieved by giving a common response to all such messages from 
>> "orphans", and letting the participant interpret that response.
>>
>> Since a returned fault does not show in the state tables, there is 
>> currently no behaviour specified for the volatile participant that 
>> sent the Replay or Prepared (this will become a separate issue if the 
>> main one is rejected, but it gets subsumed if this is fixed)
>>
>> Issue Details
>>
>> Background:
>> The None state for WS-AT means that there is no record of the 
>> transaction - either because the transaction completed (rolled back or
>> committed) or the coordinator crashed (and thus implicitly 
>> rolled back). Among correct implementations, Prepared and Replay 
>> messages received in the None state can only come from participants 
>> that were registered prior to the completion or crash, but did not 
>> receive a Commit or Rollback message.
>>
>> For a durable participant, this cannot happen if the transaction 
>> committed - the coordinator cannot forget the transaction until it has
>> received Committed from the durable participant. Thus if the Prepared 
>> or Replay are received from a durable participant in the None state, 
>> it can be inferred with certainty that the transaction rolled back (by 
>> intent or crash).
>>
>> For volatile participants however, it is legitimate for the 
>> coordinator to delete its commit log record after receiving Committed 
>> from all durable participants, without waiting for any response from 
>> volatile participants. If there is a crash at this point (or just some
>> lost messages), Prepared or Replay can be received from a volatile 
>> participant and find the coordinator in state "None".
>>
>> The issue:
>> The specification currently requires that the coordinator distinguish 
>> between the two kinds of registration. This in turn requires that 
>> there is something about the Prepared or Replay message that allows 
>> the coordinator to determine the protocol of the forgotten 
>> registration (ipso facto, it has no specific, local information).
>> Since there is nothing in the Prepared or Replay messages as such, 
>> this can only be something carried in or implied from the addressing 
>> for the coordinator. Although this can of course be done, and for some
>> implementations may be entirely natural (e.g. if they have 
>> twinned-but-distinct multi-lateral coordinators for volatile and 
>> durable), for others it will be unnatural.
>>
>> There seems no reason for imposing this peculiarity (other perhaps 
>> than the unstated aesthetic that the coordinator has the driving seat 
>> in this protocol). If the coordinator gave the same response to 
>> Prepared/Replay from either protocol, it would be up to the 
>> participant to determine how it should treat this response. Since the 
>> participant certainly knows how it registered, there would be no 
>> ambiguity.
>>
>> (This issue has been distinguished from the question of using Invalid 
>> State in the volatile case - it would be even worse to use 
>> InvalidState for both protocols)
>>
>> There is currently no defined behaviour for the volatile participant 
>> that sends Replay or Prepared (since fault behaviour is not defined).
>>
>> Proposed resolution
>>
>> Define new message UnknownTransactionOutcome.
>>
>> Amend state table for Coordinator to send the new message when 
>> Prepared or Replay are received in state None.
>>
>> Amend state table for Participant with row for inbound 
>> "UnknownTransactionOutcome":
>>
>> this is invalid (shouldn't happen) in all states except PreparedSuccess
>>
>> in PreparedSuccess:
>>
>>   if participant is Durable, action is Initiate Rollback and Forget, 
>>   transition to Aborting
>>
>>   if participant is Volatile, action is Forget, transition to new state 
>>   UnknownOutcome
>>
>> UnknownOutcome state has transition to None on All Forgotten event. 
>> All other Inbound Events are ignored, without transition.
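
The proposed resolution quoted above can be read as two small transition functions. The following is a minimal sketch, not text from the specification: the function names and the `Kind` enum are hypothetical, the message, action, and state names are taken from the proposal, and the rule that All Forgotten in Aborting leads to None is taken from step 13 of the scenario earlier in the thread.

```python
from enum import Enum
from typing import Tuple


class Kind(Enum):
    DURABLE = "durable"
    VOLATILE = "volatile"


UNKNOWN = "UnknownTransactionOutcome"


def coordinator_in_none(message: str) -> str:
    """Reply of a coordinator in state None; identical for both protocols."""
    if message in ("Prepared", "Replay"):
        return UNKNOWN
    raise ValueError(f"row not covered by this sketch: {message}")


def participant_on_unknown(kind: Kind, state: str) -> Tuple[str, str]:
    """Return (action, next state) for an inbound UnknownTransactionOutcome."""
    if state != "PreparedSuccess":
        # Invalid (shouldn't happen) in every other state.
        raise ValueError(f"{UNKNOWN} is invalid in state {state}")
    if kind is Kind.DURABLE:
        # A durable orphan can infer that the transaction rolled back.
        return ("Initiate Rollback and Forget", "Aborting")
    # A volatile orphan cannot tell commit from rollback.
    return ("Forget", "UnknownOutcome")


def on_all_forgotten(state: str) -> str:
    """Both Aborting and UnknownOutcome end in None once the log is gone."""
    if state in ("Aborting", "UnknownOutcome"):
        return "None"
    raise ValueError(f"row not covered by this sketch: {state}")


for kind in Kind:
    reply = coordinator_in_none("Replay")
    action, next_state = participant_on_unknown(kind, "PreparedSuccess")
    print(kind.value, reply, action, next_state, on_all_forgotten(next_state))
```

The coordinator function takes no participant kind at all, which is the point of the proposal: only the participant, which knows how it registered, branches on durable versus volatile.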

