OASIS Mailing List Archives

ws-tx message



Subject: RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants


I think Alastair is right - when I sent
 
"ii) if a durable participant is in PreparedSuccess and the coordinator is in None, it follows logically that the transaction rolledback." (and similar statements in the initial issue and various emails), I did not allow for the lazy delete that is mandated by the state tables.
 
If lazy delete (i.e. allowing the sequence: apply commit/rollback; send acknowledgement to coordinator; delete prepared log) is permitted, then it is not possible to determine the transaction outcome for a durable participant. Anything other than an agnostic "UnknownTransaction" is going beyond the evidence.
 
Peter


From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: 18 May 2006 13:44
To: Ram Jeyaraman
Cc: Mark Little; Peter Furniss; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not distinguish protocol of orphaned participants

Ram,

I don't find this argument convincing. You say:
"The superior clearly holds a responsibility in the 2PC protocol to tell the participant about the outcome"
If you rephrased that to say: "to tell the participant about the outcome at least once" then that would be true. As phrased, in some recovery scenarios it cannot be true: the knowledge of the outcome has crept into P-side records, and has departed from C's memory.

When C receives Prepared/Replay in the None state it does not know what the outcome was. It has literally forgotten. It cannot communicate the outcome to P. It can only communicate the semantic: "I don't know the outcome".

This semantic may be sent in circumstances where the outcome was Commit, and not just when it was Rollback. This flows from the lazy log delete in the PV tables. (This is an important extra element which changes, but strengthens, Peter's arguments in an earlier post.)

The message that represents that semantic should therefore not be called Rollback, it should be called UnknownOutcome, whether volatile or durable.

P may be able to deduce the outcome, given that information, and knowledge of the state of the persistent work that it controls. It will be able to guarantee that persistent data resources are correctly modified, even if it is unaware of the outcome.

Let's take the durable case.

C can be in state None because the transaction committed or because it rolled back.

C gets into None by virtue of receiving Aborted or Committed.

Both of those messages are sent by P before the P Forget is initiated, let alone completed. P may fail after sending either Aborted or Committed (see my discussion of a potential bug under the aegis of Issue 048). It may therefore recover and replay Prepared/send Replay in either case.

The knowledge of the final outcome may rest with P, or a combination of P and C, but never with C alone. Sometimes it is never known.
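
To make the sequence concrete, here is a minimal Python sketch of the lazy log delete, assuming each side's log can be modelled as a dictionary. The function name replay_after_lazy_delete and the log layout are illustrative assumptions, not anything defined by the specification. It suggests that a durable P can end up replaying Prepared to a C in None after either outcome.

```python
def replay_after_lazy_delete(outcome):
    """Simulate one durable participant P failing between acknowledging the
    outcome and deleting its prepared log record (hypothetical model)."""
    assert outcome in ("commit", "rollback")
    c_log = {"tx1": outcome}     # C remembers the decision until P acknowledges
    p_log = {"tx1": "prepared"}  # P's prepared record, deleted lazily

    # P applies the outcome and sends Committed or Aborted to C.
    ack = "Committed" if outcome == "commit" else "Aborted"

    # C receives the acknowledgement and forgets the transaction (state None).
    del c_log["tx1"]

    # P fails here, before its Forget removes the prepared record.
    # On recovery P still finds the record, so it replays Prepared.
    p_replays = p_log.get("tx1") == "prepared"
    c_state = "None" if "tx1" not in c_log else "Known"
    return ack, p_replays, c_state
```

In this model C is in None when the replay arrives whether the acknowledgement was Committed or Aborted, so C alone has no basis for answering Rollback.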

Recall that the PV state machine interacts with an unspecified entity to Initiate Commit Decision, or to Initiate Rollback (I've called that entity the Application Entity, or AE).

AE could be a client to a presumed abort DBMS.

If the client has forgotten the transaction then the P can assume that the rollback or commit was successfully processed, prior to P failure. This could be established with a query if the client supports enquiry of known transactions. In this case there is no need for P to talk to C, and P, like C, has no idea what the outcome actually was.

If the client cannot be queried, then the DBMS client is either finished, or prepared. P cannot know the outcome, because the transaction may not yet have completed (no Commit or Rollback yet received), so it has to replay Prepared (or send Replay). There are three answers that C can give (that actually reflect its knowledge): unknown, commit, rollback.

C does not know that unknown equals rollback. P asks C what to do, and C says: "I don't know". At that point P attempts rollback, because that is always safe if the outcome is unknown. If the transaction completed then the client will say "transaction unknown" (it either committed or rolled back, and is now unknown); or it will respond OK to the rollback. This is why it is reasonable that the action on receipt of Unknown Transaction is Initiate Rollback. This does not make it reasonable for C to send Rollback.

So, in this case P may become aware of the outcome, but only because of knowledge lodged in the AE (the database).
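
The presumed-abort case above can be sketched roughly as follows in Python. The class PresumedAbortClient, the function on_unknown_reply, and the reply strings are hypothetical stand-ins for the AE and for P's recovery logic; they are not part of WS-AT or of any real DBMS client API.

```python
class PresumedAbortClient:
    """Hypothetical stand-in for the AE: a client of a presumed-abort DBMS."""

    def __init__(self, prepared_txids):
        # Transactions that are still prepared, i.e. not yet completed.
        self.prepared = set(prepared_txids)

    def rollback(self, txid):
        if txid in self.prepared:
            self.prepared.remove(txid)
            return "OK"
        # Completed earlier (committed or rolled back) and since forgotten.
        return "transaction unknown"


def on_unknown_reply(ae, txid):
    """P's action when C answers "I don't know": Initiate Rollback at the AE."""
    answer = ae.rollback(txid)
    if answer == "OK":
        return "rolled back"
    return "already completed, outcome not known to P"
```

Either way the AE ends up in the correct state: the rollback takes effect only if the transaction was still prepared, and is harmless if it had already completed.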

What happens if the AE is not a presumed-abort database?

If it is the top half of an interposed coordinator then we can assume that the state of the AE (the sub-coordinator) is known to it. If the decision being processed was Commit, then the message Committed will not travel to the superior coordinator until the sub-coordinator has logged the Commit decision (that is the work of the AE in this case -- assuming that we follow the log-sub-coordinator-commit rule implied by the state tables). If P recovers then it can enquire of the sub-coordinator if it has started commit processing, and will only contact C if that is not true. Once again, the state of the AE determines the action of P. In this case P does know that "unknown transaction" means that the transaction has rolled back.

If P is working with other types of resource, then we can assume that either they behave similarly to presumed-abort database clients, or that they have logged the decision they are working on as part of their AE work, like the sub-coordinator.

C in the None state, far from knowing the outcome of the transaction, is in fact unable to know the outcome of the transaction even if contacted by P with a Prepared/Replay. P may contact C in that state, both if the outcome was committed and if it was aborted. P may deduce the outcome, but that is not guaranteed (or important). What is guaranteed is that the AE always gets the correct outcome, irrespective of failures.

Alastair

Ram Jeyaraman wrote:
Mark,

As I pointed to earlier, a superior coordinator holds a responsibility
to provide correct information about the outcome.  Stating "I don't
know" and letting the participant make a unilateral conclusion about the
outcome is not a desirable solution. This leads to varied
interpretations and may cause inconsistent information to be reported to
higher layers.

The fact that a participant is requesting replay means that the
participant does not know what to do. Replying 'I don't know' does not
really help the participant. The superior clearly holds a responsibility
in the 2PC protocol to tell the participant about the outcome. It is
very important not to take away that simple, yet powerful, assumption.

-----Original Message-----
From: Mark Little [mailto:mark.little@jboss.com] 
Sent: Saturday, May 13, 2006 1:57 AM
To: Ram Jeyaraman
Cc: Peter Furniss; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 039 - WS-AT: Coordinator should not
distinguish protocol of orphaned participants

As you can see from previous emails, I think we shouldn't mandate that 
EPRs for Volatile or Durable participants are different, but push the 
intelligence back to the participant, which already knows what type it 
is (and hence how to interpret the message). This does not preclude 
implementations from having different EPRs per protocol, but it is not 
required. Yes, we'll need to introduce an UnknownTransaction fault, but 
I think we need that anyway.

Mark.


Ram Jeyaraman wrote:
  
Peter,

I am sure you wouldn't disagree: A superior coordinator holds a 
responsibility to provide correct information about the outcome. 
Stating "I don't know" and letting the subordinate make a unilateral 
conclusion about the outcome is not a desirable situation. This gives 
room for varied interpretations of "I don't know", which is not 
desirable. As transaction middleware providers we should attempt to 
provide the best information possible, and it is critical that we 
build that assumption into the protocol.

For durable participants, I strongly support Option A, you had 
proposed earlier.

In the case of volatile participants, I agree that a more meaningful 
fault, such as "UnknownTransaction" or perhaps "InDoubt", is better 
than "InvalidState".

Thank you.


    
------------------------------------------------------------------------
  
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Wednesday, May 10, 2006 11:52 PM
*To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
distinguish protocol of orphaned participants

Ram,

It isn't a question of a more meaningful message, but of which side is
required to distinguish what the now-broken relationship was:

a) the side that now knows nothing of the relationship and therefore 
has to juggle its own addressing at implementation time so it can 
infer the answer

b) the side that does know of the relationship and needs to behave 
differently

Peter


    
------------------------------------------------------------------------
  
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 11 May 2006 03:34
*To:* Peter Furniss; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
distinguish protocol of orphaned participants

Peter,

It is desirable to send a participant a more meaningful message. To 
that end, option (a) you have described below seems quite reasonable.


    
------------------------------------------------------------------------
  
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Saturday, May 06, 2006 4:02 AM
*To:* Ram Jeyaraman; ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
distinguish protocol of orphaned participants

But there is no need for a coordinator in None to distinguish.

In any state other than None, the coordinator of course knows which 
protocol is in use, because it has an established relationship (and a 
state-referencing endpoint) from the Register. So considerations there
(i.e. the other state transitions) are irrelevant to this matter.

In state None, the coordinator has no such relationship. It doesn't 
even really have a state-referencing endpoint - there's just some 
catcher for incoming messages that target endpoints that don't exist 
(e.g. the despatcher triggers its else/default actions). It doesn't 
care. It isn't going to change state. It isn't going to do anything 
about the message except reply to it.

The only requirement on our protocol is that the participant (which 
obviously does know what it registered for, and does care) is not 
misled. And, as explained in the issue, a participant in Prepared 
that is told the coordinator is in state None does different things 
depending on whether it was durable or volatile.

There are two solutions:

a) we require that there be something in, associated with, enveloping 
or otherwise marking the Prepared message such that the coordinator 
can determine what kind of participant sent it, and send back 
different signals. We don't need to mandate a particular means, but 
there must be something on the wire that distinguishes a Prepared from
volatile from a Prepared from durable

b) a coordinator in state None sends back a reply to Prepared from any
kind of participant, the reply distinguishably telling the participant
the coordinator is in None, and the participant interprets this 
appropriately dependent on what kind it is.

Replying "UnknownTransaction", regardless of protocol, simplifies the 
protocol. Coordinator implementations don't need to do anything
special.
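
As an illustration of option (b), here is a minimal Python sketch of the "catcher for incoming messages" described above. The function dispatch, the registrations mapping and the endpoint identifiers are hypothetical names chosen for this example; only the message names come from the discussion.

```python
def dispatch(registrations, target_endpoint, message):
    """Route a message to the handler for a live registration, if any.

    registrations maps endpoint ids to handler callables (hypothetical).
    """
    handler = registrations.get(target_endpoint)
    if handler is not None:
        # Any state other than None: the coordinator knows the protocol.
        return handler(message)
    # State None: no record of the registration, so the default branch
    # gives the same reply whatever protocol the sender registered for.
    if message in ("Prepared", "Replay"):
        return "UnknownTransaction"
    return None  # other messages in None are outside this sketch
```

Note that the default branch never inspects the addressing or the sender's protocol, which is the simplification being argued for.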
  
Peter


    
------------------------------------------------------------------------
  
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 06 May 2006 03:07
*To:* ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 039 - WS-AT: Coordinator should not 
distinguish protocol of orphaned participants

This issue poses a valid question: How to handle the receipt of a 
'Prepared' message while the coordinator state machine is in 'None'
state.

But in that question is the answer: The state transitions imply that 
an implementation should distinguish between volatile and durable 
participants; for example, as outlined in this issue description. That
is, an implementation is expected to achieve this in an implementation
specific way.

Mandating a specific mechanism is unnecessary since this can be 
achieved via a suitable mechanism that works best for an 
implementation. Further, we certainly want implementations to 
distinguish between volatile and durable participants, so that 
messages from durable participants can be meaningfully handled.


    
------------------------------------------------------------------------
  
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* Tuesday, March 28, 2006 10:22 AM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] Issue 039 - WS-AT: Coordinator should not 
distinguish protocol of orphaned participants

This is identified as WS-TX issue 039.

Please ensure follow-ups have a subject line starting "Issue 039 - 
WS-AT: Coordinator should not distinguish protocol of orphaned 
participants".


    
------------------------------------------------------------------------
  
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Monday, March 27, 2006 1:27 PM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] WS-AT: Coordinator should not distinguish protocol 
of orphaned participants

Issue name -- WS-AT: Coordinator should not distinguish protocol of 
orphaned participants

PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL 
THE ISSUE IS ASSIGNED A NUMBER.

The issues coordinators will notify the list when that has occurred.

Target document and draft:

Protocol: WS-AT Volatile 2PC

Artifact: spec

Draft:

AT spec cd 1

Link to the document referenced:


    
http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
  
Section and PDF line number:

section 10, lines 503/505 tables rows Prepared, Replay, column None


Issue type:

Design


Related issues:

New issue: WS-AT: Invalid state inappropriate response to orphaned 
volatile participants

Issue Description:

The requirement that a coordinator with no current knowledge of a 
participant can determine which protocol it previously registered with
from an incoming Prepared or Replay message imposes unnecessary 
requirements on implementations. The desired functionality can be 
achieved with a common response to all such messages from "orphans", 
and letting the participant interpret that response.

Since a returned fault does not show in the state tables, there is 
currently no behaviour specified for the volatile participant that 
sent the Replay or Prepared (this will become a separate issue if the 
main one is rejected, but it gets subsumed if this is fixed)

Issue Details

Background:
The None state for WS-AT means that there is no record of the 
transaction - either because the transaction completed (rolled back or 
committed) or the coordinator crashed (and thus implicitly 
rolled back). Among correct implementations, Prepared and Replay 
messages received in the None state can only come from participants 
that were registered prior to the completion or crash, but did not 
receive a Commit or Rollback message.

For a durable participant, this cannot happen if the transaction 
committed - the coordinator cannot forget the transaction until it has
received Committed from the durable participant. Thus if the Prepared 
or Replay are received from a durable participant in the None state, 
it can be inferred with certainty that the transaction rolled back (by 
intent or crash).

For volatile participants however, it is legitimate for the 
coordinator to delete its commit log record after receiving Committed 
from all durable participants, without waiting for any response from 
volatile participants. If there is a crash at this point (or just some
lost messages), Prepared or Replay can be received from a volatile 
participant and find the coordinator in state "None".
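
The forgetting rule in the paragraph above could be expressed as a small Python predicate. This is only a sketch: the function name may_delete_commit_log and the dictionary keys "protocol" and "committed_received" are assumptions made for illustration.

```python
def may_delete_commit_log(participants):
    """True once Committed has arrived from every durable participant.

    participants: list of dicts with hypothetical keys
    "protocol" ("durable" or "volatile") and "committed_received" (bool).
    Volatile participants are deliberately not consulted.
    """
    return all(p["committed_received"]
               for p in participants
               if p["protocol"] == "durable")
```

For example, with one acknowledged durable participant and one silent volatile participant the predicate is already true, which is how a later Prepared or Replay from the volatile participant can find the coordinator in None after a commit.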

The issue:
The specification currently requires that the coordinator distinguish 
between the two kinds of registration. This in turn requires that 
there is something about the Prepared or Replay message that allows 
the coordinator to determine the protocol of the forgotten 
registration (ipso facto, it has no specific, local information). 
Since there is nothing in the Prepared or Replay messages as such, 
this can only be something carried in or implied from the addressing 
for the coordinator. Although this can of course be done, and for some
implementations may be entirely natural (e.g. if they have 
twinned-but-distinct multi-lateral coordinators for volatile and 
durable), for others it will be unnatural.

There seems no reason for imposing this peculiarity (other perhaps 
than the unstated aesthetic that the coordinator has the driving seat 
in this protocol). If the coordinator gave the same response to 
Prepared/Replay from either protocol, it would be up to the 
participant to determine how it should treat this response. Since the 
participant certainly knows how it registered, there would be no 
ambiguity.

(This issue has been distinguished from the question of using Invalid 
State in the volatile case - it would be even worse to use 
InvalidState for both protocols)

There is currently no defined behaviour for the volatile participant 
that sends Replay or Prepared (since fault behaviour is not defined).

Proposed resolution

Define new message UnknownTransactionOutcome.

Amend state table for Coordinator to send the new message when 
Prepared or Replay are received in state None.

Amend state table for Participant with row for inbound 
"UnknownTransactionOutcome"
this is invalid (shouldn't happen) in all states except PreparedSuccess

in PreparedSuccess
if participant is Durable action is Initiate Rollback and Forget, 
transition to Aborting

if participant is Volatile, action is Forget, transition to new state 
UnknownOutcome

UnknownOutcome state has transition to None on All Forgotten event. 
All other Inbound Events are ignored, without transition.
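
The proposed participant rows can be sketched as a small Python state function. This is an illustrative reading of the proposal only: the enum State, the exception InvalidEvent and the function names are hypothetical, and only the states named in the proposal are modelled.

```python
from enum import Enum, auto


class State(Enum):
    PREPARED_SUCCESS = auto()
    ABORTING = auto()
    UNKNOWN_OUTCOME = auto()  # new state proposed for volatile participants
    NONE = auto()


class InvalidEvent(Exception):
    """The message arrived in a state where it should not happen."""


def on_unknown_transaction_outcome(state, durable):
    """Return (actions, next_state) for inbound UnknownTransactionOutcome."""
    if state is not State.PREPARED_SUCCESS:
        raise InvalidEvent("UnknownTransactionOutcome in " + state.name)
    if durable:
        return (["Initiate Rollback", "Forget"], State.ABORTING)
    return (["Forget"], State.UNKNOWN_OUTCOME)


def on_event_in_unknown_outcome(event):
    """In UnknownOutcome only All Forgotten causes a transition."""
    if event == "All Forgotten":
        return State.NONE
    return State.UNKNOWN_OUTCOME  # every other inbound event is ignored
```

The coordinator sends one message in both cases; the durable or volatile distinction lives entirely in the participant, which knows how it registered.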

    

  

