ws-tx message

Subject: Issue 048: PV state table definitions

From: Alastair Green <alastair.green@choreology.com>
To: Peter Furniss <peter.furniss@erebor.co.uk>
Date: Tue, 16 May 2006 09:42:21 +0100

I've been considering Peter's points below, questioning the PV state table's meaning/correctness. Here's where I've got to: it's complicated, and I may be missing something, so critical review would be very welcome.

1. The model of the PV state table implicitly contains logical entities in an overall participant: the PV state machine (SM), a recording entity (e.g. transaction logging module) (RE), and an application entity (AE) that carries out application work (details undefined) in reaction to signals from the SM, prompted by messages from the Coordinator arriving at the SM.

Defining such entities or roles makes it much easier to describe what the PV actions and events mean. Attached Word doc picture helps explain this vocabulary.

Note that AE may be, e.g., a two-phase aware RM or an interposition participant/sub-coordinator: the details of its work are encapsulated/irrelevant to the coordination protocol. This is a more abstract, general way of thinking about Peter's "bound data" (application state changes induced by the transaction protocol).

2. The operations used below are:

None    = move to state None
SA    = send Aborted to Coordinator
SC    = send Committed to Coordinator
IR    = send Initiate Rollback signal to AE
IR-return    = get notification of AE rollback completion (good or bad)
IC/ICR    = the same for commit (these are internal events Initiate Commit Decision and Commit Decision for the Prepared Success/Committing columns)
F    = send start forgetting signal to RE, start logical log record delete (equals action Forget)
F-return   = receive forgotten signal from RE, log record deleted, or = receive forget attempt failed from RE, log record delete failed (this is my interpretation of event All Forgotten in columns Committing and Aborting).

3. The fastest safe sequence in reaction to receipt of Commit by PV SM is:

IC, ICR, None, SC, F.

The spec says

IC, ICR, SC, F, F-return, None.

4. The fastest safe sequence in reaction to receipt of Rollback by PV SM is:

None, SA, IR, IR-return, F.

The spec (assuming certain plausible understandings) says:

IR, [no wait for IR-return], SA, F, F-return, None.

I think this is unsafe (a wait for IR-return is needed). A detailed explanation is at the bottom of this mail.

In neither case is it necessary to block on the attempt to log delete (to wait for F-return) before sending the acknowledgement message to C. This design assumption of the current PV table is correct (safe), if less conservative than Gray/Reuter.
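As a rough illustration (not part of the spec), the four sequences in points 3 and 4 can be compared mechanically. The sketch below uses the abbreviations from point 2; the constant and function names are hypothetical, and the safety condition it encodes is only the one argued in this mail: Forget must not be initiated before the AE has reported completion (ICR or IR-return).

```python
# Sequences copied from points 3 and 4, using the abbreviations from point 2.
FASTEST_COMMIT = ["IC", "ICR", "None", "SC", "F"]
SPEC_COMMIT = ["IC", "ICR", "SC", "F", "F-return", "None"]
FASTEST_ROLLBACK = ["None", "SA", "IR", "IR-return", "F"]
SPEC_ROLLBACK = ["IR", "SA", "F", "F-return", "None"]  # no wait for IR-return

ACKS = {"SA", "SC"}                # acknowledgements sent to the Coordinator
WORK_DONE = {"ICR", "IR-return"}   # AE reports that its work has completed


def forgets_before_work_done(sequence):
    """True if F is initiated before the AE has reported completion."""
    done = False
    for op in sequence:
        if op in WORK_DONE:
            done = True
        elif op == "F" and not done:
            return True
    return False


def ack_waits_for_log_delete(sequence):
    """True if an acknowledgement is sent only after F-return."""
    if "F-return" not in sequence:
        return False
    after = sequence[sequence.index("F-return") + 1:]
    return any(op in ACKS for op in after)


for name, seq in [("fastest commit", FASTEST_COMMIT),
                  ("spec commit", SPEC_COMMIT),
                  ("fastest rollback", FASTEST_ROLLBACK),
                  ("spec rollback", SPEC_ROLLBACK)]:
    print(name, forgets_before_work_done(seq), ack_waits_for_log_delete(seq))
```

Under these assumptions only the spec's rollback sequence initiates Forget before the AE work is known to be complete, and none of the four sequences blocks the acknowledgement on the log delete.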

It follows that the state transition to None can occur earlier (after the IR-return or the F), and that the entries in the row All Forgotten, columns Aborting and Committing, are unnecessary. Removing them would obviate the overloading of the term All Forgotten which Peter pointed out.

I think this highlights a) a potential bug (need to block IR), b) a deoptimization/over-complication (could move to None sooner), and c) a definite need as per this issue, to clearly and unambiguously define the various actions and events to ensure that they are understood correctly. Defining and distinguishing the roles of SM, AE and RE would greatly help.

Alastair

* * *

Detailed explanation of perceived bug. Critical examination very much welcomed.

Premise: job of the PV machine is to ensure that the outcome decision always reaches the AE irrespective of C or P failures.

Note: If action "Initiate Rollback" means "start rollback and wait for the rollback operation to complete" then there is no bug. However, the clear separation of Initiate Commit Decision and Commit Decision (commit processing finished) in the commit case leads me to believe that Initiate Rollback is not a misnomer, but an intentional name.

Currently the spec says:

IR, [no wait for IR-return], SA, F, F-return, None.

Suppose rollback processing starts (but has not completed), Aborted is sent to and received by the Coordinator, the log is deleted, and the participant then crashes. The AE will revert to its prepared state (unwritten premise) and will be orphaned: C will never contact P again, P will not recover, and the rollback work of the AE will never have been effected.

If we substitute:

None, SA, IR, IR-return, F.

then a crash between IR and IR-return will lead the AE to revert to its prepared state and the SM to recover in Prepared Success. The SM will send Prepared/Replay to the Coordinator, which, being in state None, will send Rollback to P. That leads to re-execution of rollback processing, as expected.
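The two crash scenarios can be walked through with a small simulation. This is only a sketch under the premises stated in this mail: an AE that has started but not finished rollback reverts to prepared on a crash, a participant with no log record does not recover, and (worst case, an added assumption) the log delete takes effect as soon as F is issued. The function names are hypothetical.

```python
SPEC_ROLLBACK = ["IR", "SA", "F", "F-return", "None"]       # no wait for IR-return
PROPOSED_ROLLBACK = ["None", "SA", "IR", "IR-return", "F"]


def orphaned_after_crash(sequence, crash_after):
    """Run the first `crash_after` operations, then crash.

    Returns True if the AE is left prepared with no log record,
    so that nothing will ever drive its rollback.
    """
    log_present = True   # the participant starts in Prepared Success
    ae = "prepared"
    for op in sequence[:crash_after]:
        if op == "IR":
            ae = "rolling-back"
        elif op == "IR-return":
            ae = "rolled-back"
        elif op == "F":
            log_present = False  # assumed worst case: delete is immediate
    if ae == "rolling-back":
        ae = "prepared"          # unwritten premise: incomplete work reverts
    if ae == "rolled-back":
        return False             # the outcome reached the AE
    # With a log record, the SM recovers in Prepared Success and replays
    # Prepared; the Coordinator answers Rollback and the work is redone.
    return not log_present


def unsafe_crash_points(sequence):
    """Crash positions (number of operations completed) that orphan the AE."""
    return [n for n in range(len(sequence) + 1)
            if orphaned_after_crash(sequence, n)]


print(unsafe_crash_points(SPEC_ROLLBACK))
print(unsafe_crash_points(PROPOSED_ROLLBACK))
```

In this model the spec's sequence is exposed from the moment F is issued, while the substituted sequence has no crash point that orphans the AE.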

Possible fixes

A) Introduce an explicit wait-for-return from the AE in the rollback case. To do this we could circuit through the Aborting column, in the same way that transit to Committing via ICR (Commit Decision) is used to introduce blocking on the completion of AE work for the commit case. This would require introduction of a Rollback Completed event (parallel to Commit Decision event).

B) We could call Initiate Rollback: "Perform Rollback", and we could call Initiate Commit Decision: "Perform Commit", define them both as blocking operations within the AE, and remove the Aborting and Committing states, as I suggested might be possible.

Prepared Success: Commit -- action: Perform Commit, Send Committed, Forget; transit state to None
Prepared Success: Rollback -- action: Send Aborted, Perform Rollback, Forget; transit state to None

If we define Forget as non-blocking: "initiate the process of forgetting; initiate logical log delete, but do not wait for status" then we retain the existing optimization (not requiring wait for log delete before the SM communicates with the Coordinator, enabling it in turn to close the overall transaction more quickly).
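A minimal sketch of how fix B might look as a table-driven handler, purely for illustration. The dictionary layout, the `handle` function and the `perform` callback are assumptions; only the two cells and their action order come from the text above. Perform Commit and Perform Rollback are assumed to block inside `perform`, while Forget is assumed to return at once.

```python
# (state, inbound message) -> (ordered actions, next state), per fix B.
FIX_B_TABLE = {
    ("Prepared Success", "Commit"):
        (["Perform Commit", "Send Committed", "Forget"], "None"),
    ("Prepared Success", "Rollback"):
        (["Send Aborted", "Perform Rollback", "Forget"], "None"),
}


def handle(state, message, perform):
    """Run the cell's actions in order, then return the next state.

    `perform` is a hypothetical callback supplied by the implementation.
    It is assumed to block for Perform Commit / Perform Rollback and to
    return immediately for Forget (initiate log delete, do not wait).
    """
    actions, next_state = FIX_B_TABLE[(state, message)]
    for action in actions:
        perform(action)
    return next_state


# Usage: record the actions instead of really performing them.
trace = []
state = handle("Prepared Success", "Rollback", trace.append)
print(state, trace)
```

With only two cells and no Aborting or Committing columns, the ordering of actions within a cell carries the whole safety argument.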

These two fixes cause the transaction only to finally complete (be forgotten by the C) when all rollback work has been performed. The cost is that the demarcating application will receive a delayed notification of total completion. However, it can still receive a timely notification of the outcome decision. In the absence of heuristic reporting this is the likely point of completion notification, so I see no real cost in these fixes.

C) Introduce a back call from the AE to the SM. If an AE recovers it can communicate with SM (perhaps an event like "AE Recovered") and if the state of SM is None (unknown transaction) then the SM can reinstruct rollback. If the SM is still in Prepared Success then this signal will lead SM to communicate the outcome to the AE in the normal way. This assumes that AE and SM are capable of independent failure and recovery. If they fail together then it is likely the log delete and AE work were bound atomically (which is not prohibited by anything discussed here).
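The back call in fix C could be sketched roughly as follows. The event name "AE Recovered" is the one floated above; the function name and the returned strings are hypothetical placeholders, not anything defined by the spec.

```python
def on_ae_recovered(sm_state):
    """React to a hypothetical "AE Recovered" signal from the AE.

    Returns what the SM tells the AE, as a plain string for illustration.
    """
    if sm_state == "None":
        # Unknown transaction: the SM reinstructs rollback.
        return "Rollback"
    if sm_state == "Prepared Success":
        # Outcome not yet known here: it will be communicated to the AE
        # in the normal way once the Coordinator's decision arrives.
        return "await outcome"
    raise ValueError("AE Recovered is not considered in state " + sm_state)


print(on_ae_recovered("None"))
print(on_ae_recovered("Prepared Success"))
```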

Without such a back-call the spec defines a PV machine that does not reliably deliver the outcome in all circumstances to its AE -- which is its (unstated) purpose.

The disadvantage of this fix is that it assumes that the AE can take the initiative. The entity we can assume recovers is the SM, so I prefer approach A) or B).

Note i): I have assumed that outcome of the blocking operations Perform Rollback | Commit (success or failure) is irrelevant (no heuristics), though completion is relevant.

Note ii): I have assumed that failure to log delete is of no concern to the protocol (creates an OOB notification need, related to the potential for leaving safe garbage around).

Note iii): I can't find the words "presumed abort" anywhere in the AT spec. Is that intentional?

Peter Furniss wrote:

Further interleaving (tagged <prf>)

Alastair Green sent:

Interleaved comments

Peter Furniss wrote:

That definition is sound for Commit Decision in Preparing state, but Commit Decision is also valid in the PV table for state Committing. On that occasion, we deduce from the action (send Committed and Forget) that it means "I have applied the instructed commit action to my data/resources". (Note also the action on receiving Commit in PreparedSuccess state, which is "Initiate Commit Decision".)

I think this is a mistake in the state table. The action for Commit/Prepared Success should be Send Committed and Forget (no need to indirect through Commit Decision).

<prf>

that could sort of work if the Participant is representing just a leaf in the transaction and we aren't trying to model the bound data (i.e. the resource that is subject to the outcome of the transaction). But the Participant View has to deal with the top-half of an interposed sub-coordinator as well - that must wait for the grand-child participants to receive and respond to the commit signal. (Actually the same will normally occur if the resource is, say, a database supporting XA - the participant state entity has to relay the commit semantic onwards)

</prf>

Your suggested definition for "All Forgotten" is valid for its use in Active state, but it also appears in Committing or Aborting. There it would mean "log record has been (logically) deleted" - it's the only way to get back to None.

This is a related mistake. Forget equals logical delete of log record. Therefore, if the P receives Commit in Prepared Success, it should send Committed, forget its involvement, and move directly to None. Committing (for the P) is a meaningless state. Once in None a replay of Commit will cause Committed to be returned. A second Commit received before the state transition to None has completed should be queued until the state/action atom has completed.

I think the same applies to Rollback processing.

At first glance (haven't checked this thoroughly) it appears that the columns Committing and Aborting (these states) are simply not needed for the PV tables.

Sounds like there is a prima facie case for a new issue or issues. It also seems clear that these quirks are about rendering an understood and agreed design intent correctly (i.e. they do not affect interop).

<prf>

If there are enough mistakes in or alternative interpretations of the state tables, then they don't really mean anything, and the "understood and agreed design intent" is just the text in the rest of the spec plus any implicitly assumed background of transaction theory and practice.

actually, I'm not sure one of the quirks doesn't affect interop, and could threaten transaction consistency - and your proposed change would make it worse. The state tables require the sending of Committed *before* the prepared log is deleted - it is normal in such protocols to require the delete to happen first.

take a Participant that is directly responsible for bound data - which means the prepared log will also be responsible for holding the prepared condition on that data. If that crashes in PreparedSuccess (i.e. with a log record), when the system recovers it will re-impose the prepared (e.g. locked) condition - this is precisely when the bottom-up recovery takes place. If the coordinator had in fact committed at the time of the crash, the coordinator will reply with a (repeat) Commit, the commit will be applied and all is good.

the same applies if the crash occurs after the (first) Commit has been received, but not applied (i.e. the log record is still there). Since the commit action wasn't complete, the Participant won't have sent Committed, and the Coordinator will still be in Committing - it will still have a log record, and all is good.

However, if Committed is sent *before* the commit is fully applied *including the logical deletion of the prepared log record*, then a crash just after sending Committed will leave the Participant in PreparedSuccess (the state it reverts to on recovery with a log record present) but the Coordinator receives the Committed, forgets and transits to None. On recovery the participant now sends Prepared/Replay, which is received by the Coordinator in state None - which returns a rollback semantic (either Rollback or UnknownTransaction, depending on another issue - the effect on the Participant is the same).

we've now got a participant that applies rollback when the transaction committed.

it might be possible to work round this by defining the Commit Decision event for a Participant View in state Committing as meaning "all bound data have been committed and won't revert to prepared in the event of failure" and that Forget is just "delete log record which mustn't have any effect on bound data", but that's getting rather hairy - and certainly is forcing implementations to work in a particular way.

In Gray and Reuter, 10.4.2.2, the sequence for a participant is:

    commit (tell resource managers to commit)

    complete (when all rm's have replied, write completion record [which I've phrased as "delete prepared log", out of old habit])

    acknowledge "When the completion record is durable, send acknowledgement .. to coordinator"

commit, acknowledge, complete is not safe.
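The difference between the two orderings can be shown with a small crash walk-through. This is a sketch under the premises of the scenario above: the coordinator has decided commit, it forgets the transaction as soon as it receives the acknowledgement, and a participant that recovers with its prepared log record replays Prepared and obeys the reply. The names are hypothetical.

```python
GRAY_REUTER = ["commit", "complete", "acknowledge"]
UNSAFE = ["commit", "acknowledge", "complete"]


def outcome_after_crash(order, crash_after):
    """Outcome the participant ends up with if it crashes after
    `crash_after` steps of `order` and then recovers."""
    done = order[:crash_after]
    log_present = "complete" not in done       # prepared log record survives
    coordinator_forgot = "acknowledge" in done
    if not log_present:
        return "commit"   # commit fully applied, nothing left to recover
    # Recovery with a log record: revert to PreparedSuccess, replay Prepared.
    # A coordinator in None answers with a rollback semantic; one still
    # in Committing repeats Commit.
    return "rollback" if coordinator_forgot else "commit"


def inconsistent_crash_points(order):
    """Crash positions at which a committed transaction is rolled back."""
    return [n for n in range(len(order) + 1)
            if outcome_after_crash(order, n) == "rollback"]


print(inconsistent_crash_points(GRAY_REUTER))
print(inconsistent_crash_points(UNSAFE))
```

In this model the only exposed window is the one between acknowledge and complete, which exists only in the second ordering.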

Unless someone can come up with some neat wording for the events that makes it clear we aren't allowing "commit - acknowledge - complete", this will have to be another issue (since there aren't any against the participant table as such at the moment)

Peter

(no further comments in the message)

</prf>

Surely the meaning of internal events shouldn't depend on the state when it occurs ?

I think that's right.

Peter

Alastair
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: 12 May 2006 15:26
To: Peter Furniss
Cc: Mark Little; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 041 - WS-AT: Invalid events should not cause defined transitions

Commit Decision in the PV means: the participant inquired of itself (of the "application" that is driven by the state machine) whether it was ready to prepare, rollback or send read-only. Commit Decision = the participant has answered its own enquiry with "yes, I want to prepare".

The least comprehensible in the PV is "All Forgotten" which means: I am going to send read-only and then forget this transaction. It might be easier to think of this as "Read Only Decision".

Alastair

Peter Furniss wrote:
Alastair asked 

"Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?"

I was imprecise - it wouldn't actually be sent as a result of the
internal event, but a currently illegal internal event (e.g. Rollback
Decision in PreparedSuccess state) would cause a transition to Aborting
(or a variant of Aborting), InconsistentInternalState would then be an
appropriate response to Commit. (Similarly for Commit Decision going to
Committing state [1])

However, that would be a heuristic decision and heuristic report, which
is out of charter.

Peter

([1]: what is Commit Decision supposed to mean for the participant side
- it surely can't mean the same thing when it occurs in Preparing as it
does in Committing. - this is really a 048 question)



-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com] 
Sent: 12 May 2006 10:27
To: Mark Little
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions

I think it's pretty obvious that these two messages are intended for the
same purpose (protocol error; non-conformant counterpart, all bets are
off).

I would like to see an explanation from the original author companies
for this duplication, and a proper argument that it is not redundancy. 
Without that it seems very clear the AT message should go.

I also agree that no state transition should follow a protocol error,
i.e. the approach taken in Peter's sparse + text solution is correct. 
Sparse versus verbose is a stylistic question.

Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?

Alastair

Mark Little wrote:
  
+1

Peter Furniss wrote:
    
I agree that distinguishing circumstances of faults is generally a 
good thing. Equally, one can also have too much of a good thing :-)
 
But the problem with InconsistentInternalState is that the definition
in the text doesn't correspond with the use in the state table.
The definition says it applies when the participant cannot fulfil its
obligations. That presumably would apply when a participant has
gone prepared but now cannot obey the Commit or Rollback it receives
(which sounds suspiciously like a heuristic warning, which would be
out of charter for this TC).
 
But the use in the state tables is that Participant sends it when it
receives contradictory messages from the coordinator - sending both
Rollback and Commit (in either order). That would seem to be no
different from any of the other InvalidState circumstances = "I am
receiving messages that should not happen in the state I am now in -
either you have sent a message you shouldn't have done or I've made a
state transition I shouldn't have done".
 
Receiving InvalidState should certainly cause an alert - but it's a 
pretty serious one, because someone isn't conformant - the parties 
aren't talking WS-AT any more.
 
InconsistentInternalState could be used in other circumstances,
aligned with its definition. It might even appear in the state table
- perhaps as action triggered from an internal event (which currently
appears as N/A, curiously).
 
 
Peter
 

------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 06 May 2006 01:42
*To:* ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 041 - WS-AT: Invalid events should not 
cause defined transitions

As a consumer of a fault, I would rather receive a more specific
fault such as InconsistentInternalState, since it offers more specific
information and helps distinguish it from other possible error states.
Specifically, upon receipt of an InconsistentInternalState fault, the
consumer may send an alert containing the specific cause, which is
otherwise not possible if it receives a more generic fault.

 

Why should this fault be removed?

 

------------------------------------------------------------------------

*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* Tuesday, March 28, 2006 10:25 AM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions

 

This is identified as WS-TX issue 041.

 

Please ensure follow-ups have a subject line starting "Issue 041 -
WS-AT: Invalid events should not cause defined transitions".

 

------------------------------------------------------------------------

*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Monday, March 27, 2006 1:33 PM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] New issue: WS-AT: Invalid events should not cause 
defined transitions

 

Issue name -- WS-AT: Invalid events should not cause defined 
transitions

 

PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
THE ISSUE IS ASSIGNED A NUMBER.

 

The issues coordinators will notify the list when that has occurred.

 

Target document and draft:

 

Protocol:  WS-AT

 

Artifact:  spec

 

Draft:

 

AT spec cd 1

 

Link to the document referenced:

 

http://www.oasis-open.org/committees/download.php/17311/wstx-wscoor-1.1-spec-cd-01.pdf

http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf


 

Section and PDF line number:

 

ws-at section 10, lines 503/505
 coordinator table: Committed/Active, Committed/Preparing
 participant table: Commit/Active, Commit/Preparing
ws-at: section 6.1, line 371

 

Issue type:

 

Design/Editorial

 


Related issues:

 


Issue Description:

 

The receipt of a message when the receiver is in a state such that 
the event cannot occur between correct implementations should not 
cause a state transition and allow the transaction to complete 
"successfully".

 

There is no need to distinguish "InvalidState" and 
"InconsistentInternalState".

 

Issue Details

 

Background

 

InvalidState is defined in WS-Coordination as being an unrecoverable
condition, and in all the cases where it is a defined response in
the WS-AT tables it can only occur if one of the implementations is
broken/bugged (apart from the volatile Prepared/None case, see
separate issue). Providing a defined state transition, as if the
circumstance were expected and could be recovered from, is
inappropriate. There can be no graceful completion of the protocol -
it has gone fundamentally wrong. This does not preclude an
implementation from attempting to tidy up and protecting its own
resources, but there should be no required state transition for the
implementation. The protocol exchange has gone off the map.

 

The use of InconsistentInternalState to distinguish two cases where 
an invalid event occurs is unnecessary (and the definition in line
371 does not align with the use in the table - it is probably the 
coordinator that has been sending wrong messages).
 

The use of InvalidState is appropriate in all cases.

 

Proposed resolution

 

The clearest solution would be to make invalid cells in the state 
tables empty, for the cells currently shown as InvalidState or 
InconsistentInternalState, and also for the N/A cells and explain 
this with text:

 

 "Where a cell is shown as empty
    - if the row is for an Inbound Event, a WS-C Invalid State fault
      should be returned. The subsequent behaviour of the implementation
      is undefined.
    - if the row is for an Internal Event, the event cannot occur in this
      state. A TM should view these occurrences as serious internal
      consistency issues."

 

Having invalid cells empty makes it significantly easier to read and 
check the state tables. It becomes much clearer that they are 
essentially "sparse" and the path through the table can be followed 
more easily.
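A minimal sketch of how an implementation might read such a sparse table, for illustration only. The single populated cell is the Commit/PreparedSuccess entry as it is described in discussion of this table; the event sets, the exception class and the `dispatch` function are hypothetical, and leaving the state unchanged after a fault is just one permissible choice, since the subsequent behaviour is undefined.

```python
INBOUND_EVENTS = {"Commit", "Rollback"}
INTERNAL_EVENTS = {"Commit Decision", "Rollback Decision", "All Forgotten"}

# Sparse table: only valid cells are present. One illustrative entry;
# every other (state, event) pair is an empty cell.
PARTICIPANT_TABLE = {
    ("PreparedSuccess", "Commit"): ("Initiate Commit Decision", "Committing"),
}


class InternalConsistencyError(Exception):
    """An internal event occurred in a state where it cannot occur."""


def dispatch(state, event):
    """Return (action, next_state) for a cell, applying the empty-cell rule."""
    cell = PARTICIPANT_TABLE.get((state, event))
    if cell is not None:
        return cell
    if event in INBOUND_EVENTS:
        # Empty cell, inbound row: return the WS-C Invalid State fault.
        # No transition is defined, so the state is simply left as it was.
        return ("Send InvalidState fault", state)
    raise InternalConsistencyError(event + " cannot occur in state " + state)


print(dispatch("PreparedSuccess", "Commit"))
print(dispatch("Active", "Commit"))
```

Here Commit received in Active (one of the cells this issue targets) yields the fault and no transition, rather than a defined path to completion.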

 

      
  

2006-05-15.Participant.definition.doc

References:
- RE: re issue 048 - state table (was RE: [ws-tx] Issue 041 - WS-AT: Invalid events should not cause defined transitions)
  - From: "Peter Furniss" <peter.furniss@erebor.co.uk>