I've been considering Peter's points below, questioning the PV state
table's meaning/correctness. Here's where I've got to: it's
complicated, and I may be missing something, so critical review would
be very welcome.
1. The model of the PV state table implicitly contains logical entities
in an overall participant: the PV state machine (SM), a recording
entity (e.g. transaction logging module) (RE), and an application
entity (AE) that carries out application work (details undefined) in
reaction to signals from the SM, prompted by messages from the
Coordinator arriving at the SM.
Defining such entities or roles makes it much easier to describe what
the PV actions and events mean. Attached Word doc picture helps explain
this vocabulary.
Note that AE may be, e.g., a two-phase aware RM or an interposition
participant/sub-coordinator: the details of its work are
encapsulated/irrelevant to the coordination protocol. This is a more
abstract, general way of thinking about Peter's "bound data"
(application state changes induced by the transaction protocol).
2. The operations used below are:
None = move to state None
SA = send Aborted to Coordinator
SC = send Committed to Coordinator
IR = send Initiate Rollback signal to AE
IR-return = get notification of AE rollback completion (good or bad)
IC/ICR = the same for commit (these are internal events Initiate
Commit Decision and Commit Decision for the Prepared Success/Committing
columns)
F = send start forgetting signal to RE, start logical log record
delete (equals action Forget)
F-return = receive forgotten signal from RE, log record deleted, or =
receive forget attempt failed from RE, log record delete failed (this
is my interpretation of event All Forgotten in columns Committing and
Aborting).
3. The fastest safe sequence in reaction to receipt of Commit by PV SM
is:
IC, ICR, None, SC, F.
The spec says
IC, ICR, SC, F, F-return, None.
4. The fastest safe sequence in reaction to receipt of Rollback by PV
SM is:
None, SA, IR, IR-return, F.
The spec (assuming certain plausible understandings) says:
IR, [no wait for IR-return], SA, F, F-return, None.
I think this is unsafe (a wait for IR-return is needed). Detailed
explanation of this at bottom of mail.
In neither case is it necessary to block on the attempt to log delete
(to wait for F-return) before sending the acknowledgement message to C.
This design assumption of the current PV table is correct (safe), if
less conservative than Gray/Reuter.
It follows, however, that the state transition to None can occur earlier
(after the IR-return or the F), and the entries in the row All
Forgotten, columns Aborting and Committing are unnecessary: this would
obviate the overload of the term All Forgotten which Peter pointed out.
I think this highlights a) a potential bug (need to block IR), b) a
deoptimization/over-complication (could move to None sooner), and c) a
definite need as per this issue, to clearly and unambiguously define
the various actions and events to ensure that they are understood
correctly. Defining and distinguishing the roles of SM, AE and RE would
greatly help.
Alastair
* * *
Detailed explanation of perceived bug. Critical examination very
much welcomed.
Premise: job of the PV machine is to ensure that the outcome decision
always reaches the AE irrespective of C or P failures.
Note: If action "Initiate Rollback" means "start rollback and wait for
the rollback operation to complete" then there is no bug. However, the
clear separation of Initiate Commit Decision and Commit Decision
(commit processing finished) in the commit case leads me to believe
that Initiate Rollback is not a misnomer, but an intentional name.
Currently the spec says:
IR, [no wait for IR-return], SA, F, F-return, None.
If rollback processing starts (but has not completed), Aborted is sent
to and received by the Coordinator, the log is deleted, and the
participant crashes; the AE will revert to its prepared state
(unwritten premise), and will be orphaned. C will never contact P
again, P will not recover, and the rollback work of AE will never have
been effected.
If we substitute:
None, SA, IR, IR-return, F.
then a crash between IR and IR-return will lead the AE to revert to its
prepared state, the SM to recover in Prepared Success, it will send
Prepared/Replay to the Coordinator, which in state None leads to C
sending Rollback to P, which leads to re-execution of rollback
processing, as expected.
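The two orderings can be compared with a small simulation. This is only
a sketch under the premises stated above: an unfinished rollback is lost
on a crash and the AE reverts to prepared; the SM recovers in Prepared
Success if and only if the log record survives; a Coordinator in state
None answers Prepared/Replay with Rollback. The names Participant and
crash_and_recover are invented for illustration, and F is assumed to
delete the log record at once.

```python
from dataclasses import dataclass

# Operation sequences as listed in this mail.
SPEC = ["IR", "SA", "F", "F-return", "None"]
PROPOSED = ["None", "SA", "IR", "IR-return", "F"]


@dataclass
class Participant:
    log_present: bool = True    # prepared log record held by the RE
    ae_state: str = "prepared"  # "prepared" or "rolled back"


def crash_and_recover(sequence, crash_after):
    """Run the operations, crash right after `crash_after`, then recover.

    Returns the final AE state.
    """
    p = Participant()
    for op in sequence:
        if op == "IR-return":
            p.ae_state = "rolled back"  # AE rollback work has completed
        elif op == "F":
            p.log_present = False       # assumption: delete is immediate
        # IR, SA, F-return and None leave nothing behind that survives a crash.
        if op == crash_after:
            break

    # Crash: any rollback that was started but not completed is lost, so
    # the AE is still "prepared" unless IR-return was reached.
    if p.ae_state == "prepared" and p.log_present:
        # SM recovers in Prepared Success and sends Prepared/Replay; the
        # Coordinator (already in None) answers Rollback, which is re-run.
        p.ae_state = "rolled back"
        p.log_present = False
    # With no log record the SM recovers in None and never contacts C,
    # so a still-prepared AE stays orphaned.
    return p.ae_state
```

In this model crash_and_recover(SPEC, "F") leaves the AE in "prepared"
(orphaned), while crash_and_recover(PROPOSED, "IR") ends in
"rolled back".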
Possible fixes
A) Introduce an explicit wait-for-return from the AE in the rollback
case. To do this we could circuit through the Aborting column, in the
same way that transit to Committing via ICR (Commit Decision) is used
to introduce blocking on the completion of AE work for the commit case.
This would
require introduction of a Rollback Completed event (parallel to Commit
Decision event).
B) We could call Initiate Rollback: "Perform Rollback", and we could
call Initiate Commit Decision: "Perform Commit", define them both as
blocking operations within the AE, and remove the Aborting and
Committing states, as I suggested might be possible.
Prepared Success: Commit -- action: Perform Commit, Send
Committed, Forget; transit state to None
Prepared Success: Rollback -- action: Send Aborted, Perform Rollback,
Forget; transit state to None.
If we define Forget as non-blocking: "initiate the process of
forgetting; initiate logical log delete, but do not wait for status"
then we retain the existing optimization (not requiring wait for log
delete before the SM communicates with the Coordinator, enabling it in
turn to close the overall transaction more quickly).
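As a rough sketch of fix B), the two Prepared Success cells could be
written as handlers like the ones below. The class and method names
(Participant, perform_commit, perform_rollback, start_forget) are
hypothetical, not taken from the spec; the stub AE and RE only record
the order of actions so that the sequence can be inspected.

```python
class Participant:
    """PV state machine reduced to the two Prepared Success cells of fix B)."""

    def __init__(self, ae, re, send):
        self.ae, self.re, self.send = ae, re, send
        self.state = "Prepared Success"

    def on_commit(self):
        self.ae.perform_commit()    # blocking: returns when AE work is done
        self.send("Committed")
        self.re.start_forget()      # non-blocking: do not wait for status
        self.state = "None"

    def on_rollback(self):
        self.send("Aborted")
        self.ae.perform_rollback()  # blocking: returns when AE work is done
        self.re.start_forget()      # non-blocking: do not wait for status
        self.state = "None"


class TraceAE:
    """Stub application entity that records what it was asked to do."""

    def __init__(self, trace):
        self.trace = trace

    def perform_commit(self):
        self.trace.append("Perform Commit")

    def perform_rollback(self):
        self.trace.append("Perform Rollback")


class TraceRE:
    """Stub recording entity; a real one would delete the log record."""

    def __init__(self, trace):
        self.trace = trace

    def start_forget(self):
        self.trace.append("Forget")


def run(event):
    """Deliver 'commit' or 'rollback' and return (action trace, final state)."""
    trace = []
    p = Participant(TraceAE(trace), TraceRE(trace),
                    lambda msg: trace.append("Send " + msg))
    getattr(p, "on_" + event)()
    return trace, p.state
```

Here run("commit") yields the actions Perform Commit, Send Committed,
Forget and run("rollback") yields Send Aborted, Perform Rollback,
Forget, both ending in state None with no Committing or Aborting state
in between.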
These two fixes cause the transaction only to finally complete (be
forgotten by the C) when all rollback work has been performed. The cost
is that the demarcating application will receive a delayed notification
of total completion. However, it can still receive a timely
notification of the outcome decision. In the absence of
heuristic reporting this is the likely point of completion
notification, so I see no real cost in these fixes.
C) Introduce a back call from the AE to the SM. If an AE recovers it
can communicate with SM (perhaps an event like "AE Recovered") and if
the state of SM is None (unknown transaction) then the SM can
reinstruct rollback. If the SM is still in Prepared Success then this
signal will lead SM to communicate the outcome to the AE in the normal
way. This assumes that AE and SM are capable of independent failure and
recovery. If they fail together then it is likely the log delete and AE
work were bound atomically (which is not prohibited by anything
discussed here).
Without such a back-call the spec defines a PV machine that does not
reliably deliver the outcome in all circumstances to its AE -- which is
its (unstated) purpose.
The disadvantage of this fix is that it assumes that the AE can take
the initiative. The entity we can assume recovers is the SM,
so I prefer approach A) or B).
Note i): I have assumed that outcome of the blocking operations Perform
Rollback | Commit (success or failure) is irrelevant (no heuristics),
though completion is relevant.
Note ii): I have assumed that failure to log delete is of no concern to
the protocol (creates an OOB notification need, related to the
potential for leaving safe garbage around).
Note iii): I can't find the words "presumed abort" anywhere in the AT
spec. Is that intentional?
Peter Furniss wrote:
Further interleaving (tagged <prf>)
Alastair Green sent:
That definition is sound for
Commit Decision in Preparing state, but Commit Decision is also valid
in the PV table for state Committing. On that occasion, we deduce from
the action (send Committed and Forget) it means "I have applied the
instructed commit action to my data/resources". (note also the action
on receiving Commit in PreparedSuccess state, which is "Initiate Commit
Decision".)
I think this is a mistake in the state table. The action for
Commit/Prepared Success should be Send Committed and Forget (no need to
indirect through Commit Decision).
<prf>
that could sort of work if the Participant is
representing just a leaf in the transaction and we aren't trying to
model the bound data (i.e. the resource that is subject to the outcome
of the transaction). But the Participant View has to deal with the
top-half of an interposed sub-coordinator as well - that must wait for
the grand-child participants to receive and respond to the commit
signal. (Actually the same will normally occur if the resource is, say,
a database supporting XA - the participant state entity has to relay
the commit semantic onwards)
</prf>
Your suggested definition
for "All Forgotten" is valid for its use in Active state, but it also
appears in Committing or Aborting. There it would mean "log record has
been (logically) deleted" - it's the only way to get back to None.
This is a related mistake. Forget equals logical delete of log
record. Therefore, if the P receives Commit in Prepared Success, it
should send Committed, forget its involvement, and move directly to
None. Committing (for the P) is a meaningless state. Once in None a
replay of Commit will cause Committed to be returned. A second Commit
received before the state transition to None has completed should be
queued until the state/action atom has completed.
I think the same applies to Rollback processing.
At first glance (haven't checked this thoroughly) it appears that the
columns Committing and Aborting (these states) are simply not needed
for the PV tables.
Sounds like there is a prima facie case for a new issue or issues. It
also seems clear that these quirks are about rendering an understood
and agreed design intent correctly (i.e. they do not affect interop).
<prf>
If there are enough mistakes in or alternative
interpretations of the state tables, then they don't really mean
anything, and the "understood and agreed design intent" is just the
text in the rest of the spec plus any implicitly assumed background
of transaction theory and practice.
actually, I'm not sure one of the quirks doesn't
affect interop, and could threaten transaction consistency - and your
proposed change would make it worse. The state tables require the
sending of Committed *before* the prepared log is deleted - it is
normal in such protocols to require the delete to happen first.
take a Participant that is directly responsible
for bound data - which means the prepared log will also be responsible
for holding the prepared condition on that data. If that crashes in
PreparedSuccess (i.e. with a log record), when the system recovers it
will re-impose the prepared (e.g. locked) condition - this is precisely
when the bottom-up recovery takes place. If the coordinator had in fact
committed at the time of the crash, the coordinator will reply with a
(repeat) Commit, the commit will be applied and all is good.
the same applies if the crash occurs after the
(first) Commit has been received, but not applied (i.e. the log record
is still there). Since the commit action wasn't complete, the
Participant won't have sent Committed, and the Coordinator will still
be in Committing - it will still have a log record, and all is good.
However, if Committed is sent *before* the
commit is fully applied *including the logical deletion of the prepared
log record*, then a crash just after sending Committed will leave the
Participant in PreparedSuccess (the state it reverts to on recovery
with a log record present) but the Coordinator receives the Committed,
forgets and transits to None. On recovery the participant now sends
Prepared/Replay, which is received by the Coordinator in state None -
which returns a rollback semantic (either Rollback or
UnknownTransaction, depending on another issue - the effect on the
Participant is the same).
we've now got a participant that applies
rollback when the transaction committed.
it might be possible to work round this
by defining the Commit Decision event for a Participant View in state
Committing as meaning "all bound data have been committed and won't
revert to prepared in the event of failure" and that Forget is just
"delete log record which mustn't have any effect on bound data", but
that's getting rather hairy - and certainly is forcing implementations
to work in a particular way.
(In Gray and Reuter, 10.4.2.2 the sequence for a participant
is:
commit (tell resource managers to commit)
complete (when all rm's have replied, write
completion record [which I've phrased as "delete prepared log", out of
old habit]
acknowledge "When the completion record is
durable, send acknowledgement .. to coordinator")
commit, acknowledge, complete is not safe.
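The difference between the two orderings can be checked with a small
model of the crash scenario described above. This is a sketch only: the
function name and step labels are invented, and it assumes that a
participant recovering with a log record reverts to PreparedSuccess and
sends Prepared/Replay, that the Coordinator moves to None as soon as it
receives Committed, and that once the log record is gone the commit is
durable.

```python
GRAY_REUTER = ["commit", "complete", "acknowledge"]
UNSAFE = ["commit", "acknowledge", "complete"]


def outcome_after_crash(order, crash_after):
    """Return the outcome the participant ends up applying after a crash."""
    coordinator = "Committing"  # Coordinator has decided commit, keeps its log
    log_present = True          # participant's prepared log record
    for step in order:
        if step == "acknowledge":
            coordinator = "None"  # Committed received, Coordinator forgets
        elif step == "complete":
            log_present = False   # prepared log record logically deleted
        if step == crash_after:
            break

    if not log_present:
        return "committed"        # assumption: commit is durable by now
    # Log record still present: revert to PreparedSuccess, send
    # Prepared/Replay and apply whatever the Coordinator answers.
    if coordinator == "Committing":
        return "committed"        # repeat Commit
    return "rolled back"          # Rollback or UnknownTransaction
```

Under these assumptions outcome_after_crash(UNSAFE, "acknowledge")
returns "rolled back" although the transaction committed, while every
crash point in GRAY_REUTER returns "committed".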
Unless someone can come up with some neat
wording for the events that makes it clear we aren't allowing "commit -
acknowledge - complete", this will have to be another issue (since
there aren't any against the participant table as such at the moment)
Peter
(no further comments in the message)
Surely the meaning of internal events
shouldn't depend on the state when it occurs ?
I think that's right.
Peter
Alastair
Commit Decision in the PV means: the participant inquired of itself (of
the "application" that is driven by the state machine) whether it was
ready to prepare, rollback or send read-only. Commit Decision = the
participant has answered its own enquiry with "yes, I want to prepare".
The least comprehensible in the PV is "All Forgotten" which means: I am
going to send read-only and then forget this transaction. It might be
easier to think of this as "Read Only Decision".
Alastair
Peter Furniss wrote:
Alastair asked
"Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?"
I was imprecise - it wouldn't actually be sent as a result of the
internal event, but a currently illegal internal event (e.g. Rollback
Decision in PreparedSuccess state) would cause a transition to Aborting
(or a variant of Aborting), InconsistentInternalState would then be an
appropriate response to Commit. (Similarly for Commit Decision going to
Committing state [1])
However, that would be a heuristic decision and heuristic report, which
is out of charter.
Peter
([1]: what is Commit Decision supposed to mean for the participant side
- it surely can't mean the same thing when it occurs in Preparing as it
does in Committing. - this is really a 048 question)
-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: 12 May 2006 10:27
To: Mark Little
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions
I think it's pretty obvious that these two messages are intended for the
same purpose (protocol error; non-conformant counterpart, all bets are
off).
I would like to see an explanation from the original author companies
for this duplication, and a proper argument that it is not redundancy.
Without that it seems very clear the AT message should go.
I also agree that no state transition should follow a protocol error,
i.e. the approach taken in Peter's sparse + text solution is correct.
Sparse versus verbose is a stylistic question.
Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?
Alastair
Mark Little wrote:
+1
Peter Furniss wrote:
I agree that distinguishing circumstances of faults is generally a
good thing. Equally, one can also have too much of a good thing :-)
But the problem with InconsistentInternalState is that the definition
in the text doesn't correspond with the use in the state table.
The definition says it's when the participant cannot fulfil its
obligations. That presumably would apply when a participant has
gone prepared but now cannot obey the Commit or Rollback it receives
(which sounds suspiciously like a heuristic warning which would be
out of charter for this TC).
But the use in the state tables is that Participant sends it when it
receives contradictory messages from the coordinator - sending both
Rollback and Commit (in either order). That would seem to be no
different from any of the other InvalidState circumstances = "I am
receiving messages that should not happen in the state I am now in -
either you have sent a message you shouldn't have done or I've made a
state transition I shouldn't have done".
Receiving InvalidState should certainly cause an alert - but it's a
pretty serious one, because someone isn't conformant - the parties
aren't talking WS-AT any more.
InconsistentInternalState could be used in other circumstances,
aligned with its definition. It might even appear in the state table
- perhaps as action triggered from an internal event (which currently
appears as N/A, curiously)
Peter
------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 06 May 2006 01:42
*To:* ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 041 - WS-AT: Invalid events should not
cause defined transitions
As a consumer of a fault, I would rather receive a more specific
fault such as InconsistentInternalState, since it offers more specific
information and helps distinguish from other possible error states.
Specifically, upon receipt of an InconsistentInternalState fault, the
consumer may send an alert containing the specific cause, which is
otherwise not possible, if it receives a more generic fault.
Why should this fault be removed?
------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* Tuesday, March 28, 2006 10:25 AM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions
This is identified as WS-TX issue 041.
Please ensure follow-ups have a subject line starting "Issue 041 -
WS-AT: Invalid events should not cause defined transitions".
------------------------------------------------------------------------
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Monday, March 27, 2006 1:33 PM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] New issue: WS-AT: Invalid events should not cause
defined transitions
Issue name -- WS-AT: Invalid events should not cause defined
transitions
PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
THE ISSUE IS ASSIGNED A NUMBER.
The issues coordinators will notify the list when that has occurred.
Target document and draft:
Protocol: WS-AT
Artifact: spec
Draft:
AT spec cd 1
Link to the document referenced:
http://www.oasis-open.org/committees/download.php/17311/wstx-wscoor-1.1-spec-cd-01.pdf
http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
Section and PDF line number:
ws-at section 10, lines 503/505
coordinator table: Committed/Active, Committed/Preparing
participant table: Commit/Active, Commit/Preparing
ws-at: section 6.1, line 371
Issue type:
Design/Editorial
Related issues:
Issue Description:
The receipt of a message when the receiver is in a state such that
the event cannot occur between correct implementations should not
cause a state transition and allow the transaction to complete
"successfully".
There is no need to distinguish "InvalidState" and
"InconsistentInternalState".
Issue Details
Background
InvalidState is defined in WS-Coordination as being an unrecoverable
condition, and in all the cases where it is a defined response in
the WS-AT tables can only occur if one of the implementations is
broken/bugged (apart from the volatile Prepared/None case, see
separate issue). Providing a defined state transition, as if the
circumstance were expected and could be recovered from is
inappropriate. There can be no graceful completion of the protocol -
it has gone fundamentally wrong. This does not preclude an
implementation from attempting to tidy up and protecting its own
resources, but there should be no required state transition for the
implementation. The protocol exchange has gone off the map.
The use of InconsistentInternalState to distinguish two cases where
an invalid event occurs is unnecessary (and the definition in line
371 does not align with the use in the table - it is probably the
coordinator that has been sending wrong messages).
The use of InvalidState is appropriate in all cases.
Proposed resolution
The clearest solution would be to make invalid cells in the state
tables empty, for the cells currently shown as InvalidState or
InconsistentInternalState, and also for the N/A cells and explain
this with text:
"Where a cell is shown as empty
- if the row is for an Inbound Event, a WS-C Invalid State fault
should be returned. The subsequent behaviour of the implementation is
undefined.
- if the row is for an Internal Event, the event cannot occur in this
state. A TM should view these occurrences as serious internal
consistency issues."
Having invalid cells empty makes it significantly easier to read and
check the state tables. It becomes much clearer that they are
essentially "sparse" and the path through the table can be followed
more easily.
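One way to picture the sparse form is a lookup keyed on (state, event)
in which a missing entry is the empty cell. The fragment below is an
illustrative sketch, not the WS-AT table: only two participant cells are
filled in, taken from cells mentioned in this thread, and the names
TABLE, INBOUND_EVENTS, dispatch and InternalConsistencyError are made
up.

```python
# Events that arrive as messages from the Coordinator (illustrative subset).
INBOUND_EVENTS = {"Commit", "Rollback"}

# Sparse table: only cells with a defined action are present.
TABLE = {
    ("PreparedSuccess", "Commit"): "Initiate Commit Decision",
    ("PreparedSuccess", "Rollback"): "Initiate Rollback",
}


class InternalConsistencyError(Exception):
    """An internal event occurred in a state where it cannot occur."""


def dispatch(state, event):
    """Return the action for (state, event), treating absence as an empty cell."""
    action = TABLE.get((state, event))
    if action is not None:
        return action
    if event in INBOUND_EVENTS:
        # Empty cell, inbound row: return the fault; no state transition
        # is defined and subsequent behaviour is left undefined.
        return "fault: InvalidState"
    # Empty cell, internal row: serious internal consistency issue.
    raise InternalConsistencyError(f"{event} cannot occur in state {state}")
```

With this fragment, dispatch("Active", "Commit") returns the
InvalidState fault without any transition, and an internal event such as
"Rollback Decision" in PreparedSuccess raises the internal error.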