Further interleaving (tagged <prf>)
Alastair Green sent:
That definition is sound for Commit Decision in
Preparing state, but Commit Decision is also valid in the PV table for state
Committing. On that occasion, we deduce from the action (send Committed and
Forget) it means "I have applied the instructed commit action to my
data/resources". (Note also the action on receiving Commit in PreparedSuccess
state, which is "Initiate Commit Decision".)
I think this is a mistake in the state table. The action for
Commit/Prepared Success should be Send Committed and Forget (no need to indirect
through Commit Decision).
<prf>
that could sort of work if the Participant is representing just a leaf in
the transaction and we aren't trying to model the bound data (i.e. the resource
that is subject to the outcome of the transaction). But the Participant View has
to deal with the top-half of an interposed sub-coordinator as well - that must
wait for the grand-child participants to receive and respond to the commit
signal. (Actually the same will normally occur if the resource is, say, a
database supporting XA - the participant state entity has to relay the commit
semantic onwards)
</prf>
Your suggested definition for "All Forgotten" is
valid for its use in Active state, but it also appears in Committing or
Aborting. There it would mean "log record has been (logically)
deleted" - it's the only way to get back to None.
This is a related mistake. Forget equals logical delete of log record.
Therefore, if the P receives Commit in Prepared Success, it should send
Committed, forget its involvement, and move directly to None. Committing (for
the P) is a meaningless state. Once in None a replay of Commit will cause
Committed to be returned. A second Commit received before the state transition
to None has completed should be queued until the state/action atom has
completed.
I think the same applies to Rollback processing.
At first glance (I haven't checked this thoroughly) it appears that the columns
Committing and Aborting (these states) are simply not needed for the PV tables.
Sounds like there is a prima facie case for a new issue or issues. It
also seems clear that these quirks are about rendering an understood and agreed
design intent correctly (i.e they do not affect interop).
<prf>
If there are enough mistakes in or alternative interpretations of the state tables,
then they don't really mean anything, and the "understood and agreed design
intent" is just the text in the rest of the spec plus any implicitly assumed
background of transaction theory and practice.
actually, I'm not sure one of the quirks doesn't affect interop, and
could threaten transaction consistency - and your proposed change would make it
worse. The state tables require the sending of Committed *before* the prepared
log is deleted - it is normal in such protocols to require the delete to happen
first.
Take a Participant that is directly responsible for bound data - which means the
prepared log will also be responsible for holding the prepared condition on that
data. If that crashes in PreparedSuccess (i.e. with a log record), when the
system recovers it will re-impose the prepared (e.g. locked) condition - this is
precisely when the bottom-up recovery takes place. If the coordinator had in
fact committed at the time of the crash, the coordinator will reply with a
(repeat) Commit, the commit will be applied and all is good.
The same applies if the crash occurs after the (first) Commit has been
received, but not applied (i.e. the log record is still there). Since the commit
action wasn't complete, the Participant won't have sent Committed, and the
Coordinator will still be in Committing - it will still have a log record, and
all is good.
However, if Committed is sent *before* the commit is fully applied
*including the logical deletion of the prepared log record*, then a crash just
after sending Committed will leave the Participant in PreparedSuccess (the state
it reverts to on recovery with a log record present) but the Coordinator
receives the Committed, forgets and transits to None. On recovery the
participant now sends Prepared/Replay, which is received by the Coordinator in
state None - which returns a rollback semantic (either Rollback or
UnknownTransaction, depending on another issue - the effect on the Participant
is the same).
We've now got a participant that applies rollback when the transaction committed.
It might be possible to work round this by defining the Commit Decision event
for a Participant View in state Committing as meaning "all bound data have been
committed and won't revert to prepared in the event of failure" and that Forget
is just "delete log record which mustn't have any effect on bound data", but
that's getting rather hairy - and certainly is forcing implementations to work
in a particular way.
(In Gray and Reuter, 10.4.2.2, the sequence for a participant is:
- commit (tell resource managers to commit)
- complete (when all rm's have replied, write completion record [which I've
phrased as "delete prepared log", out of old habit])
- acknowledge: "When the completion record is durable, send
acknowledgement .. to coordinator"
commit, acknowledge, complete is not safe.)
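The ordering hazard described above can be sketched as a small crash simulation. This is an illustrative model only, not taken from the WS-AT state tables: the function names, the step labels and the assumption that bound data revert to "prepared" whenever a prepared log record survives a crash are all hypothetical simplifications of the argument in this message.

```python
# Hypothetical model of the participant-side ordering argument.
# All names here are illustrative; none come from the WS-AT spec.

def coordinator_reply(coord_state):
    """Coordinator's answer to a replayed Prepared from a recovering participant."""
    if coord_state == "Committing":
        return "Commit"      # coordinator still has its log record: repeat Commit
    if coord_state == "None":
        return "Rollback"    # coordinator has forgotten: rollback semantic
    raise ValueError(coord_state)


def run(order, crash_after=None):
    """Run the participant steps in `order`, crash after `crash_after`,
    then recover. Returns the outcome finally applied to the bound data."""
    log_present = True          # prepared log record exists (PreparedSuccess)
    data = "prepared"
    coord_state = "Committing"  # coordinator has decided commit and sent Commit

    for step in order:
        if step == "commit":
            data = "committed"      # resource managers told to commit
        elif step == "complete":
            log_present = False     # prepared log record logically deleted
        elif step == "acknowledge":
            coord_state = "None"    # coordinator receives Committed and forgets
        if step == crash_after:
            break                   # crash: remaining steps never happen

    # Recovery: a surviving prepared log re-imposes the prepared condition
    # and triggers bottom-up recovery (replay of Prepared).
    if log_present:
        reply = coordinator_reply(coord_state)
        data = "committed" if reply == "Commit" else "rolled back"
    return data


SAFE = ["commit", "complete", "acknowledge"]
UNSAFE = ["commit", "acknowledge", "complete"]
```

Under these assumptions, `run(SAFE, s)` yields a committed outcome for a crash after any step, while `run(UNSAFE, "acknowledge")` yields a rollback of a transaction that the coordinator committed.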
Unless someone can come up with some neat wording for the events that makes it clear we
aren't allowing "commit - acknowledge - complete", this will have to be another
issue (since there aren't any against the participant table as such at the
moment)
Peter
(no further comments in the message)
Surely the meaning of internal events shouldn't
depend on the state when they occur?
I think that's right.
Peter
Alastair
Commit Decision in the PV means: the
participant inquired of itself (of the "application" that is driven by the
state machine) whether it was ready to prepare, rollback or send read-only.
Commit Decision = the participant has answered its own enquiry with "yes, I
want to prepare".
The least comprehensible in the PV is "All Forgotten"
which means: I am going to send read-only and then forget this transaction. It
might be easier to think of this as "Read Only
Decision".
Alastair
Peter Furniss wrote:
Alastair asked
"Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?"
I was imprecise - it wouldn't actually be sent as a result of the
internal event, but a currently illegal internal event (e.g. Rollback
Decision in PreparedSuccess state) would cause a transition to Aborting
(or a variant of Aborting), InconsistentInternalState would then be an
appropriate response to Commit. (Similarly for Commit Decision going to
Committing state [1])
However, that would be a heuristic decision and heuristic report, which
is out of charter.
Peter
([1]: what is Commit Decision supposed to mean for the participant side
- it surely can't mean the same thing when it occurs in Preparing as it
does in Committing. - this is really a 048 question)
-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: 12 May 2006 10:27
To: Mark Little
Cc: Peter Furniss; Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions
I think it's pretty obvious that these two messages are intended for the
same purpose (protocol error; non-conformant counterpart, all bets are
off).
I would like to see an explanation from the original author companies
for this duplication, and a proper argument that it is not redundancy.
Without that it seems very clear the AT message should go.
I also agree that no state transition should follow a protocol error,
i.e. the approach taken in Peter's sparse + text solution is correct.
Sparse versus verbose is a stylistic question.
Peter, I don't understand how a message like this could be used to
respond to internal events. To whom would it be delivered?
Alastair
Mark Little wrote:
+1
Peter Furniss wrote:
I agree that distinguishing circumstances of faults is generally a
good thing. Equally, one can also have too much of a good thing :-)
But the problem with InconsistentInternalState is that the definition
in the text doesn't correspond with the use in the state table.
Definition says it's when the participant cannot fulfil its
obligations. That presumably would apply when a participant has
gone prepared but now cannot obey the Commit or Rollback it receives
(which sounds suspiciously like a heuristic warning which would be
out of charter for this TC).
But the use in the state tables is that Participant sends it when it
receives contradictory messages from the coordinator - sending both
Rollback and Commit (in either order). That would seem to be no
different from any of the other InvalidState circumstances = "I am
receiving messages that should not happen in the state I am now in -
either you have sent a message you shouldn't have done or I've made a
state transition I shouldn't have done".
Receiving InvalidState should certainly cause an alert - but it's a
pretty serious one, because someone isn't conformant - the parties
aren't talking WS-AT any more.
InconsistentInternalState could be used in other circumstances,
aligned with its definition. It might even appear in the state table
- perhaps as action triggered from an internal event (which currently
appears as N/A, curiously)
Peter
------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* 06 May 2006 01:42
*To:* ws-tx@lists.oasis-open.org
*Subject:* RE: [ws-tx] Issue 041 - WS-AT: Invalid events should not
cause defined transitions
As a consumer of a fault, I would rather receive a more specific
fault such as InconsistentInternalState, since it offers more specific
information and helps distinguish from other possible error states.
Specifically, upon receipt of an InconsistentInternalState fault, the
consumer may send an alert containing the specific cause, which is
otherwise not possible, if it receives a more generic fault.
Why should this fault be removed?
------------------------------------------------------------------------
*From:* Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
*Sent:* Tuesday, March 28, 2006 10:25 AM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] Issue 041 - WS-AT: Invalid events should not cause
defined transitions
This is identified as WS-TX issue 041.
Please ensure follow-ups have a subject line starting "Issue 041 -
WS-AT: Invalid events should not cause defined transitions".
------------------------------------------------------------------------
*From:* Peter Furniss [mailto:peter.furniss@erebor.co.uk]
*Sent:* Monday, March 27, 2006 1:33 PM
*To:* ws-tx@lists.oasis-open.org
*Subject:* [ws-tx] New issue: WS-AT: Invalid events should not cause
defined transitions
Issue name -- WS-AT: Invalid events should not cause defined
transitions
PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
THE ISSUE IS ASSIGNED A NUMBER.
The issues coordinators will notify the list when that has occurred.
Target document and draft:
Protocol: WS-AT
Artifact: spec
Draft:
AT spec cd 1
Link to the document referenced:
http://www.oasis-open.org/committees/download.php/17311/wstx-wscoor-1.1-spec-cd-01.pdf
http://www.oasis-open.org/committees/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
Section and PDF line number:
ws-at section 10, lines 503/505
coordinator table: Committed/Active, Committed/Preparing
participant table: Commit/Active, Commit/Preparing
ws-at: section 6.1, line 371
Issue type:
Design/Editorial
Related issues:
Issue Description:
The receipt of a message when the receiver is in a state such that
the event cannot occur between correct implementations should not
cause a state transition and allow the transaction to complete
"successfully".
There is no need to distinguish "InvalidState" and
"InconsistentInternalState".
Issue Details
Background
InvalidState is defined in WS-Coordination as being an unrecoverable
condition, and in all the cases where it is a defined response in
the WS-AT tables can only occur if one of the implementations is
broken/bugged (apart from the volatile Prepared/None case, see
separate issue). Providing a defined state transition, as if the
circumstance were expected and could be recovered from is
inappropriate. There can be no graceful completion of the protocol -
it has gone fundamentally wrong. This does not preclude an
implementation from attempting to tidy up and protecting its own
resources, but there should be no required state transition for the
implementation. The protocol exchange has gone off the map.
The use of InconsistentInternalState to distinguish two cases where
an invalid event occurs is unnecessary (and the definition in line
371 does not align with the use in the table - it is probably the
coordinator that has been sending wrong messages).
The use of InvalidState is appropriate in all cases.
Proposed resolution
The clearest solution would be to make invalid cells in the state
tables empty, for the cells currently shown as InvalidState or
InconsistentInternalState, and also for the N/A cells and explain
this with text:
"Where a cell is shown as empty
- if the row is for an Inbound Event, a WS-C Invalid State fault
should be returned. The subsequent behaviour of the implementation is
undefined.
- if the row is for an Internal Event, the event cannot occur in this
state. A TM should view these occurrences as serious internal
consistency issues."
Having invalid cells empty makes it significantly easier to read and
check the state tables. It becomes much clearer that they are
essentially "sparse" and the path through the table can be followed
more easily.
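A minimal sketch of how a sparse table of this kind might be looked up, assuming a dictionary keyed by (state, event) that holds only the valid cells. The two cells shown are the ones mentioned in the discussion of the participant table; the next-state values, the event sets and the exception names are assumptions made for illustration and are not copied from the WS-AT spec.

```python
# Hypothetical sparse state table: only valid cells are stored.
# Exception names, event sets and next states are illustrative assumptions.

class InvalidStateFault(Exception):
    """Stands in for the WS-C Invalid State fault returned to the sender."""


class InternalConsistencyError(Exception):
    """An internal event occurred in a state where it cannot occur."""


INBOUND_EVENTS = {"Prepare", "Commit", "Rollback"}
INTERNAL_EVENTS = {"Commit Decision", "Rollback Decision"}

# (state, event) -> (action, next state); empty cells are simply absent.
PARTICIPANT_TABLE = {
    ("PreparedSuccess", "Commit"): ("Initiate Commit Decision", "Committing"),
    ("Committing", "Commit Decision"): ("Send Committed; Forget", "None"),
}


def dispatch(state, event):
    """Return (action, next_state) for a valid cell; otherwise apply the
    empty-cell rule from the proposed text."""
    cell = PARTICIPANT_TABLE.get((state, event))
    if cell is not None:
        return cell
    if event in INBOUND_EVENTS:
        # Empty cell, inbound event: fault, no defined transition.
        raise InvalidStateFault(f"{event} received in state {state}")
    if event in INTERNAL_EVENTS:
        # Empty cell, internal event: serious internal consistency issue.
        raise InternalConsistencyError(f"{event} cannot occur in state {state}")
    raise ValueError(f"unknown event {event!r}")
```

With this shape, `dispatch("Active", "Commit")` raises the fault without naming any next state, which matches the intent that behaviour after an invalid event is undefined.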