- Register and RegisterResponse rows are removed (resolution of 039, agreed on
  the last conf call)
- Prepared column in CV removed (resolution of 062, Dublin)
- for Prepared/None, Volatile will send Unknown Transaction fault (Durable
  unchanged - Send Rollback) (from end of discussion on 039, Dublin)
And we are agreed (as the text said) that the tables are for a single partner,
with independent states but (some of) the internal events occurring for all
partners.
However, there are several problems with the state tables which cause them to
be misaligned with the text and our intent. A strict implementation of the
state tables would not deliver the correct external behaviour. The following
sets out the problems and proposes solutions. I believe they reflect what
implementations are in fact doing - this is an alignment with reality and a
statement of what was meant all along.
Problem A) Coordinator: receiving Aborted from one Participant when in Active
or Preparing should cause the coordinator to initiate rollback to all. But the
state table just causes that one CV to initiate action Forget and transit to
Aborting. The only internal event in the CV table which would send Rollback to
the other participants at that point is Expires Times Out (this used not to be
the case, since User Rollback did too). The Participant table does have an
action "Initiate Rollback", which is what we need.
Proposal 1: add an internal event to the CV table, "Initiate Rollback", and add
this as an action in the Aborted/Active and Aborted/Preparing cells. The
internal event Initiate Rollback causes Send Rollback and a transition to
Aborting in the Active and Preparing states; it is ignored in Aborting and
can't happen in Committing.
(It is an interpretation rule that the actions and state transitions of a cell
complete before any initiated internal events occur - consequently the CV state
for the participant from which Aborted was received will be in Aborting when
the Initiate Rollback is actioned, and will not be sent Rollback, which aligns
with the state diagram.)
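As an illustration of Proposal 1 and the interpretation rule, here is a minimal
sketch, not taken from the tables themselves. The class name Coordinator, the
pending queue and the sent list are hypothetical; only the states, the Aborted
message and the Initiate Rollback event come from the text above. The Forget
action is not modelled.

```python
from collections import deque


class Coordinator:
    """Hypothetical coordinator holding one CV state per participant."""

    def __init__(self, participants):
        self.state = {p: "Active" for p in participants}
        self.sent = []          # (participant, message) pairs, for inspection
        self.pending = deque()  # internal events initiated by cells

    def receive_aborted(self, sender):
        # Aborted/Active and Aborted/Preparing: this CV goes to Aborting
        # and initiates the node-wide internal event.
        if self.state[sender] in ("Active", "Preparing"):
            self.state[sender] = "Aborting"
            self.pending.append("Initiate Rollback")
        # The cell has completed; only now are internal events actioned.
        self._drain()

    def _drain(self):
        while self.pending:
            event = self.pending.popleft()
            if event == "Initiate Rollback":
                for participant, state in self.state.items():
                    if state in ("Active", "Preparing"):
                        self.sent.append((participant, "Rollback"))
                        self.state[participant] = "Aborting"
                    # Aborting: ignored. Committing: can't happen.


coordinator = Coordinator(["P1", "P2", "P3"])
coordinator.receive_aborted("P2")
print(coordinator.sent)
```

Because P2 is already in Aborting when Initiate Rollback is actioned, only P1
and P3 are sent Rollback.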
A stylistic change would then be to make Expires Times Out use Initiate
Rollback, rather than do the work for itself (since Expires Times Out is
"node-wide", it could be left as it is, but it's generally neater to reuse
something that is already there).
Proposal 2: change the Expires Times Out/Active and Expires Times Out/Preparing
cells to action Initiate Rollback, staying in the same state.
Problem B) Coordinator: receiving ReadOnly from a Participant should end the
protocol exchanges. However, the table currently has the state staying in
Active. Consequently, if User Commit or User Rollback are received, the table
says Prepare or Rollback would be sent. ReadOnly needs to move the state to one
where User Commit and User Rollback are ignored. (This was mentioned but not
finalised in Dublin.)
A related problem is that the two User actions are not permitted in Aborting -
which is correct since we would not allow duplicate User Rollback, but
incorrect if the transition to Aborting was caused by the spontaneous arrival
of Aborted in Active state.
Proposal 3: Add a state "Forgetting", which is entered by all cells that have
Forget as an action. In Forgetting, all incoming messages have action Ignore,
except for Prepared; User Commit, User Rollback, Initiate Rollback and Expires
Times Out are all ignored. All Forgotten causes a transition to None. Receiving
Prepared in Forgetting has action (re)Send Rollback.
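The Forgetting column of Proposal 3 can be sketched as a single function. This
is only an illustration: the function name in_forgetting and the
(actions, next_state) return shape are assumptions, and it reflects Proposal 3
alone, before the changes to All Forgotten in Proposals 5 and 6.

```python
FORGETTING = "Forgetting"


def in_forgetting(event):
    """Return (actions, next_state) for an event arriving in Forgetting."""
    if event == "Prepared":
        # The one incoming message that is not ignored.
        return (["Send Rollback"], FORGETTING)
    if event == "All Forgotten":
        return ([], "None")
    # Every other incoming message, plus User Commit, User Rollback,
    # Initiate Rollback and Expires Times Out: ignore and stay put.
    return ([], FORGETTING)


for event in ("Prepared", "User Commit", "Initiate Rollback", "All Forgotten"):
    print(event, in_forgetting(event))
```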
Problem C) The Prepared/Aborting cell should not initiate Forget (nor transit
to Forgetting) - a participant that has already gone prepared should not be
forgotten until we get a response to the Rollback. Admittedly, since we are
using presumed abort that's not entirely true, and we could forget about
everyone as soon as rollbacks are sent. But that isn't the way the rest of the
table was or is working (e.g. User Rollback/Active has "Send Rollback", not
"Send Rollback, Forget") - so this is really an inconsistency, not a problem.
Proposal 4: Make the action in Prepared/Aborting just Resend Rollback, and keep
the state in Aborting.
Problem D) All Forgotten and transition to None. There are three rather
different kinds of event that would seem to correspond to transition to state
None:
a) the knowledge of this particular participant has gone as a result of the
   Forget action, but others may remain
b) the knowledge of all participants has gone as a result of Forget for all of
   them
c) the knowledge of this participant (and some others) has gone as a result of
   a crash (when there was no persistent information for it)
The "All Forgotten" action is presumably named for event b), or perhaps
including knowledge loss after crash, but actually it's a) and c) we should be
interested in. b) is a bit tricky because of the volatile/durable distinction.
By separating out the Forgetting state as in proposal 3, a) will now only occur
from state Forgetting, and we needn't be concerned about what is happening to
other state machines (to do so would be to impose on implementation, anyway).
For event c), the behaviour should be different for durable and volatile.
If we have separate events for the deliberate completion of Forget, and the
unintended loss of information in a crash, we can capture the logging
requirement without imposing at all on how it is actually implemented.
Proposal 5: Add a new internal event "Forget Completed", permitted only in
state Forgetting, where it causes a transition to None.
Proposal 6: Change the "All Forgotten" row to have transitions to None from
all states except Committing, where it is disallowed for Durable, but
transitions to None for Volatile.
(Note that All Forgotten is an event imposed on an implementation rather than
done on purpose.)
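A sketch of how Proposals 5 and 6 separate the two events, under stated
assumptions: the function names, the boolean durable flag and the use of
ValueError for a disallowed cell are all hypothetical, chosen only to make the
distinction concrete.

```python
def forget_completed(state):
    """Deliberate completion of Forget: permitted only in Forgetting."""
    if state != "Forgetting":
        raise ValueError(f"Forget Completed is not permitted in {state}")
    return "None"


def all_forgotten(state, durable):
    """Unintended loss of knowledge, e.g. after a crash."""
    if state == "Committing" and durable:
        # A Durable partner in Committing must not lose this information.
        raise ValueError("All Forgotten is disallowed in Committing for Durable")
    return "None"


print(forget_completed("Forgetting"))
print(all_forgotten("Committing", durable=False))
```

Expressed this way, the logging requirement shows up only as the disallowed
Durable/Committing case, without saying how an implementation meets it.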
Attached is a version of the table with these changes, colour marked to
distinguish them. I've followed my "sparse" convention - invalid or n/a cells
are empty, staying in the same state is represented by *, and if there is no
action and just a *, "Ignore" is implicit (these make it a lot easier to work
on the table - re-inflating it back to the explicit form is just editorial).
The pale yellow cells are affected by other issues that I'm not dealing with
here.
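To show that re-inflating the sparse form is mechanical, here is a small
sketch. The cell encoding is an assumption made for the example: None stands
for an empty cell, and any other cell is an (action, next_state) pair in which
an empty action or a "*" next state uses the sparse shorthand.

```python
def inflate(cell, current_state):
    """Expand one sparse cell into an explicit (action, next_state) pair."""
    if cell is None:
        # Empty cell: invalid or n/a, so there is nothing to expand.
        return None
    action, next_state = cell
    if next_state == "*":
        next_state = current_state  # * means stay in the same state
    if not action:
        action = "Ignore"           # no action is an implicit Ignore
    return (action, next_state)


print(inflate(("", "*"), "Aborting"))
print(inflate(("Send Rollback", "*"), "Forgetting"))
```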
I've deliberately not gone into detail of what the internal events mean, as
that is also another issue. I think this version of the tables will allow
self-consistent definitions, though some of them are a bit tangled in relation
to the volatile/durable distinction - some of the internal events are different
things for volatile and durable partners but, given that, the events are
independent of what state the state machine is in.
Peter