
ws-tx message



Subject: Re: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors )


Ram, Peter:

The comment:
"Clearly, there are assumptions and internal events that are not currently captured in the state tables, because they are hard to express in a state table."
is an argument for finding a way of stating the assumptions and missing internal events. That is why this issue was raised: to propose that the spec contain definitions of all the events and actions used in the state tables. A combination of such definitions and (possibly modified) state tables should be able to produce an unambiguous and complete specification of behaviours.

At the inaugural meeting of the TC the input authors clearly stated that the tables were intended to be normative. If they have holes or omissions, relevant to the design intent, then it is our job to fill in the gaps. If the tables are illustrative and incomplete then implementing the specifications becomes impossible without "insider information" (unwritten rules and knowledge residing somewhere in or between the original author companies). And there is no guarantee that one company's inside understanding is in fact the same as another's.

The difference of interpretation relating to "lazy log delete" exposed in the discussion on Unknown Transaction/None is a classic example of what happens when terms are not defined, and meanings are assumed.

Defining the actions and events is a good starting point for establishing whether and where new rows or columns need to be added (or indeed, old ones subtracted).

Beyond these general points, I agree with Peter that the volatile/durable area is particularly underspecified.

Alastair

Peter Furniss wrote:
Ram,

Are you sure about the sentence "The state table attempts to portray the
high-level flow and is illustrative."?  Surely you aren't saying the
state tables are not normative, but are just examples like
WS-Coordination section 3 ?

State tables are a powerful means of expressing normative requirements
in a protocol like WS-AT. Obviously they need to be bug free. (Other
formalisms may be more powerful still, but tend not to be generally
intelligible - or perhaps I just like state tables :-) ) It would seem a
shame to drop them to an informative level. Even then it would be
essential to sort out and make clear what the states and events
represent, and what the coverage is (e.g. are they meant to cover the
interactions of the different protocols in ws-at).

Contrary to your first sentence, I don't think the assumptions and
internal events are necessarily hard to express in a state table, but
the table does have to be structured right. I still think this is most
easily done by separating the "B-coordinator" (coordinator view over
all participants in one transaction) and the "C-coordinator"
(relationship to one participant) states, as in the proposal I put in on
issue 039.  It may be possible to rework the current coordinator table,
changing entries and adding new states and events as needed, but I
suspect the event definitions will end up more complicated and the whole
harder to understand.  But the present tables, I think we agree,
need changing to some extent at least.


Peter



-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com] 
Sent: 19 May 2006 02:50
To: Peter Furniss; ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
Replay message generates protocol errors )

Peter,

Clearly, there are assumptions and internal events that are not
currently captured in the state tables, because they are hard to express
in a state table. The state table attempts to portray the high-level
flow and is illustrative.

I believe that the tables adequately describe most of the significant
transitions, except perhaps for a few bugs (like the All
Forgotten/Active cell) which you pointed out earlier.

-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
Sent: Thursday, May 18, 2006 1:59 AM
To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
Replay message generates protocol errors )

Ram,

Comments interleaved, with prefix PRF: 

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: 18 May 2006 02:14
To: Peter Furniss; ws-tx@lists.oasis-open.org
Subject: RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
message generates protocol errors )

Peter,

  
But that doesn't help sort out (either way) whether it concerns one
participant, multiple participants that are registered for one
transaction, or all the participants that are registered for all
transactions. All would be subject to events "local to a site" (though
the last would definitely not need to be dealt with as such in this spec)

One way to think about this is to consider internal events as being
generated by a TM core for each transaction, but these events are
applied on each individual participant  state machine (on a
per-participant basis), as illustrated in the CV and PV state tables.
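
This reading can be pictured as a TM core that holds one state machine per participant and applies each internal event to all of them. The following is only a minimal sketch in Python: the class names `TMCore` and `ParticipantView` are hypothetical, and the single transition modelled (User Commit in Active sends Prepare and moves to Preparing) is an assumption taken from this discussion rather than a transcription of the CV table.

```python
class ParticipantView:
    """Coordinator-view state machine for one participant (sketch only)."""

    def __init__(self, name):
        self.name = name
        self.state = "Active"
        self.sent = []  # outbound messages, recorded for illustration

    def apply(self, event):
        # Only one transition is modelled; other events are left unhandled.
        if self.state == "Active" and event == "User Commit":
            self.sent.append("Prepare")
            self.state = "Preparing"


class TMCore:
    """Hypothetical TM core: raises internal events once per transaction."""

    def __init__(self, participant_names):
        self.views = [ParticipantView(n) for n in participant_names]

    def raise_internal(self, event):
        # The same internal event is applied on a per-participant basis.
        for view in self.views:
            view.apply(event)


if __name__ == "__main__":
    core = TMCore(["p1", "p2"])
    core.raise_internal("User Commit")
    for view in core.views:
        print(view.name, view.state, view.sent)
```

In this sketch every participant machine sees every internal event, which works for an event such as User Commit but says nothing about events that concern only one participant.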

PRF: I think it is likely that was the general meaning - that internal
events were "B-coordinator" events. It works for some things, but gets
into complications for the separate volatile and durable preparing
waves.  Comms Times Out would be individual.

  
If the states are independent for each participant, how does the
coordinator view state for a participant that went readonly ever get out
of Active ?

Yes, the "All Forgotten" row for "Active" column should be "None"
instead of "Active".

PRF: no that won't work either - User Commit, which would apply to this
machine too, would then send Prepare to that participant.  And if the
ReadOnly is in response to prepare, we will send Commit to the
participant.


Peter


-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
Sent: Thursday, May 11, 2006 12:14 AM
To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: Re: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
message generates protocol errors )

Ram,

(I've changed the subject line, since I don't think this is really about
Replay specifically, but more generally about the state tables. I could
have made it issue 036 instead.)


I'm not certain how to read your first sentence, which is the key one.
That something is "local to a site" I take to mean that it is internal
to the particular installation of some software, which is certainly
true. But that doesn't help sort out (either way) whether it concerns
one participant, multiple participants that are registered for one
transaction, or all the participants that are registered for all
transactions.  All would be subject to events "local to a site" (though
the last would definitely not need to be dealt with as such in this spec)


If the states are independent for each participant, how does the
coordinator view state for a participant that went readonly ever get out
of Active ? 

On your understanding of the state tables, do they mandate the behaviour
required in interop scenario 3.1 ? That has a volatile participant going
prepared, then a durable sending Aborted, causing the coordinator to
send Rollback to the volatile. What event occurred on the coordinator
view for the volatile that caused it to send rollback ?  Why (from the
tables) was the Prepare to the durable delayed until Prepared was
received from the volatile ?
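
For reference, the scenario behaviour itself is easy to state with a transaction-wide object, which is what makes the questions above pointed. The sketch below is Python and is not derived from the state tables: the class `TransactionCoordinator` and its method names are hypothetical, the commit decision is not modelled, and the rule "Prepare the durable participants only after all volatile ones are prepared" is an assumption read from the scenario description.

```python
class TransactionCoordinator:
    """Transaction-wide view over all participants (sketch only)."""

    def __init__(self, volatile, durable):
        self.volatile = list(volatile)
        self.durable = list(durable)
        self.prepared = set()
        self.outcome = None
        self.sent = []  # (participant, message) pairs, in send order

    def _send(self, participant, message):
        self.sent.append((participant, message))

    def user_commit(self):
        # First wave: volatile participants only (durable if there are none).
        for p in (self.volatile or self.durable):
            self._send(p, "Prepare")

    def on_prepared(self, participant):
        if self.outcome is not None or participant in self.prepared:
            return
        self.prepared.add(participant)
        volatile_done = all(v in self.prepared for v in self.volatile)
        if participant in self.volatile and volatile_done:
            # Second wave: the durable Prepare was held back until now.
            for d in self.durable:
                self._send(d, "Prepare")

    def on_aborted(self, participant):
        if self.outcome is not None:
            return
        self.outcome = "Aborted"
        # Everyone except the sender of Aborted is rolled back.
        for other in self.volatile + self.durable:
            if other != participant:
                self._send(other, "Rollback")


if __name__ == "__main__":
    c = TransactionCoordinator(volatile=["V"], durable=["D"])
    c.user_commit()
    c.on_prepared("V")
    c.on_aborted("D")
    print(c.sent)
```

Here the Rollback to the volatile participant is triggered by an event on a different participant, which a table describing a single partner has no obvious way to express.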

Peter

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: 11 May 2006 03:17
To: Peter Furniss; ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
protocol errors 

Peter,

Internal events, because they are internal, are local to a site, and do
not imply states relating to one or more participants. Inbound events,
as represented in the state table, describe the coordinator state
transitions with respect to a single participant.

-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
Sent: Saturday, May 06, 2006 4:27 AM
To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
protocol errors 

I think it is likely the state table is being misinterpreted. I'm not
sure by who :-)

If you treat the state as referring to just one participant, you either
get some very convoluted definitions of the internal events (c.f. issue
048 - but more convoluted than the ones proposed there) or you violate
atomicity.

Receiving a 'Prepared' message doesn't move the state to PreparedSuccess
- that's done by "Commit Decision", and until then 'Replay' would cause
an abort. You could define "Commit Decision" as meaning "receipt of ok
vote for just this one participant", and take the state for this
participant to PreparedSuccess. But the only way to leave
PreparedSuccess is from "WriteDone" or "WriteFailed". Since an 'Aborted'
from another participant should certainly cause this participant to be
rolled back, that 'Aborted' will have to trigger "WriteFailed", which is
not an obvious interpretation.


But I think this issue, with 053 (eliminate Replay) is more about
whether Replay need ever force an abort. We may be looking at a
carry-over from connection-centric protocols, where it made sense to
force an abort if the connection broke before commit-time. In those
worlds (more or less all transaction protocols that weren't using xml
and/or web-services, I think), receipt of a recovery message before the
connection was observed to break could only mean the connection break
was about to happen. But with WS-AT (especially because we have said all
messages go on the underlying request) there is no connection to be
monitored anyway. The coordinator hasn't noticed that the participant was
out of communication for a while, and now the participant says it is
ready for the commit. Why *require* the coordinator to abort ?

Of course that's not to say the coordinator cannot *choose* to abort by
implementation option if replay is received (or any other circumstance
that leads the coordinator to suspect a failure somewhere). It can
always do that if it hasn't progressed too far - it would appear in the
tables as a User Rollback or Write Failed. 


Peter

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: 06 May 2006 02:09
To: ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
protocol errors 

Section 10 (AT specification) states "These tables present the view of a
coordinator or participant with respect to a single partner".  Thus, the
coordinator states correspond to interactions with a single participant.

The receipt of a participant vote "PreparedSuccess" triggers the
coordinator state to "PreparedSuccess" with respect to that particular
participant, even though the coordinator may not have completed the
prepare phase for the rest of the participants.

Is it possible that the state table is being misinterpreted?

-----Original Message-----
From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
Sent: Thursday, April 06, 2006 10:50 AM
To: ws-tx@lists.oasis-open.org
Subject: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol
errors 

This is identified as WS-TX issue 052.

Please ensure follow-ups have a subject line starting "Issue 052 -
WS-AT: Replay message generates protocol errors ".

-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: Wednesday, April 05, 2006 5:07 PM
To: ws-tx@lists.oasis-open.org
Subject: [ws-tx] New Issue: WS-AT: Replay message generates protocol
errors 

Issue name -- WS-AT: Replay message generates protocol errors

PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
THE ISSUE IS ASSIGNED A NUMBER.

The issues coordinators will notify the list when that has occurred.

Target document and draft:

Protocol:  WS-AT

Artifact:  spec

Draft:

WS-AT CD 0.1 uploaded

Link to the document referenced:

http://www.oasis-open.org/apps/org/workgroup/ws-tx/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf

Section and PDF line number:

Coordinator View State Table, after l. 503


Issue type:

Design


Related issues:

New issue: WS-AT: Eliminate Replay message. 
New issue: WS-AT: Is logging mandatory?


Issue Description:

Replay reactions defined in current CV state table will cause
unnecessary transaction aborts.
 

Issue Details:

The cells in row (Inbound Messages) Replay, columns (States) Active and
Preparing read:

Active: Send Rollback --> Aborting
Preparing: Send Rollback --> Aborting

Replay message means: "play it again Sam", not "demolish the piano".

Case A. If the last thing they sent was Prepared, and it got through
(we're Preparing and we've recorded their vote), and they've recovered,
and they're waiting for a Commit or a Rollback, then we need to Ignore
the Replay (just like if they send it when we've done our own
housekeeping, and moved to Prepared Success).

Case B. If the message didn't get through, and we've processed User
Commit then we could be in the Preparing state, but have no record of
their vote. In that case we'd have to replay Prepare to indicate to
them, send us your vote again.

Case C. If the last thing we received was Register, and we haven't
processed User Commit, then we're still Active and they won't have
logged. Replay won't happen on crash recovery (no log record to recover
off), but it could be used to say to the coordinator "Are you still
there? Should I crap out?" (i.e., because of impatience). We can't stop
them using Replay in that fashion. Our only sensible response would have
to be: silence (we don't have a blank ack to a ping), i.e. to Ignore. 
There is no harm in them doing this, even though it is pointless. You
could argue that this should be a N/A but that seems heavy-handed.
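
The three cases can be sketched as a single handler, assuming the coordinator could tell whether a vote had been recorded. This is a Python illustration only: the function `on_replay` and the flag `vote_recorded` are hypothetical, and states other than Active and Preparing are deliberately not covered.

```python
def on_replay(state, vote_recorded):
    """Return (action, next_state) for an inbound Replay (sketch only).

    `vote_recorded` is a hypothetical flag saying whether this
    participant's Prepared vote has already been recorded.
    """
    if state == "Preparing":
        if vote_recorded:
            # Case A: their vote got through, so nothing needs repeating.
            return ("Ignore", "Preparing")
        # Case B: no record of their vote, so ask for it again.
        return ("Resend Prepare", "Preparing")
    if state == "Active":
        # Case C: nothing logged on their side; stay silent.
        return ("Ignore", "Active")
    raise ValueError(f"state not covered by this sketch: {state}")


if __name__ == "__main__":
    print(on_replay("Preparing", vote_recorded=True))
    print(on_replay("Preparing", vote_recorded=False))
    print(on_replay("Active", vote_recorded=False))
```

None of the three branches leads to Aborting.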


Proposed Resolution:

As the state tables do not differentiate between Preparing/no vote
recorded and Preparing/vote recorded, it seems easiest to always resend
Prepare in the Preparing state. Therefore:

Replace the current text in the cells in row (Inbound Messages) Replay,
columns (States) Active and Preparing with:

Active: Ignore --> Active
Preparing: Resend Prepare --> Preparing
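
As an illustration, the change can be written as a lookup from (state, inbound message) to (action, next state). This Python sketch covers only the two Replay cells quoted in this issue; the dictionary names and the `react` helper are hypothetical and the rest of the CV table is not modelled.

```python
# Keys: (state, inbound message); values: (action, next state).
# Only the Replay cells for Active and Preparing are included.
CURRENT_CV = {
    ("Active", "Replay"): ("Send Rollback", "Aborting"),
    ("Preparing", "Replay"): ("Send Rollback", "Aborting"),
}

PROPOSED_CV = {
    ("Active", "Replay"): ("Ignore", "Active"),
    ("Preparing", "Replay"): ("Resend Prepare", "Preparing"),
}


def react(table, state, message):
    """Look up the action and next state for an inbound message."""
    return table[(state, message)]


if __name__ == "__main__":
    for state in ("Active", "Preparing"):
        print(state,
              "current:", react(CURRENT_CV, state, "Replay"),
              "proposed:", react(PROPOSED_CV, state, "Replay"))
```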


  

