OASIS Mailing List Archives

ws-tx message



Subject: Re: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors )


+1

Alastair Green wrote:
> Ram, Peter:
>
> The comment:
> "Clearly, there are assumptions and internal events that are not currently captured in the state tables, because they are hard to express in a state table."
>   
> is an argument for finding a way of stating the assumptions and 
> missing internal events. That is why this issue was raised: to propose 
> that the spec contain definitions of all the events and actions used 
> in the state tables. A combination of such definitions and (possibly 
> modified) state tables should be able to produce an unambiguous and 
> complete specification of behaviours.
>
> At the inaugural meeting of the TC the input authors clearly stated 
> that the tables were intended to be normative. If they have holes or 
> omissions, relevant to the design intent, then it is our job to fill 
> in the gaps. If the tables are illustrative and incomplete then 
> implementing the specifications becomes impossible without "insider 
> information" (unwritten rules and knowledge residing somewhere in or 
> between the original author companies). And there is no guarantee that 
> one company's inside understanding is in fact the same as another's.
>
> The difference of interpretation relating to "lazy log delete" exposed 
> in the discussion on Unknown Transaction/None is a classic example of 
> what happens when terms are not defined, and meanings are assumed.
>
> Defining the actions and events is a good starting point for 
> establishing whether and where new rows or columns need to be added 
> (or indeed, old ones subtracted).
>
> Beyond these general points, I agree with Peter that the 
> volatile/durable area is particularly underspecified.
>
> Alastair
>
> Peter Furniss wrote:
>> Ram,
>>
>> Are you sure about the sentence "The state table attempts to portray the
>> high-level flow and is illustrative."?  Surely you aren't saying the
>> state tables are not normative, but are just examples like
>> WS-Coordination section 3 ?
>>
>> State tables are a powerful means of expressing normative requirements
>> in a protocol like WS-AT. Obviously they need to be bug free. (other
>> formalisms may be more powerful still, but tend not to be generally
>> intelligible - or perhaps I just like state tables :-)  It would seem a
>> shame to drop them to an informative level. Even then it would be
>> essential to sort out and make clear what the states and events
>> represent, and what the coverage is (e.g. are they meant to cover the
>> interactions of the different protocols in ws-at)
>>
>> Contrary to your first sentence, I don't think the assumptions and
>> internal events are necessarily hard to express in a state table, but
>> the table does have to be structured right. I still think this is most
>> easily done by separating the "B-coordinator" (coordinator view over
>> all participants in one transaction) and the "C-coordinator"
>> (relationship to one participant) states, as in the proposal I put in on
>> issue 039.  It may be possible to rework the current coordinator table,
>> changing entries and adding new states and events as needed, but I
>> suspect the event definitions will end up more complicated and the whole
>> harder to understand.  But the present tables, I think we agree,
>> need changing to some extent at least.
>>
>>
>> Peter
>>
>>
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com] 
>> Sent: 19 May 2006 02:50
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
>> Replay message generates protocol errors )
>>
>> Peter,
>>
>> Clearly, there are assumptions and internal events that are not
>> currently captured in the state tables, because they are hard to express
>> in a state table. The state table attempts to portray the high-level
>> flow and is illustrative.
>>
>> I believe that the tables adequately describe most of the significant
>> transitions, except perhaps for a few bugs (like the All
>> Forgotten/Active cell) which you pointed out earlier.
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Thursday, May 18, 2006 1:59 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
>> Replay message generates protocol errors )
>>
>> Ram,
>>
>> Comments interleaved, with prefix PRF: 
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 18 May 2006 02:14
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
>> message generates protocol errors )
>>
>> Peter,
>>
>>> But that doesn't help sort out (either way) whether it concerns one
>>> participant, multiple participants that are registered for one
>>> transaction, or all the participants that are registered for all
>>> transactions. All would be subject to events "local to a site" (though
>>> the last would definitely not need to be dealt with as such in this spec)
>>
>> One way to think about this is to consider internal events as being
>> generated by a TM core for each transaction, but these events are
>> applied on each individual participant  state machine (on a
>> per-participant basis), as illustrated in the CV and PV state tables.
>>
>> PRF: I think it is likely that was the general meaning - that internal
>> events were "B-coordinator" events. It works for some things, but gets
>> into complications for the separate volatile and durable preparing
>> waves.  Comms Times Out would be individual.
>>
>>> If the states are independent for each participant, how does the
>>> coordinator view state for a participant that went readonly ever get out
>>> of Active ?
>>
>> Yes, the "All Forgotten" row for "Active" column should be "None"
>> instead of "Active".
>>
>> PRF: no that won't work either - User Commit, which would apply to this
>> machine too, would then send Prepare to that participant.  And if the
>> ReadOnly is in response to prepare, we will send Commit to the
>> participant.
>>
>>
>> Peter
>>
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Thursday, May 11, 2006 12:14 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: Re: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
>> message generates protocol errors )
>>
>> Ram,
>>
>> (I've changed the subject line, since I don't think this is really about
>> Replay specifically, but more generally about the state tables. I could
>> have made it issue 036 instead.)
>>
>>
>> I'm not certain how to read your first sentence, which is the key one.
>> That something is "local to a site" I take to mean that it is internal
>> to the particular installation of some software, which is certainly
>> true. But that doesn't help sort out (either way) whether it concerns
>> one participant, multiple participants that are registered for one
>> transaction, or all the participants that are registered for all
>> transactions.  All would be subject to events "local to a site" (though
>> the last would definitely not need to be dealt with as such in this spec)
>>
>>
>> If the states are independent for each participant, how does the
>> coordinator view state for a participant that went readonly ever get out
>> of Active ? 
>>
>> On your understanding of the state tables, do they mandate the behaviour
>> required in interop scenario 3.1 ? That has a volatile participant going
>> prepared, then a durable sending Aborted, causing the coordinator to
>> send Rollback to the volatile. What event occurred on the coordinator
>> view for the volatile that caused it to send rollback ?  Why (from the
>> tables) was the Prepare to the durable delayed until Prepared was
>> received from the volatile ?
>>
>> Peter
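
The questions about interop scenario 3.1 above can be made concrete with a small sketch. It assumes a transaction-wide coordinator (roughly the "B-coordinator" idea) that prepares volatile participants before durable ones and, on an Aborted vote, sends Rollback to every other participant still involved. This is only an illustration of that reading, not the behaviour mandated by the state tables: the names Participant and run_commit, the scripted votes, and the simplified state labels are all hypothetical, and Prepare is sent one participant at a time purely to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    name: str
    protocol: str   # "volatile" or "durable"
    vote: str       # scripted reply to Prepare: "Prepared", "ReadOnly" or "Aborted"
    state: str = "Active"   # simplified label, not the spec's full state set
    sent: list = field(default_factory=list)  # messages the coordinator sent


def run_commit(participants):
    """Process User Commit for one transaction; return "Committed" or "Aborted"."""
    # Volatile wave first, then durable wave (assumed ordering).
    for wave in ("volatile", "durable"):
        for p in [q for q in participants if q.protocol == wave]:
            p.sent.append("Prepare")
            p.state = "Preparing"
            if p.vote == "Aborted":
                p.state = "Forgotten"
                # Transaction-wide decision: roll back everyone else still involved.
                for other in participants:
                    if other is not p and other.state in ("Active", "Preparing", "Prepared"):
                        other.sent.append("Rollback")
                        other.state = "Aborting"
                return "Aborted"
            # ReadOnly voters drop out; Prepared voters wait for the outcome.
            p.state = "Prepared" if p.vote == "Prepared" else "Forgotten"
    for p in participants:
        if p.state == "Prepared":
            p.sent.append("Commit")
            p.state = "Committing"
    return "Committed"


volatile = Participant("v1", "volatile", "Prepared")
durable = Participant("d1", "durable", "Aborted")
outcome = run_commit([durable, volatile])
print(outcome, volatile.sent, durable.sent)
```

In this sketch the Rollback to the volatile participant is triggered by a decision taken at transaction level, which is the kind of event the per-participant coordinator view table does not obviously name.
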
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 11 May 2006 03:17
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors 
>>
>> Peter,
>>
>> Internal events, because they are internal, are local to a site, and do
>> not imply states relating to one or more participants. Inbound events,
>> as represented in the state table, describe the coordinator state
>> transitions with respect to a single participant.
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Saturday, May 06, 2006 4:27 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors 
>>
>> I think it is likely the state table is being misinterpreted. I'm not
>> sure by who :-)
>>
>> If you treat the state as referring to just one participant, you either
>> get some very convoluted definitions of the internal events (c.f. issue
>> 048 - but more convoluted than the ones proposed there) or you violate
>> atomicity.
>>
>> Receiving a 'Prepared' message doesn't move the state to PreparedSuccess
>> - that's done by "Commit Decision", and until then 'Replay' would cause
>> an abort. You could define "Commit Decision" as meaning "receipt of ok
>> vote for just this one participant", and take the state for this
>> participant to PreparedSuccess. But the only way to leave
>> PreparedSuccess is from "WriteDone" or "WriteFailed". Since an 'Aborted'
>> from another participant should certainly cause this participant to be
>> rolled back, that 'Aborted' will have to trigger "WriteFailed", which is
>> not an obvious interpretation.
>>
>>
>> But I think this issue, with 053 (eliminate Replay) is more about
>> whether Replay need ever force an abort. We may be looking at a
>> carry-over from connection-centric protocols, where it made sense to
>> force an abort if the connection broke before commit-time. In those
>> worlds (more or less all transaction protocols that weren't using xml
>> and/or web-services, I think), receipt of a recovery message before the
>> connection was observed to break could only mean the connection break
>> was about to happen. But with WS-AT (especially because we have said all
>> messages go on the underlying request) there is no connection to be
>> monitored anyway. The coordinator hasn't noticed that participant was
>> out of communication for a while, and now the participant says it is
>> ready for the commit. Why *require* the coordinator to abort ?
>>
>> Of course that's not to say the coordinator cannot *choose* to abort by
>> implementation option if replay is received (or any other circumstance
>> that leads the coordinator to suspect a failure somewhere). It can
>> always do that if it hasn't progressed too far - it would appear in the
>> tables as a User Rollback or Write Failed. 
>>
>>
>> Peter
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 06 May 2006 02:09
>> To: ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors 
>>
>> Section 10 (AT specification) states "These tables present the view of a
>> coordinator or participant with respect to a single partner".  Thus, the
>> coordinator states correspond to interactions with a single participant.
>>
>> The receipt of a participant vote "PreparedSuccess" triggers the
>> coordinator state to "PreparedSuccess" with respect to that particular
>> participant, even though the coordinator may not have completed the
>> prepare phase for the rest of the participants.
>>
>> Is it possible that the state table is being misinterpreted?
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: Thursday, April 06, 2006 10:50 AM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol
>> errors 
>>
>> This is identified as WS-TX issue 052.
>>
>> Please ensure follow-ups have a subject line starting "Issue 052 -
>> WS-AT: Replay message generates protocol errors ".
>>
>> -----Original Message-----
>> From: Alastair Green [mailto:alastair.green@choreology.com]
>> Sent: Wednesday, April 05, 2006 5:07 PM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] New Issue: WS-AT: Replay message generates protocol
>> errors 
>>
>> Issue name -- WS-AT: Replay message generates protocol errors
>>
>> PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
>> THE ISSUE IS ASSIGNED A NUMBER.
>>
>> The issues coordinators will notify the list when that has occurred.
>>
>> Target document and draft:
>>
>> Protocol:  WS-AT
>>
>> Artifact:  spec
>>
>> Draft:
>>
>> WS-AT CD 0.1 uploaded
>>
>> Link to the document referenced:
>>
>> http://www.oasis-open.org/apps/org/workgroup/ws-tx/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
>>
>> Section and PDF line number:
>>
>> Coordinator View State Table, after l. 503
>>
>>
>> Issue type:
>>
>> Design
>>
>>
>> Related issues:
>>
>> New issue: WS-AT: Eliminate Replay message. 
>> New issue: WS-AT: Is logging mandatory?
>>
>>
>> Issue Description:
>>
>> Replay reactions defined in current CV state table will cause
>> unnecessary transaction aborts.
>>  
>>
>> Issue Details:
>>
>> The cells in row (Inbound Messages) Replay, columns (States) Active and
>> Preparing read:
>>
>> Active: Send Rollback --> Aborting
>> Preparing: Send Rollback --> Aborting
>>
>> Replay message means: "play it again Sam", not "demolish the piano".
>>
>> Case A. If the last thing they sent was Prepared, and it got through
>> (we're Preparing and we've recorded their vote), and they've recovered,
>> and they're waiting for a Commit or a Rollback, then we need to Ignore
>> the Replay (just like if they send it when we've done our own
>> housekeeping, and moved to Prepared Success).
>>
>> Case B. If the message didn't get through, and we've processed User
>> Commit then we could be in the Preparing state, but have no record of
>> their vote. In that case we'd have to replay Prepare to indicate to
>> them, send us your vote again.
>>
>> Case C. If the last thing we received was Register, and we haven't
>> processed User Commit, then we're still Active and they won't have
>> logged. Replay won't happen on crash recovery (no log record to recover
>> off), but it could be used to say to the coordinator "Are you still
>> there? Should I crap out?" (i.e., because of impatience). We can't stop
>> them using Replay in that fashion. Our only sensible response would have
>> to be: silence (we don't have a blank ack to a ping), i.e. to Ignore. 
>> There is no harm in them doing this, even though it is pointless. You
>> could argue that this should be a N/A but that seems heavy-handed.
>>
>>
>> Proposed Resolution:
>>
>> As the state tables do not differentiate between Preparing/no vote
>> recorded and Preparing/vote recorded, it seems easiest to always resend
>> Prepare in the Preparing state. Therefore:
>>
>> Replace the current text in the cells in row (Inbound Messages) Replay,
>> columns (States) Active and Preparing with:
>>
>> Active: Ignore --> Active
>> Preparing: Resend Prepare --> Preparing
>>
>>
>>   
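
The change proposed at the end of the original issue can be sketched as a lookup table for the Replay row of the coordinator view. This is a minimal illustration rather than text from the specification: the cell contents are taken from the issue as quoted above, while the names State, CURRENT_REPLAY_ROW, PROPOSED_REPLAY_ROW, on_replay and causes_abort are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "Active"
    PREPARING = "Preparing"
    ABORTING = "Aborting"


# Replay row as quoted in the issue: state -> (action, next state).
CURRENT_REPLAY_ROW = {
    State.ACTIVE: ("Send Rollback", State.ABORTING),
    State.PREPARING: ("Send Rollback", State.ABORTING),
}

# Replay row after the proposed resolution.
PROPOSED_REPLAY_ROW = {
    State.ACTIVE: ("Ignore", State.ACTIVE),
    State.PREPARING: ("Resend Prepare", State.PREPARING),
}


def on_replay(row, state):
    """Return (action, next_state) for a Replay received in `state`."""
    return row[state]


def causes_abort(row, state):
    """True if receiving Replay in `state` moves the coordinator to Aborting."""
    _, next_state = on_replay(row, state)
    return next_state is State.ABORTING


for state in (State.ACTIVE, State.PREPARING):
    print(state.value, on_replay(CURRENT_REPLAY_ROW, state), on_replay(PROPOSED_REPLAY_ROW, state))
```

Because the table has a single Preparing state, the proposed row cannot distinguish Case A (vote recorded) from Case B (no vote recorded), which is why it always resends Prepare there.
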

