Subject: Re: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol errors )
+1

Alastair Green wrote:
> Ram, Peter:
>
> The comment:
> "Clearly, there are assumptions and internal events that are not
> currently captured in the state tables, because they are hard to
> express in a state table."
>
> is an argument for finding a way of stating the assumptions and
> missing internal events. That is why this issue was raised: to propose
> that the spec contain definitions of all the events and actions used
> in the state tables. A combination of such definitions and (possibly
> modified) state tables should be able to produce unambiguous and
> complete specification of behaviours.
>
> At the inaugural meeting of the TC the input authors clearly stated
> that the tables were intended to be normative. If they have holes or
> omissions, relevant to the design intent, then it is our job to fill
> in the gaps. If the tables are illustrative and incomplete then
> implementing the specifications becomes impossible without "insider
> information" (unwritten rules and knowledge residing somewhere in or
> between the original author companies). And there is no guarantee that
> one company's inside understanding is in fact the same as another's.
>
> The difference of interpretation relating to "lazy log delete" exposed
> in the discussion on Unknown Transaction/None is a classic example of
> what happens when terms are not defined, and meanings are assumed.
>
> Defining the actions and events is a good starting point for
> establishing whether and where new rows or columns need to be added
> (or indeed, old ones subtracted).
>
> Beyond these general points, I agree with Peter that the
> volatile/durable area is particularly underspecified.
>
> Alastair
>
> Peter Furniss wrote:
>> Ram,
>>
>> Are you sure about the sentence "The state table attempts to portray the
>> high-level flow and is illustrative."? Surely you aren't saying the
>> state tables are not normative, but are just examples like
>> WS-Coordination section 3 ?
>>
>> State tables are a powerful means of expressing normative requirements
>> in a protocol like WS-AT. Obviously they need to be bug free. (other
>> formalisms may be more powerful still, but tend not to be generally
>> intelligible - or perhaps I just like state tables :-) It would seem a
>> shame to drop them to an informative level. Even then it would be
>> essential to sort out and make clear what the states and events
>> represent, and what the coverage is (e.g. are they meant to cover the
>> interactions of the different protocols in ws-at)
>>
>> Contrary to your first sentence, I don't think the assumption and
>> internal events are necessarily hard to express in a state table, but
>> the table does have to be structured right. I still think this is most
>> easily done by separating the "B-coordinator" (coordinator view over
>> all participants in one transaction) and the "C-coordinator"
>> (relationship to one participant) states, as in the proposal I put in on
>> issue 039. It may be possible to rework the current coordinator table,
>> changing entries and adding new states and events as needed, but I
>> suspect the event definitions will end up more complicated and the whole
>> harder to understand. But the present tables, I think we agree,
>> need changing to some extent at least.
>>
>> Peter
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 19 May 2006 02:50
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
>> Replay message generates protocol errors )
>>
>> Peter,
>>
>> Clearly, there are assumptions and internal events that are not
>> currently captured in the state tables, because they are hard to express
>> in a state table. The state table attempts to portray the high-level
>> flow and is illustrative.
>>
>> I believe that the tables adequately describe most of the significant
>> transitions, except perhaps for a few bugs (like the All
>> Forgotten/Active cell) which you pointed out earlier.
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Thursday, May 18, 2006 1:59 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT:
>> Replay message generates protocol errors )
>>
>> Ram,
>>
>> Comments interleaved, with prefix PRF:
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 18 May 2006 02:14
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
>> message generates protocol errors )
>>
>> Peter,
>>
>>> But that doesn't help sort out (either way) whether it concerns one
>>> participant, multiple participants that are registered for one
>>> transaction, or all the participants that are registered for all
>>> transactions. All would be subject to events "local to a site" (though
>>> the last would definitely not need to be dealt with as such in this spec)
>>
>> One way to think about this is to consider internal events as being
>> generated by a TM core for each transaction, but these events are
>> applied on each individual participant state machine (on a
>> per-participant basis), as illustrated in the CV and PV state tables.
>>
>> PRF: I think it is likely that was the general meaning - that internal
>> events were "B-coordinator" events. It works for some things, but gets
>> into complications for the separate volatile and durable preparing
>> waves. Comms Times Out would be individual.
>>
>>> If the states are independent for each participant, how does the
>>> coordinator view state for a participant that went readonly ever get out
>>> of Active ?
>>
>> Yes, the "All Forgotten" row for "Active" column should be "None"
>> instead of "Active".
>>
>> PRF: no that won't work either - User Commit, which would apply to this
>> machine too, would then send Prepare to that participant. And if the
>> ReadOnly is in response to prepare, we will send Commit to the
>> participant.
>>
>> Peter
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Thursday, May 11, 2006 12:14 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: Re: Issue 048 (was RE: [ws-tx] Issue 052 - WS-AT: Replay
>> message generates protocol errors )
>>
>> Ram,
>>
>> (I've changed the subject line, since I don't think this is really about
>> Replay specifically, but more generally about the state tables. I could
>> have made it issue 036 instead.)
>>
>> I'm not certain how to read your first sentence, which is the key one.
>> That something is "local to a site" I take to mean that it is internal
>> to the particular installation of some software, which is certainly
>> true. But that doesn't help sort out (either way) whether it concerns
>> one participant, multiple participants that are registered for one
>> transaction, or all the participants that are registered for all
>> transactions. All would be subject to events "local to a site" (though
>> the last would definitely not need to be dealt with as such in this spec)
>>
>> If the states are independent for each participant, how does the
>> coordinator view state for a participant that went readonly ever get out
>> of Active ?
>>
>> On your understanding of the state tables, do they mandate the behaviour
>> required in interop scenario 3.1 ? That has a volatile participant going
>> prepared, then a durable sending Aborted, causing the coordinator to
>> send Rollback to the volatile. What event occurred on the coordinator
>> view for the volatile that caused it to send rollback ?
>> Why (from the tables) was the Prepare to the durable delayed until
>> Prepared was received from the volatile ?
>>
>> Peter
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 11 May 2006 03:17
>> To: Peter Furniss; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors
>>
>> Peter,
>>
>> Internal events, because they are internal, are local to a site, and do
>> not imply states relating to one or more participants. Inbound events,
>> as represented in the state table, describe the coordinator state
>> transitions with respect to a single participant.
>>
>> -----Original Message-----
>> From: Peter Furniss [mailto:peter.furniss@erebor.co.uk]
>> Sent: Saturday, May 06, 2006 4:27 AM
>> To: Ram Jeyaraman; ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors
>>
>> I think it is likely the state table is being misinterpreted. I'm not
>> sure by who :-)
>>
>> If you treat the state as referring to just one participant, you either
>> get some very convoluted definitions of the internal events (c.f. issue
>> 048 - but more convoluted than the ones proposed there) or you violate
>> atomicity.
>>
>> Receiving a 'Prepared' message doesn't move the state to PreparedSuccess
>> - that's done by "Commit Decision", and until then 'Replay' would cause
>> an abort. You could define "Commit Decision" as meaning "receipt of ok
>> vote for just this one participant", and take the state for this
>> participant to PreparedSuccess. But the only way to leave
>> PreparedSuccess is from "WriteDone" or "WriteFailed". Since an 'Aborted'
>> from another participant should certainly cause this participant to be
>> rolled back, that 'Aborted' will have to trigger "WriteFailed", which is
>> not an obvious interpretation.
>>
>> But I think this issue, with 053 (eliminate Replay) is more about
>> whether Replay need ever force an abort. We may be looking at a
>> carry-over from connection-centric protocols, where it made sense to
>> force an abort if the connection broke before commit-time. In those
>> worlds (more or less all transaction protocols that weren't using xml
>> and/or web-services, I think), receipt of a recovery message before the
>> connection was observed to break could only mean the connection break
>> was about to happen. But with WS-AT (especially because we have said all
>> messages go on the underlying request) there is no connection to be
>> monitored anyway. The coordinator hasn't noticed that participant was
>> out of communication for a while, and now the participant says it is
>> ready for the commit. Why *require* the coordinator to abort ?
>>
>> Of course that's not to say the coordinator cannot *choose* to abort by
>> implementation option if replay is received (or any other circumstance
>> that leads the coordinator to suspect a failure somewhere). It can
>> always do that if it hasn't progressed too far - it would appear in the
>> tables as a User Rollback or Write Failed.
>>
>> Peter
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: 06 May 2006 02:09
>> To: ws-tx@lists.oasis-open.org
>> Subject: RE: [ws-tx] Issue 052 - WS-AT: Replay message generates
>> protocol errors
>>
>> Section 10 (AT specification) states "These tables present the view of a
>> coordinator or participant with respect to a single partner". Thus, the
>> coordinator states correspond to interactions with a single participant.
>>
>> The receipt of a participant vote "PreparedSuccess" triggers the
>> coordinator state to "PreparedSuccess" with respect to that particular
>> participant, even though the coordinator may not have completed the
>> prepare phase for the rest of the participants.
>>
>> Is it possible that the state table is being misinterpreted?
>>
>> -----Original Message-----
>> From: Ram Jeyaraman [mailto:Ram.Jeyaraman@microsoft.com]
>> Sent: Thursday, April 06, 2006 10:50 AM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] Issue 052 - WS-AT: Replay message generates protocol
>> errors
>>
>> This is identified as WS-TX issue 052.
>>
>> Please ensure follow-ups have a subject line starting "Issue 052 -
>> WS-AT: Replay message generates protocol errors ".
>>
>> -----Original Message-----
>> From: Alastair Green [mailto:alastair.green@choreology.com]
>> Sent: Wednesday, April 05, 2006 5:07 PM
>> To: ws-tx@lists.oasis-open.org
>> Subject: [ws-tx] New Issue: WS-AT: Replay message generates protocol
>> errors
>>
>> Issue name -- WS-AT: Replay message generates protocol errors
>>
>> PLEASE DO NOT REPLY TO THIS EMAIL OR START A DISCUSSION THREAD UNTIL
>> THE ISSUE IS ASSIGNED A NUMBER.
>>
>> The issues coordinators will notify the list when that has occurred.
>>
>> Target document and draft:
>>
>> Protocol: WS-AT
>>
>> Artifact: spec
>>
>> Draft:
>>
>> WS-AT CD 0.1 uploaded
>>
>> Link to the document referenced:
>>
>> http://www.oasis-open.org/apps/org/workgroup/ws-tx/download.php/17325/wstx-wsat-1.1-spec-cd-01.pdf
>>
>> Section and PDF line number:
>>
>> Coordinator View State Table, after l. 503
>>
>> Issue type:
>>
>> Design
>>
>> Related issues:
>>
>> New issue: WS-AT: Eliminate Replay message.
>> New issue: WS-AT: Is logging mandatory?
>>
>> Issue Description:
>>
>> Replay reactions defined in current CV state table will cause
>> unnecessary transaction aborts.
>>
>> Issue Details:
>>
>> The cells in row (Inbound Messages) Replay, columns (States) Active and
>> Preparing read:
>>
>> Active: Send Rollback --> Aborting
>> Preparing: Send Rollback --> Aborting
>>
>> Replay message means: "play it again Sam", not "demolish the piano".
>>
>> Case A.
>> If the last thing they sent was Prepared, and it got through
>> (we're Preparing and we've recorded their vote), and they've recovered,
>> and they're waiting for a Commit or a Rollback, then we need to Ignore
>> the Replay (just like if they send it when we've done our own
>> housekeeping, and moved to Prepared Success).
>>
>> Case B. If the message didn't get through, and we've processed User
>> Commit then we could be in the Preparing state, but have no record of
>> their vote. In that case we'd have to replay Prepare to indicate to
>> them, send us your vote again.
>>
>> Case C. If the last thing we received was Register, and we haven't
>> processed User Commit, then we're still Active and they won't have
>> logged. Replay won't happen on crash recovery (no log record to recover
>> off), but it could be used to say to the coordinator "Are you still
>> there? Should I crap out?" (i.e., because of impatience). We can't stop
>> them using Replay in that fashion. Our only sensible response would have
>> to be: silence (we don't have a blank ack to a ping), i.e. to Ignore.
>> There is no harm in them doing this, even though it is pointless. You
>> could argue that this should be a N/A but that seems heavy-handed.
>>
>> Proposed Resolution:
>>
>> As the state tables do not differentiate between Preparing/no vote
>> recorded and Preparing/vote recorded, it seems easiest to always resend
>> Prepare in the Preparing state. Therefore:
>>
>> Replace the current text in the cells in row (Inbound Messages) Replay,
>> columns (States) Active and Preparing with:
>>
>> Active: Ignore --> Active
>> Preparing: Resend Prepare --> Preparing
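
Editorial illustration: the two Replay cells discussed in the issue can be written out as a small lookup table, which makes the difference between the current and the proposed behaviour easy to compare. This is only a sketch in Python, not part of WS-AT or of any implementation; the names `State`, `CURRENT_REPLAY_ROW`, `PROPOSED_REPLAY_ROW` and `on_replay` are invented here, and only the Active and Preparing columns quoted in the issue text are modelled.

```python
from enum import Enum


class State(Enum):
    # Only the coordinator-view states named in the issue text are listed.
    ACTIVE = "Active"
    PREPARING = "Preparing"
    ABORTING = "Aborting"


# Replay row of the coordinator view table, as quoted in the issue:
# state -> (action, next state).
CURRENT_REPLAY_ROW = {
    State.ACTIVE: ("Send Rollback", State.ABORTING),
    State.PREPARING: ("Send Rollback", State.ABORTING),
}

# The same two cells after the proposed resolution.
PROPOSED_REPLAY_ROW = {
    State.ACTIVE: ("Ignore", State.ACTIVE),
    State.PREPARING: ("Resend Prepare", State.PREPARING),
}


def on_replay(row, state):
    """Return (action, next_state) for an inbound Replay received in `state`."""
    if state not in row:
        # Other columns of the table are deliberately not modelled here.
        raise KeyError(f"Replay cell for {state.value} is not modelled in this sketch")
    return row[state]


if __name__ == "__main__":
    for state in (State.ACTIVE, State.PREPARING):
        old_action, old_next = on_replay(CURRENT_REPLAY_ROW, state)
        new_action, new_next = on_replay(PROPOSED_REPLAY_ROW, state)
        print(f"{state.value}: current = {old_action} --> {old_next.value}; "
              f"proposed = {new_action} --> {new_next.value}")
```

Under the proposed row, a Replay received in Active or Preparing no longer moves the coordinator view to Aborting, which is the point of the issue. The sketch keeps a single state per participant, so it does not address the separate question raised in the thread of whether internal events apply per participant or per transaction.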