Subject: Re: [ws-tx] Issue 007 - WS-C: Make Register/RegisterResponse retriable
Why don't we go easy on the change-phobia and speed-mania? The
proposers
of these issues and everyone else on the TC have worked very hard to
discuss them in an extraordinarily short space of time (one week). The
spec change to introduce participant identifiers is trivial, and does
not mandate anyone to change their current implementation approach.

There seems broad agreement that Fault/AlreadyRegistered has to go (to AT); then there is nothing to stop retries of Register (in fact, they must be assumed to occur).

This issue 007 could be mentally relabelled: "Enable detection of duplicate Participant registrations". But it cannot be sensibly answered without considering issue 014 simultaneously, as the proposed solution for 014 (add participant identifiers) perforce resolves 007.

So, introducing participant identifiers will resolve both Issue 007 and Issue 014. It is a more elegant approach: it avoids the forced implementation choices that are otherwise required to work around the absence of an identifier in the duplicate registration problem. It is the only solution that will make BA MixedOutcome or BA Participant Completion registration workable (at least, the only one thus far proposed).

How hard is this change? The following mods completely cover it:

a) Add to the pseudo-schema infoset description of Register the following text after line 320:

   "<Identifier> ... </Identifier>"

b) Add to the infoset description of Register the following text after line 326:

   "/Register/Identifier
   An IRI, which identifies the registering Participant for the coordination protocol specified by the element /Register/ProtocolIdentifier."

c) Add to the example XML document for Register the following text after line 335:

   "<Identifier> http://www.fabrikam.com/P/564789 </Identifier>"

d) If you want to be really long-winded, you can add the following text after line 315, or possibly after line 346:

   "[new paragraph]A RegistrationService that receives multiple Register messages for the same coordination protocol with identical /Identifier values MAY treat these messages as duplicates, i.e. it MAY send the same RegisterResponse message to the registrant in reply to each of them."
e) The consequential schema change, to be introduced after line 83 (I presume you can say anyIRI instead of anyURI -- XML experts help please):

   <xsd:element name="Identifier">
     <xsd:complexType>
       <xsd:simpleContent>
         <xsd:extension base="xsd:anyIRI">
           <xsd:anyAttribute namespace="##other"/>
         </xsd:extension>
       </xsd:simpleContent>
     </xsd:complexType>
   </xsd:element>

That's it.

I don't think anyone has proposed that a Coordinator must detect duplicate messages, nor should we demand this. What I want to see is the ability to do so. If you choose to ignore the participant identifier field in your implementation then that's your right. Your duplicate-ignorant product will continue to do what it does at the moment. It will interoperate with other duplicate-recognizing products for registration purposes.

It would be very helpful if someone who thinks "no design change" will work here could answer the following four questions:

1. Do you propose that implementations must be forced to re-use old EPRs when replacement or recovery participant registration occurs?

I'm always up for being proved wrong, and if there are easier or cleaner ways of resolving the issues, I'd like to hear them.

Alastair

Ian Robinson wrote:

This topic is not unique to WS-C or indeed to connectionless transports. In the CORBA Transaction and Activity services the requester is responsible for ensuring it registers only once for a single set of protocol messages. While the connection-based CORBA architecture eliminated the concern over retrying the equivalent Register operations, there have always been other "duplicate" registration scenarios in which an application may forget its registration for one reason or another and register a second time (the so-called "amnesia" scenario).
OTS has no normative statement about applications registering multiple times, although it places no burden on the Coordinator to detect this, primarily because, just like with EPRs, there is no simple client-side (coordinator-side) mechanism for comparing IORs. The Activity service went further than OTS and made the explicit statement that multiple registrations result in multiple sets of protocol messages (section 2.2.5, add_action). This latter approach is also assumed by the WS-Tx specifications today.

To reiterate earlier comments ([1], [2]), I recommend we resolve this issue by making an explicit statement that Coordinator implementations are not required to detect duplicate Register messages. [1] proposes some precise text.

[1] http://www.oasis-open.org/apps/org/workgroup/ws-tx/email/archives/200512/msg00163.html
[2] http://www.oasis-open.org/apps/org/workgroup/ws-tx/email/archives/200512/msg00176.html

Regards,
Ian Robinson
STSM, WebSphere Messaging and Transactions Architect
IBM Hursley Lab, UK
ian_robinson@uk.ibm.com

Mark Little <mark.little@jboss.com>
16/12/2005 04:12
To: Peter Furniss <peter.furniss@choreology.com>
cc: "'Max Feingold'" <max.feingold@microsoft.com>, ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 007 - WS-C: Make Register/RegisterResponse retriable

Peter Furniss wrote:

I think we may need to go back to the original, architectural question here, then revisit the possible mechanisms once we've decided what should be.

The original question is really in the issue title - phrasing it as a question: should the Register/RegisterResponse exchange be retriable? That is to say, should it be possible to resend Register on behalf of a particular Participant (that is, a single state entity responsible for some part of the work of a transaction) if the RegisterResponse is not received, with the eventual result being the same [for some sense of "same"] as if the original RegisterResponse had been received.
A key element of that is that we are discussing the registration of a single participant state entity, and not cases where a service deliberately wishes to register multiple participants.

The possibility of a retriable register has scarcely come up in earlier protocols because they mostly assumed a connection-based carrier or transport, and were expected to abort a transaction if the connection failed before the commit decision. (I believe all of LU 6.2 SYNCPT, OSI CCR/TP, OTS, TIP, OLEtx are expected to be "vulnerable" to connection failure - I'm open to correction on some of those).

+1

A feature of being connection-based is that it can be assumed a comms failure on one side will be (sooner or later) reflected as a failure on the other side. Thus if register (or equivalent) is sent to a coordinator and no reply comes back there is no point in sending another one - either the first one got there, and the reply will come back, or the connection broke and that is sufficient to cause the transaction to be aborted anyway.

Well there are always exceptions. For example, TRANSIENT in CORBA sys-exs could be used to retry later. However, I agree with the general point you're making.

But the WS-Tx family aren't connection-based in the same way. WS-AT explicitly states (ws-at, line 454) that different connections are used for messages in each direction. So there isn't a single connection that everything can pass on, and indeed every message could pass on a different connection. (Further, given the generality of SOAP mappings, there is no reason to assume that a connection failure is even detectable by the sender, though in the case of http it would be.) The WS-BA case is even stronger, albeit not in normative text (ws-ba, lines 34-40).

+1

Now you *could* perhaps make a WS-AT implementation that was vulnerable to connection failure, though I'm not sure it gives any simplification in the implementation.
Interoperation requires that you can cope with new incoming connections, and ignore the closing of old ones. You could monitor your own outbound connections and treat an unintended disconnect as a reason to abort (if you are still allowed to, of course) but it would seem to be perverse. (I think the only circumstance in which this is allowed is if the coordinator sees a comms failure after sending Prepare - all other messages after the registration are either recoverable or part of aborting anyway.)

It is against this background that we should consider whether WS-C should make Register/RegisterResponse retriable. To say that it shouldn't would seem to be deliberately importing connection-failure vulnerability to an environment where it is no longer needed.

Of course, that's not to say a WS-AT participant-side would be required to resend Register if it doesn't get a RegisterResponse back (on a different connection) or doesn't see an http 200 on the outbound connection. It is impossible to require self-initiated behaviour of a non-persisted entity (because it can just say "oh, I crashed, didn't you notice?").

------------------------

So, back to mechanisms. Can we make Register retriable at reasonable cost?

I have yet to see any argument against putting a participant identifier on the Register. Although the coordinator is not allowed to test EPRs for equality, the participant must always be able to extract and combine various fields that will make an unambiguous identifier (it can do it other ways too, but the EPR must always contain at least sufficient - the problem that disallows others to test is that it may contain other stuff).

A failure to receive a register response could trigger a completely new register message with a new EPR (on the assumption a retry of the first attempt caused the already-registered fault to be returned).
The only problem I can see at present with this mechanism is that manufacturing a new EPR for the "same" participant may not be feasible in some environments. However, that could be seen as an implementation problem. The advantage would be that no changes to the specification are required - other than a clarification of the text to call out this possibility.

The alternative of trying to make multiple registrations for what is in fact the same participant work would seem to cause considerable complications. For atomic cases, the coordinator may not mind - it just sees two (or more) registrations and they must both be committed (or rolled back). But Max's

"The participant simply needs to behave correctly[1] by distinguishing its multiple enlistments."

is very questionable, because it will receive two Prepares (say), both delivered to the same EPR, but must reply to different coordinator endpoints, one given on the successful RegisterResponse, one on the lost one. As in Alastair's diagrams sent earlier today, it would have to use the Reply-To EPR (in which case, why not use that anyway and get rid of the RegisterResponse altogether) [this is completely impossible for coordination protocols where the first message is participant to coordinator - see Alastair's diagram 3].

I agree all of this is possible and may be sub-optimal in certain degenerate situations. However, when weighed against the timeline imposed for getting WS-C through to standardisation, it may be that the "do nothing" approach I mentioned above is the best option.

Gosh, this has ended up rather long (and will probably now cross with other messages saying the same thing or rendering it out of date).

To be honest I don't have a hard stance on any solutions to this issue at the moment. My only concern is time spent so far and the fact that there are other issues to work through that may be equally, or more, contentious. I hope we can bring this to a conclusion (a vote) soon.
Mark.

Peter

-----Original Message-----
From: Mark Little [mailto:mark.little@jboss.com]
Sent: 15 December 2005 16:50
To: Max Feingold
Cc: ws-tx@lists.oasis-open.org
Subject: Re: [ws-tx] Issue 007 - WS-C: Make Register/RegisterResponse retriable

Colleen Evans wrote:

Forwarding for Max.

Colleen

-----Original Message-----
From: Max Feingold
Sent: Wednesday, December 14, 2005 6:22 PM
To: ws-tx@lists.oasis-open.org
Subject: RE: [ws-tx] Issue 007 - WS-C: Make Register/RegisterResponse retriable

After digesting this week's discussion on this topic, I have a few observations:

- I cannot think of a protocol that requires idempotent registration. WS-AT certainly does not need it. A conformant WS-AT participant might only send a single Register message and report failure to the registrant if a timeout or comms failure occurs. It can also send multiple Register messages. The coordinator will create a new participant enlistment for each Register it receives. The participant simply needs to behave correctly[1] by distinguishing its multiple enlistments.

You're right. OTS, for example, doesn't place a restriction on participant registration either. Speaking purely about transactions, with the exception of the Activity Service, I'm not sure of a protocol that prevents multiple registrations. If we say that the operation isn't idempotent, then we definitely need to remove the fault message though, in case retransmissions are attempted for failure situations.

- One can imagine that different coordination types might have different expectations for what it means to send multiple (duplicate) Register messages. Given that Register is scoped to a specific coordination type at both the participant and the coordinator, it is not clear that the semantics of Register can (or should) be universally constrained to a single pattern.

I'm happy to punt this up to referencing specifications.

- A hypothetical coordination protocol that wants to detect duplicates should not overload existing WS-Addressing mechanisms.
Instead, it should use WS-C extensibility to create specific participant identifiers. Some protocols may need this functionality for correctness. I do not believe that any of the WS-Tx protocols need it.

If we don't try to detect duplicate enlistments at this level, then we also get round the need for EPR comparisons and don't need to add a new participant URI to register either.

Mark.

...

[1] "Correctly" here means not splitting the transaction tree. I can follow up on this if people are interested in the details.

-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@choreology.com]
Sent: Friday, December 09, 2005 9:54 AM
To: ws-tx@lists.oasis-open.org
Subject: [ws-tx] Issue 007 - WS-C: Make Register/RegisterResponse retriable

This is hereby declared to be ws-tx Issue 007.

Please follow-up to this message or ensure the subject line starts Issue 007 - (ignoring Re:, [ws-tx] etc)

The Related Issues list has been updated to show the issue numbers.

Issue name -- WS-C: Make Register/RegisterResponse retriable

Owner: Alastair Green [mailto:alastair.green@choreology.com]

Target document and draft:

Protocol: Coord
Artifact: spec
Draft: Coord spec working draft uploaded 2005-12-02

Link to the document referenced:
http://www.oasis-open.org/committees/download.php/15738/WS-Coordination-2005-11-22.pdf

Section and PDF line number:
WS-Coordination spec, Section 3.2 "Registration Service" l. 294

Issue type: Design

Related issues:
Issue 008 - WS-C: Remove fault 4.6 AlreadyRegistered
Issue 014 - WS-C: EPR equality comparison is problematic
Issue 009 - WS-C/WS-AT: Is request-reply MEP useful?

Issue Description:
Register/RegisterResponse should be retriable exchange

Issue Details:
[This issue stems from Choreology Contribution issue TX-20.]

Section 9 of WS-AT defines the WS-Coordination exchanges

  CreateCoordinationContext/CreateCoordinationContextResponse
  Register/RegisterResponse

as request-reply exchanges.
(Whether this request-reply MEP should be used at all in the WS-TX specs is addressed in a separate issue: see "Issue 009 - WS-C/WS-AT: Is request-reply MEP useful?".)

Substantively, it may be particularly misleading to think of the Register/RegisterResponse exchange as a request-reply pattern.

The implication of using this pattern is that there is a simple one message in, one message out exchange. The presence of a fault (AlreadyRegistered) as a potential response to Register hardens that implication.

Current behaviour would lead to a service being informed it has already registered a Participant, when it has in fact simply succeeded in registering a Participant. Superficially, the AlreadyRegistered fault could simply be viewed as being unnecessarily verbose: the reaction of the service to the fault at run-time must be to treat it as uninteresting, i.e. as equal in effect to a successful registration.

In fact there is a deeper problem. Consider the following scenario:

A Coordination Service (CS) creates a Coordinator (C) for a new atomic transaction (AT), and emits a CoordinationContext (CC).

The CC is transmitted to an application service (AS).

AS (logically) creates a P which sends Register (R) to the Registration Service (RS) EPR for AT, embedding the EPR for receipt of protocol messages outbound from C to P (CP EPR).

The RS, on receiving Register, creates an EPR for inbound protocol messages from P to C (PC EPR), and embeds this in the RegisterResponse (RR), which it sends to P.

AS and P crash before the RR message is received by P, or the RR message drops and is never received by P. Either way, AS (on recovery, or after waiting) causes P to resend R to RS.

RS examines the inbound Register, and determines that it has come from a known P (see "Related Issues", "WS-C: EPR equality comparison should not be relied upon"), i.e. that it is a duplicate registration.

Currently, RS replies with an AlreadyRegistered fault, sent to P.
P now knows that he is registered with C, but has never received the PC EPR (/RegisterResponse/CoordinationProtocolService element).

Any further retries in which P sends R to C will result in the same situation. C will never be able to receive messages from P. P will never become Prepared. The transaction will eventually collapse through timeout.

Therefore, the Register/RegisterResponse exchange must tolerate duplicates. If a Register message is delivered more than once (either by the transport, or through comms-failure- or recovery-induced retry) then the Registration Service should respond on each occasion with a RegisterResponse containing the same PC EPR, to ensure reliable completion of the EPR exchange that permits the subsequent coordination protocol to operate correctly.

NOTE. This change brings the R/RR exchange in line with the behaviour of the CreateCoordinationContext/...Response exchange. There is a difference. R/RR is likely to be implemented as a true idempotent operation. CCC/CCCR is not: each CCCR embeds a new RS EPR, and a new /Context/Identifier. But each exchange can be harmlessly replayed indefinitely, in the event of failure to receive the response message.

Proposed Resolution:

Insert the following text in WS-Coordination spec, Section 3.2 "Registration Service" immediately following current l. 294

"[New paragraph]The requester MAY send a Register message for a given Participant more than once, and the underlying transport could deliver the Register message more than once. On receipt of a Register message for a given Participant, which has already been processed successfully, the Registration Service MUST send to the requester a RegisterResponse containing the same CoordinationProtocolService element (Endpoint Reference for Participant to Coordinator protocol messages) as that contained in all previous RegisterResponses generated by the Registration Service which relate to the Participant's request to register for this activity. [New paragraph]"
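
For illustration only, the retriable behaviour described in the proposed resolution could be sketched roughly as follows. This is a minimal sketch, not text from any WS-C draft: it assumes that a "given Participant" is recognised by a participant identifier carried on Register (as proposed elsewhere in this thread), and the class name, method name, address format and in-memory table are all hypothetical. Replacing the stored participant EPR on a retry is likewise an assumption, made to show that the participant need not re-use its old EPR.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterResponse:
    # Stands in for /RegisterResponse/CoordinationProtocolService (the PC EPR).
    coordination_protocol_service: str


class RegistrationService:
    """Hypothetical in-memory Registration Service for a single activity."""

    def __init__(self, base_address):
        self._base_address = base_address
        self._counter = itertools.count(1)
        # (protocol identifier, participant identifier) -> enlistment record
        self._enlistments = {}

    def register(self, protocol_identifier, participant_identifier, participant_epr):
        key = (protocol_identifier, participant_identifier)
        enlistment = self._enlistments.get(key)
        if enlistment is None:
            # First Register for this participant: mint a new PC EPR.
            pc_epr = f"{self._base_address}/enlistment/{next(self._counter)}"
            enlistment = {"pc_epr": pc_epr, "participant_epr": participant_epr}
            self._enlistments[key] = enlistment
        else:
            # Duplicate (transport redelivery, retry, or recovery): keep the
            # PC EPR. Assumption: the latest participant EPR replaces the old one.
            enlistment["participant_epr"] = participant_epr
        # The same PC EPR is returned on every occasion; no AlreadyRegistered fault.
        return RegisterResponse(enlistment["pc_epr"])
```

With this shape, a participant that never saw the first RegisterResponse can simply resend Register and obtain the PC EPR it needs, while a Register carrying a different identifier still produces a separate enlistment.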