ws-tx message

Subject: Editorial problem with 007 resolution (WAS: Re: [ws-tx] Re: Editorializing on MEPs etc)

From: Alastair Green <alastair.green@choreology.com>
To: Andrew Wilkinson3 <awilkinson@uk.ibm.com>
Date: Mon, 27 Mar 2006 16:51:30 +0100

I am returning to this discussion as a result of considering the [reply endpoint] issue, which has led me to question the wording of the resolution of 007 (participant ids).

In Andrew's example, Register messages for different participants specify a single, identical [reply endpoint]. I think of this endpoint as the "participant registration gateway" (PRG) for a service which may register numerous participants.

If a service repeats (duplicates) a Register message, e.g. in the event of a lost inbound RegisterResponse message, then it will send two messages, each with its own distinct WS-A message id. The Coordinator, in conformance with the resolution to 007, is expected to treat each Register as an independent, fresh message, and to create a coordinator view state machine for each one. It will send two RegisterResponses (the first, lost one, and then a second one), both returning to the same "place" (the PRG). The PRG needs to figure out that the second response is in fact a duplicate of the first one, so that it can create a logical read-only pseudo-participant to handle (effectively discard) subsequent protocol messages (or do rerouting, which amounts to the same thing).

How will the PRG determine that the second RegisterResponse has a relationship to the first one?

Given that we are using message ids, it is simple for the PRG to keep track of the message ids that it generated for a given participant. Internally correlating by recording the relationship of the two (or more) generated message ids is one strategy a registering service can easily adopt.
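
A minimal sketch of this message-id strategy, assuming a hypothetical RegistrationGateway class standing in for the PRG and a hypothetical participant key; none of these names come from WS-Coordination or WS-Addressing. It also assumes the RegisterResponse carries the message id of the Register it answers as its relates-to value.

```python
import uuid


class RegistrationGateway:
    """Hypothetical PRG: correlates RegisterResponses by WS-A message id."""

    def __init__(self):
        self._participant_by_message_id = {}  # message id -> participant key
        self._registered = set()              # participants already answered

    def send_register(self, participant_key):
        """Record a fresh message id for each Register, including retries."""
        message_id = f"urn:uuid:{uuid.uuid4()}"
        self._participant_by_message_id[message_id] = participant_key
        return message_id

    def on_register_response(self, relates_to):
        """Classify a RegisterResponse as 'first', 'duplicate' or 'unknown'."""
        participant_key = self._participant_by_message_id.get(relates_to)
        if participant_key is None:
            return "unknown"
        if participant_key in self._registered:
            return "duplicate"
        self._registered.add(participant_key)
        return "first"


gateway = RegistrationGateway()
first_id = gateway.send_register("participant-1")
retry_id = gateway.send_register("participant-1")  # retry after a lost response
print(gateway.on_register_response(first_id))   # first
print(gateway.on_register_response(retry_id))   # duplicate
```

Because both message ids map to the same internal participant key, the second response can be recognized as a duplicate even though the coordinator treated the two Registers as independent.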

However, the relevant 007 resolution text in WS-Coordination reads:

If a participant sends multiple Register requests for the same activity, the participant MUST be prepared to correctly handle duplicate protocol messages from the coordinator. One simple strategy for accomplishing this is for the participant to generate a unique reference parameter for each participant Endpoint Reference that it provides in a Register request [my emphasis -- AG]. The manner in which the participant handles duplicate protocol messages depends on the specific coordination type and coordination protocol.

I see two problems with this. First, the message id approach is an alternative, but is not mentioned. Not a big deal -- the spec does say that unique ref params is "one simple strategy". Second, and a little more substantively: we need to be precise about which participant EPR we're talking about.

It therefore appears that the resolution to 007, to be rendered correctly, requires that the sentence I have emphasized (the one beginning "One simple strategy") should read:

One simple strategy for accomplishing this is for the sender of Register to ensure that the WS-Addressing [reply endpoint] Endpoint Reference that it provides in a Register request identifies the participant unambiguously from the sender's standpoint.

Otherwise one might think that the EPR referred to is the /Register/ParticipantProtocolService one -- and that won't do the trick.
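
One way to picture a [reply endpoint] that identifies the participant unambiguously is sketched below. This is only an illustration: EndpointReference is a simplified stand-in for a WS-Addressing EPR, and ReplyEndpointFactory, the "participantToken" reference parameter and the example address are all hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointReference:
    """Simplified stand-in for a WS-Addressing EPR (address plus ref params)."""
    address: str
    reference_parameters: tuple = ()


class ReplyEndpointFactory:
    """Hypothetical helper: one stable reply EPR per participant."""

    def __init__(self, gateway_address):
        self._gateway_address = gateway_address
        self._token_by_participant = {}

    def reply_endpoint_for(self, participant_key):
        """Return the same reply EPR for every Register sent for a participant."""
        token = self._token_by_participant.setdefault(
            participant_key, str(uuid.uuid4())
        )
        return EndpointReference(
            self._gateway_address, (("participantToken", token),)
        )


factory = ReplyEndpointFactory("http://example.org/prg")
original = factory.reply_endpoint_for("participant-1")
retry = factory.reply_endpoint_for("participant-1")
other = factory.reply_endpoint_for("participant-2")
print(original == retry)   # True
print(original == other)   # False
```

A retried Register reuses the same reply EPR, so both RegisterResponses arrive tagged for the same participant regardless of their message ids.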

And yes, I will raise an issue.

Alastair

Andrew Wilkinson3 wrote:

Alastair,

In my example the EPR would not be participant-specific, instead the 
implementation had chosen to rely upon the WSA message id for correlation. 
I don't agree that this is analogous to a universally unique participant 
id as this usage of the message id to identify the participant is entirely 
internal to the participant implementation. That said, I do take your 
general point.

I would find it hard to argue with your view that the message id and the 
RR MEP are not doing a great deal for us, however there is a lot to be 
said for not unnecessarily perturbing the specs - WS-C's use of the RR MEP 
for register / registerResponse works fine as it is.

Andy




From: Alastair Green <alastair.green@choreology.com>
Date: 02/03/2006 16:49
To: Andrew Wilkinson3/UK/IBM@IBMGB
cc: ws-tx@lists.oasis-open.org
Subject: [ws-tx] Re: Editorializing on MEPs etc






Andrew,

[I'm going to put this exchange out onto the main list, because I think we 
may need some guidance.]

Aha, a penny has dropped. 

Are you thinking of a situation where a set of RegisterResponses are 
coming back to the same ReplyTo EPR (acting as a reply gateway for several 
participants)? I have been assuming that we ask for a reply to e.g. the 
ParticipantProtocolService EPR (or some EPR that maps it one-to-one), i.e. 
that the EPR is participant-specific.

In other words, in your view, the message id is a kind of explicit 
universally unique participant id :-)

If the reply EPR is per-participant then correlation occurs by virtue of 
delivering RR to the per-participant EPR, and message id is irrelevant.

If the reply EPR is not, then we can't ignore (or dispense with) the 
message id. 

If we expect the reply EPR to incorporate enough information to lead us to 
the participant behind the scenes, then I don't see what the message id is 
doing for us. Explicit participant ids are unpopular in this committee, 
but in this case you'd like to keep one as an alternative means of 
correlation to the use of EPRs?

If that is the case, then you are right: we cannot make the MEP a free 
choice.

Either way, we may need some kind of additional spec statement. 

The rule in my mind has been: that the reply-to EPR supplied in the header 
of Register will allow the reply (RegisterResponse) to be unambiguously 
identified with/correlated with the Participant, as defined or identified 
by the value of ParticipantProtocolService EPR.

The rule in the existing spec's mind, as it were, is that the combination 
of reply-to EPR and Register message id is sufficient to allow the 
recipient of RR to correlate it with the value of the 
ParticipantProtocolService EPR.

By the way, this discussion is forcing me to read WS-Addressing with ever 
greater care and attention. Two points:

1) I think that we have to obey a reply-to EPR if we are given one. This 
directly relates to the definition of the terminal and non-terminal 
messages, which currently (WS-AT ll. 445-446) says that the use of the 
reply-to EPR is optional.

2) I also think that the WS-A spec is ambiguous (at least) on the 
following point: it could be read to say -- if you are sending a reply you 
must state the relationship of this response message to the stimulant 
message, i.e. that you have to use relates-to. I'm not sure that's what it 
really means to say, but it does imply it quite strongly. This stands at 
odds with the current WS-AT Section 9 rules for non-terminal messages.

Yours,

Alastair


Andrew Wilkinson3 wrote: 
Alastair,

I don't believe we can make use of the RR MEP optional for the 
register/registerResponse exchange as it would bring with it unwanted 
interop complications.

Observing that the only real difference between the RR and one-way MEPs is
the inclusion of a relatesTo header in an RR response message, imagine two
separate implementations, one which uses the RR MEP for register /
register response, the other which uses the one-way MEP. The
implementation using the RR MEP sends a register request and stores the
WSA uid of the message, which it will subsequently use to correlate the
reply. The implementation using the one-way MEP receives this message and
replies; the relatesTo header is not included in the message. The register
response message is received but, without a relatesTo entry in the header,
the implementation is unable to correlate it with the register message -
at this point we're broken. For this reason I believe that we need to make
a definite statement about the use of the RR MEP and, in the interests of
not unnecessarily perturbing the specs, that statement should be that the
register / registerResponse exchange MUST be conducted using the RR MEP.
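
The failure described here can be sketched in a few lines. The pending table, the correlate function and the message id value are hypothetical illustrations, not taken from any implementation.

```python
from typing import Optional

# Hypothetical table kept by the RR-MEP sender: message id -> participant.
pending = {"urn:uuid:register-1": "participant-1"}


def correlate(relates_to: Optional[str]) -> Optional[str]:
    """Look up the participant a RegisterResponse belongs to, if any."""
    if relates_to is None:
        # A one-way MEP responder sent no relatesTo header: nothing to match.
        return None
    return pending.get(relates_to)


print(correlate("urn:uuid:register-1"))  # participant-1
print(correlate(None))                   # None
```

A sender that relies solely on this lookup has no way to place a response that arrives without the header.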

Andy




From: Alastair Green <alastair.green@choreology.com>
Date: 01/03/2006 20:42
To: Andrew Wilkinson3/UK/IBM@IBMGB
cc: jharby@gmail.com, Mark Little <mark.little@jboss.com>, Max Feingold <Max.Feingold@microsoft.com>, Thomas Freund <tjfreund@us.ibm.com>
Subject: Re: Editorializing on MEPs etc






Andrew,

On the procedural point, as already expressed, I agree.

Your proposal on 2. was my initial inclination.

I would be more open to it, if we were to make use of RR MEP optional 
for WS-C (i.e. that implementers can choose either one-way or RR MEP for 
the WS-C exchanges). That would be a good move, in my view, as RR MEP is 
a "Habsburg's tail" (a vestigial organ with no current function: the 
extra vertebra that the Habsburg royal family allegedly often 
possessed), and would fully justify the positioning of the 
terminal/non-terminal definition in the base, WS-C, spec.

Alastair

Andrew Wilkinson3 wrote:
 
I think that we should be careful not to exceed our remit when producing
text to address issue 9. If the resolution to issue 007 has exposed a
problem with the WS-AT state tables then I believe it would be
procedurally correct to raise a separate issue to address it rather than
trying to resolve multiple issues under issue 009.

I would like to suggest an alternative to Alastair's 2. below and that is
that we produce a set of definitions in WS-C that defines use of the RR
MEP and the one-way MEP including definitions of terminal and non-terminal
messages. While WS-C doesn't use the one-way MEP I believe there's some
value in attempting to produce a list of commonly used MEPs within WS-C
which can be referenced by other specs. Whether or not we attempt to make
this list exhaustive is a point for discussion. Again, this is possibly
something that should be done under a separate issue as it's arguably not
directly related to the RR MEP - issue 011 may well be more appropriate.

Andy




From: Alastair Green <alastair.green@choreology.com>
Date: 01/03/2006 16:51
To: Mark Little <mark.little@jboss.com>
cc: Thomas Freund <tjfreund@us.ibm.com>, Andrew Wilkinson3/UK/IBM@IBMGB, jharby@gmail.com, Max Feingold <Max.Feingold@microsoft.com>
Subject: Re: Editorializing on MEPs etc






Hi Mark,

Sorry, I don't think we can ignore the duplicate RegisterResponse
issue or hope it will be dealt with at the infrastructure level without a
bit of extra specification, in WS-AT and WS-BA.

To recap: duplicate Registers are deemed to be OK by resolution of 007: 
the Coordinator generates a new EPR for the deemed "new participant".

Duplicate registers can arise either by impatient retry, or by transport
redelivery. The ensuing RegisterResponses will both be delivered to the
same EPR, so the receiving end can work out that it's received one twice
(ignore the second one).

The rule is: if an RR message is received twice targeted on the same EPR
then it has to be thrown away. This is the same kind of rule that is
expressed in the WS-AT state tables for e.g. duplicate Prepares. Not quite
the same -- the action is not to resend a response, but the fact that this
may happen has to be captured somewhere.

As Max points out, the current PV state table assumes that 
RegisterResponse will arrive once. It doesn't cope with duplicate 
RegisterResponses.

This is only OK if the "throw away" (no-op) rule is stated elsewhere. 

Here are two implementation strategies that might be adopted:

A. Set a participant state machine to a state of "initial" or
"registering", and send Register to C. Keep a vector of all message ids
for all Registers sent for the current P EPR, with a vector status of
"live". If a RegisterResponse arrives whose reply-to value is equal to one
of the stored message ids, and the vector is "live", then set the
participant state machine to "active" and mark the vector as "dead". If the
RR arrives when the vector is "dead" then ignore the inbound message
(no-op). [This is very artificial: I am trying to imagine why and how you
would actually use the values of message id and reply to.]

B. Set a participant state machine to a state of "initial" or
"registering" and send Register to C. If a RegisterResponse arrives at the
current P EPR, and the state machine is in state "registering", set the
state machine state to "active", and proceed. If an RR arrives when the
state machine is "active" then ignore the inbound message (no-op).
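
The two strategies can be sketched side by side as follows. The class names, method names and return strings are hypothetical. The sketch assumes that the correlation value on a RegisterResponse is the message id of the Register it answers (carried in WS-Addressing as the relates-to header), and the "unmatched" outcome for an unrecognized id is an addition not described in strategy A.

```python
class StrategyA:
    """Message-id vector: 'live' until the first RegisterResponse is matched."""

    def __init__(self):
        self.state = "registering"
        self.message_ids = set()   # ids of every Register sent for this P EPR
        self.vector_live = True

    def send_register(self, message_id):
        self.message_ids.add(message_id)

    def on_register_response(self, correlation_id):
        if correlation_id not in self.message_ids:
            return "unmatched"     # assumption: not covered by the text
        if not self.vector_live:
            return "ignored"       # duplicate RR: no-op
        self.state = "active"
        self.vector_live = False   # mark the vector "dead"
        return "activated"


class StrategyB:
    """Per-participant EPR: the state machine itself absorbs duplicates."""

    def __init__(self):
        self.state = "registering"

    def on_register_response(self):
        if self.state == "registering":
            self.state = "active"
            return "activated"
        return "ignored"           # already active: no-op


a = StrategyA()
a.send_register("m1")
a.send_register("m2")              # retry with a new message id
print(a.on_register_response("m1"), a.on_register_response("m2"))

b = StrategyB()
print(b.on_register_response(), b.on_register_response())
```

Both runs print "activated ignored": the observable behaviour is the same, and only strategy A needs the extra bookkeeping.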

Logically, these are the same state machine. In the first case we have
created an ancillary mini-machine that uses the Request-Reply MEP
features. In the second case the implementation state machine is a direct
reflection of the logical state machine (that does not use the RR MEP
features).

In my view the specification should describe the logical state machine,
and should leave the implementation strategy to the implementer
(especially as implementation strategy A is so unnatural).

Note that this problem is created by the fact that we are potentially
processing a sequence of messages, each with its own message id. There is
no concept in WS-A of such a sequence. Therefore, we need to say -- here,
in these specs -- that such a sequence can exist, and how to deal with it.
Otherwise it becomes one of those cases where "we all know what we meant
to say", which is not a good practice. Right now, if you look at the row
RegisterResponse, column Active, in the 2PC PV of WS-AT you will read the
following: Invalid State/Active. And according to the text immediately
above, Invalid State means: "send an Invalid State fault" -- which is not
what we want.

Either we change the state table, or we write text enforcing an approach
similar to strategy A. On grounds of consistency, minimalism, and freedom
of implementation choice I would prefer to change the WS-AT state table.
It's an unfortunate fact that the RR MEP is not doing anything fundamental
here except forcing implementers into a particular (unspecified)
behaviour. As I am tired of fighting City Hall, I don't mind acceding to
the (pointless, harmless) presence of RR MEP, but it isn't a finished job,
unless we address this possibility in one of the two ways I have raised.
There is nothing in the current spec to stop a faithful implementation
receiving a duplicate RR and directing it at a state machine that will
fault it.

Furthermore, and taking off from another of your comments: we could
introduce a statement into WS-C (there is none there now) which stated
that duplicate RegisterResponses are discarded. This would be contrary to
the resolution of 007 which contains the statement:

The manner in which the participant handles duplicate protocol messages
depends on the specific coordination type and coordination protocol.

Even if we introduced a textual statement on discards in the AT and BA
specs, we are not finished with the problem. The whole RegisterResponse
row of the AT state table has to cope with the arrival of a duplicate
RegisterResponse (late, out of order). At present that row incorrectly
faults a late duplicate, when in fact the duplicate RR should always be
thrown away. This strongly indicates that the AT state table is the place
to define all duplicate RR behaviours.

I assume that the same will apply to BA.

Alastair





Mark Little wrote: 
Hi Alastair. Apologies for the delay in replying, but I was traveling last
week. Comments inline ...


Hi, 

This mail is being sent to everyone who is an editor of WS-C, WS-AT or 
WS-BA. 

I've picked up the AI from the TX TC meeting of 23 Feb to propose (in 
conjunction with you the editors) a concrete proposal for 009, based on 
the premise that the RR MEP is going to stay. 

To kick this off, before putting work into writing a concrete change
proposal, I want to suggest how to address this in principle. Please let
me know if you agree, disagree, have amendments to this approach.

1. The RR MEP is primarily used by WS-Coordination: the parts of WS-AT 
Section 9 that deal with that MEP should be moved to WS-C. There are 
normative references to the effect of Register/RegisterResponse in the 
WS-AT state tables, but these references have nothing to do with the 
character of these messages as RR MEP messages. 

Agreed. Are you also proposing updating the state table in WS-C to cover
the WS-AT case you mentioned above?


2. The one-way MEP is not used by WS-C, but is used by WS-AT and WS-BA. 
I would suggest that we produce text which covers the use of this MEP, 
and reproduce it in both specs separately. Alternative would be to put 
the text in WS-AT and then cross-reference in WS-BA. In this context I 
think it would be easier to cut-and-paste than do an x-ref (section 
numbers will change etc). 

I'd go with copy-and-paste too. 


3. I would like you to revisit, if you get a chance, the original 009 
issue to see what I proposed in terms of wording re terminal and 
non-terminal. There was a bit of discussion on e-mail between Tom and 
me, back in November or December, about refining that. I'd like to 
create consensus between us on what the shape of that rewording should 
be, if rewording is needed. 

Please note that I believe that the outcome of the resolution of 007 is
that the RR MEP message-id based correlation need never be used, and is
therefore fundamentally pointless (if harmless). The rules for duplicate
processing are partly expressed in the existing WS-AT state tables, but
need amplification (i.e. 007 is not fully resolved, in respect of
duplicate Register/RegisterResponse processing). I would welcome
contrary comments if anyone believes we need to say anything about the
use of the message-id/reply-to features beyond what is currently stated
in WS-AT S. 9, i.e. that they have to be present (even if they need
never be used). You could say something like: you can use the reply-to
to detect duplicate RRs and therefore eliminate the message before it
hits the P state machine, but in specification terms that is just a
restatement of the 2PC PV state table (if it were correctly specified).

I would have thought that we can ignore duplicate detection and
elimination at the level of WS-A: surely that will be taken care of by the
"underlying" infrastructure (not trying to impose any specific
implementation here, but hopefully you get my point)?

Mark. 


Yours, 

Alastair
