ws-rx message

Subject: Re: [ws-rx] i119: EPR comparisons overly restrictive

From: Alastair Green <alastair.green@choreology.com>
To: Jonathan Marsh <jmarsh@microsoft.com>
Date: Wed, 24 May 2006 16:36:03 +0100

Couple of points:

1) What about the second case of EPR comparison (the MakeConnection stuff)? Don't know the status of that, but if you want the full control over new sequences etc outlined in the IBM proposal, then you have an identification issue, and in an area which is (very) sensitive to false negatives. But, the function of the id I am suggesting is actually the same in both cases: identify who you are talking to. Might generate less engineering.

2) Not sure about your comments on char-for-char. In the URI case that the RFCs discuss, there is a concept of "string". But the ref params are completely opaque. Can you do anything safe other than literally compare bit for bit?

Alastair

Jonathan Marsh wrote:

That sounds workable to me.  My preference, having been involved in
enough over-engineered standards work that was subsequently punished by
the industry, remains for the minimum necessary to declare victory.  A
new mechanism that optionally facilitates the use of an optional
optimization still seems like overkill to me.

And FWIW, I consider bit-for-bit (more precisely, char-for-char) as a
respectable choice for a comparison algorithm.  It will generate a fair
amount of false negatives but is dead simple to implement.
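
A minimal sketch of what a char-for-char comparison might look like. The `Epr` class and its fields are hypothetical simplifications (an address string plus the raw serialized text of each reference parameter), not anything defined by WS-Addressing or WS-RX.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Epr:
    """Hypothetical, simplified EPR: an address plus serialized ref params."""
    address: str
    ref_params: Tuple[str, ...] = ()  # raw XML text of each reference parameter


def same_epr_char_for_char(a: Epr, b: Epr) -> bool:
    """True only when the address and every ref param match exactly."""
    return a.address == b.address and a.ref_params == b.ref_params


a = Epr("http://example.com/rm", ('<x:id xmlns:x="urn:x">42</x:id>',))
b = Epr("http://example.com/rm", ('<x:id xmlns:x="urn:x">42</x:id>',))
c = Epr("http://EXAMPLE.com/rm", ('<x:id xmlns:x="urn:x">42</x:id>',))

print(same_epr_char_for_char(a, b))  # identical text, treated as equivalent
print(same_epr_char_for_char(a, c))  # only host case differs: a false negative
```

The second comparison is the kind of false negative mentioned above: the two addresses very likely name the same endpoint, but the exact-match test cannot tell.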

-----Original Message-----
From: Alastair Green [mailto:alastair.green@choreology.com]
Sent: Tuesday, May 23, 2006 10:00 PM
To: Jonathan Marsh
Cc: Anish Karmarkar; ws-rx@lists.oasis-open.org
Subject: Re: [ws-rx] i119: EPR comparisons overly restrictive

Jonathan,

Looking over the wall from my usual habitat in TX, I've been following
all of this for a while, because of related issues dealt with in TX. A
quick thought follows. Apologies in advance if it duplicates or
misunderstands.

Perhaps you could walk right round this by defining an RX WS-A
extension element, which is of a clearly comparable type, and globally
unambiguous, e.g. absolute URI. "Interlocutor identity", "endpoint
equivalence marker" or some such.

If EPR A and EPR B (or EPR A') have such an element, both with value
of "foo" then you can send ack to EPR A on a message to EPR B. You could
re-use this same id for the client identification issue for
MakeConnection too: reply EPR = anon/polling + interlocutor id.
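
The idea can be sketched roughly as follows. Everything here is an assumption made for illustration: the `interlocutor_id` field stands in for the suggested extension element, and the fallback to an exact address match is just one conservative choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Epr:
    """Hypothetical EPR carrying an optional equivalence marker."""
    address: str
    interlocutor_id: Optional[str] = None  # assumed extension element (absolute URI)


def may_piggyback_ack(ack_to: Epr, message_to: Epr) -> bool:
    """Decide whether an ack for ack_to may ride on a message to message_to."""
    if ack_to.interlocutor_id is not None and message_to.interlocutor_id is not None:
        # Both EPRs carry the marker: equal markers mean "same party".
        return ack_to.interlocutor_id == message_to.interlocutor_id
    # No marker on one side: fall back to a conservative exact address match.
    return ack_to.address == message_to.address


epr_a = Epr("http://example.com/a", "urn:example:party:foo")
epr_b = Epr("http://example.com/b", "urn:example:party:foo")
epr_c = Epr("http://example.com/a")

print(may_piggyback_ack(epr_a, epr_b))  # same marker, different addresses
print(may_piggyback_ack(epr_a, epr_c))  # marker missing on one side, same address
```

With the marker present on both sides the comparison reduces to a single string test on a value chosen to be comparable, which sidesteps comparing opaque reference parameters.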

I think you're right, that in this particular context where false
negatives are just deoptimizations of the attempted optimization,
"guessing the EPR equivalence" will work, and will, by evolution,
chuck out the bad guessers (in reality, push most implementations to
the conservative case of EPR comparison, e.g. bit-for-bit for [address]
and [ref params], or indeed to abandon the optimization).

In the client id case for GetMessage/MakeConnection I think you have
the problem that false negatives are dangerous. If you misidentify a
request to respond as coming from e.g. an unknown client then the two
sides will freeze up.

Overall, there seems to be a latent concept of "party to a
conversation", which might usefully be surfaced.

Alastair

Jonathan Marsh wrote:

	I think you and I are pretty close on this.  We seem to agree that
	it is desirable to allow a range of options from no optimization, to
	a simple bit-for-bit comparison (though there may even be flavors of
	this), through a canonicalized approach, through to what I'll call
	semantic equivalence.  An implementation should be free to choose
	the level it wishes to invest in, with the understanding that the
	more you invest, the more optimization dividends you receive.

	We don't seem to agree on the value of providing a non-normative
	snapshot of a comparison algorithm in this spec.  IMO, such an
	algorithm is likely to miss the sweet spot, provide an illusion of
	interoperability when in fact it's just documentation of an
	implementation detail, consume a lot of TC time to work out details,
	consume a lot of reader time, and increase the burden of ongoing
	maintenance of the spec.

	I don't have a problem addressing the specific issue raised during
	interop that reference parameters should be considered when
	determining EPR equivalence.  I think the simplest way to address
	this issue is by adding a simple warning to implementers along the
	lines of: "Note that reference parameters should be considered when
	determining EPR equivalence."  In this case the "very least" is
	perfectly sufficient.



		-----Original Message-----
		From: Anish Karmarkar [mailto:Anish.Karmarkar@oracle.com]
		Sent: Monday, May 22, 2006 7:25 PM
		To: Jonathan Marsh
		Cc: ws-rx@lists.oasis-open.org
		Subject: Re: [ws-rx] i119: EPR comparisons overly restrictive

		These are all very good points.
		But from them I don't necessarily conclude that we should not
		define an EPR comparison algorithm.

		1) WRT false negatives: I don't think false negatives are
		themselves a problem -- they result in one not being able to
		utilize the optimization. I would hope that if we define a
		comparison algorithm, it would be stated in a way that would
		*not* prevent anyone from defining additional mechanisms to
		figure out equivalence.
		For example, if 2 EPRs are bit for bit the same then they are
		the same. If they are not (in absence of an equivalence
		algorithm), then one cannot make any statement about the
		equivalence (they may be equivalent or not, one just doesn't
		know). There may be additional metadata that tells me that
		they are (or not). For example, I may know that
		http://www.w3.org and http://www.w3c.org are the same. One
		should be able to add criteria beyond the ones that we define
		in WSRX (if we do at all) to figure out equivalence.

		2) WRT 3a and 3b: if we define a comparison algorithm, an
		implementation is not (or should not be) forced to use it. It
		may choose to ignore it (and only use, say, bit-for-bit
		comparison) at the cost of extra messages and a saving on
		processing. Or, if it is willing to put additional resources
		into preventing extra messages, it may use additional
		criteria on top of what we define.
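
The two points above can be sketched as a three-valued comparison with optional extra criteria layered on top. This is an illustration under assumptions, not a proposed algorithm: the `Epr` tuple shape, the `Equivalence` enum, and the `w3_alias` criterion are all hypothetical names.

```python
from enum import Enum
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical EPR shape: (address, serialized reference parameters).
Epr = Tuple[str, Tuple[str, ...]]


class Equivalence(Enum):
    SAME = "same"
    DIFFERENT = "different"
    UNKNOWN = "unknown"


# An extra criterion returns a verdict when it can decide, else None.
Criterion = Callable[[Epr, Epr], Optional[Equivalence]]


def compare(a: Epr, b: Epr, extra: Iterable[Criterion] = ()) -> Equivalence:
    """Baseline exact match first, then any implementation-specific criteria."""
    if a == b:
        return Equivalence.SAME
    for criterion in extra:
        verdict = criterion(a, b)
        if verdict is not None:
            return verdict
    # A mismatch alone proves nothing: the EPRs may or may not be equivalent.
    return Equivalence.UNKNOWN


def w3_alias(a: Epr, b: Epr) -> Optional[Equivalence]:
    """Local knowledge that two addresses name the same endpoint."""
    aliases = {"http://www.w3.org", "http://www.w3c.org"}
    if a[0] in aliases and b[0] in aliases and a[1] == b[1]:
        return Equivalence.SAME
    return None


x: Epr = ("http://www.w3.org", ())
y: Epr = ("http://www.w3c.org", ())

print(compare(x, y))              # no extra knowledge available
print(compare(x, y, [w3_alias]))  # extra criterion supplies the answer
```

An implementation would piggyback only on `SAME`; `UNKNOWN` simply costs the optimization, which matches the observation that false negatives are not themselves a problem here.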

		One piece of feedback that I got from the interop was that
		some implementations ignore refps when figuring out
		equivalence -- this is quite bad for interop. If we don't
		define an equivalence algorithm, at the very least we should
		include a warning about EPR comparisons and pitfalls.

		-Anish
		--

		Jonathan Marsh wrote:


			I don't have any religious objection to defining EPR
			comparison mechanisms in specific circumstances (as
			WS-Addressing allows), but in this case I fail to see
			why defining such a mechanism is a good thing.  In fact
			it seems harmful.



			1) Piggybacking is an optimization, although an
			important one.  Failure to correctly determine that two
			EPRs are equivalent for the purposes of piggybacking
			eliminates the possibility to benefit from the
			optimization.  The nature of the optimization is in
			reducing the number of messages sent on the wire.



			2) EPR comparison is tricky:

			  a) URI comparison is notoriously difficult, unless the
			URI is simply an identifier (e.g. namespace URI) and not
			primarily intended to be dereferenced, in which case
			simple string comparison suffices.  I don't believe this
			will suffice for [address] property values.  The current
			reference to 2396 is confusing and inadequate to
			describe how URIs are to be compared.  See
			http://www.ietf.org/rfc/rfc3987.txt section 5 for a more
			thoughtful (and applicable) explanation of why there
			isn't a single right mechanism.

			  b) Comparing canonicalized representations of
			reference parameters is also difficult.  There might be
			a broad set of allowable manipulations to reference
			parameters that result in false positives, e.g.
			annotating a ref param with an extension attribute (even
			WS-A does that).



			3) Despite the choices made above, both components of
			the proposed EPR comparison can result in false
			negatives.  Trading off complexity of the comparison
			algorithm versus the amount of optimization is a choice
			that should be left to implementations.

			  a) Canonicalization and full IRI normalization are
			fairly expensive.  Some implementations may prefer a
			simpler mechanism, with correspondingly higher false
			negative rates.

			  b) Network messaging is fairly expensive.  An
			implementation wishing to expend extra computational
			resources to minimize the network traffic should be free
			to implement as comprehensive a comparison mechanism as
			they like.
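
The trade-off in points 2a and 3 can be illustrated with two rungs of a comparison ladder for the [address] value: plain string equality, and a slightly more expensive syntax-level normalization. This is a sketch under assumptions, not the normalization RFC 3987 prescribes in full; `normalize` and `DEFAULT_PORTS` are hypothetical names, and userinfo and IPv6 literals are deliberately not handled.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports for the schemes this sketch knows about.
DEFAULT_PORTS = {"http": 80, "https": 443}


def simple_equal(a: str, b: str) -> bool:
    """Cheapest rung: character-for-character comparison."""
    return a == b


def normalize(uri: str) -> str:
    """Lowercase scheme and host, drop a default port, use "/" for an empty path."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, parts.query, parts.fragment))


def normalized_equal(a: str, b: str) -> bool:
    """More expensive rung: compare after syntax-level normalization."""
    return normalize(a) == normalize(b)


u1 = "http://example.com/rm"
u2 = "HTTP://Example.COM:80/rm"

print(simple_equal(u1, u2))      # cheap test misses the match
print(normalized_equal(u1, u2))  # normalization recovers it
```

Each further rung (scheme-specific or protocol-based checks) would remove more false negatives at more cost, which is the choice the text argues should stay with implementations.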



			A standardized EPR comparison mechanism writes into the
			standard optimization choices that should be left to
			implementers, and would force some implementations to
			become needlessly (from their perspective) complex, and
			others to dumb-down and miss opportunities for better
			optimizations.  Some seem to fear incorrectly
			piggybacking messages not destined to the message
			recipient, which is again an implementation issue (a
			bug) which the market will easily sort out.

    

References:
- RE: [ws-rx] i119: EPR comparisons overly restrictive
  - From: "Jonathan Marsh" <jmarsh@microsoft.com>