ws-rx message

Subject: RE: [ws-rx] i119: EPR comparisons overly restrictive
From: "Marc Goodner" <mgoodner@microsoft.com>
To: "Ashok Malhotra" <ashok.malhotra@oracle.com>,"Jonathan Marsh" <jmarsh@microsoft.com>,"Alastair Green" <alastair.green@choreology.com>
Date: Wed, 24 May 2006 14:06:04 -0700
So as long as we are talking about how to specify a char by char
comparison I have a few other questions about what we would need to
account for in an EPR comparison.

a) Do multiple reference parameters inside wsa:ReferenceParameters have to
be in the same order?  WS-Addressing seems to permit them to be re-ordered.
That seems to imply any comparison algorithm would have to take those
permutations into consideration. Would we need to specify that?
b) Does the comparison depend on any wsa:Metadata elements?
c) Does the comparison depend on any extension elements or attributes
not defined directly by WS-Addressing Rec?
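
To make question (a) concrete, here is a minimal Python sketch of a comparison that treats the reference parameters as an unordered bag. It is only an illustration under stated assumptions, not a proposed algorithm: the function names are hypothetical, the [address] is compared char-for-char, leading and trailing whitespace in element text is ignored, and wsa:Metadata and extension elements on the EPR itself are ignored entirely, which is just one possible answer to (b) and (c).

```python
import xml.etree.ElementTree as ET
from collections import Counter

WSA = "http://www.w3.org/2005/08/addressing"  # WS-Addressing 1.0 namespace


def element_key(elem):
    """Hashable structural key: qualified name, attributes, text, children.

    Assumption: surrounding whitespace in text is not significant.
    Child order *inside* a single reference parameter is preserved.
    """
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(element_key(child) for child in elem),
    )


def reference_parameters(epr):
    """Return the reference parameters of an EPR as an unordered multiset."""
    container = epr.find(f"{{{WSA}}}ReferenceParameters")
    children = list(container) if container is not None else []
    return Counter(element_key(child) for child in children)


def eprs_match(xml_a, xml_b):
    """Hypothetical comparison: same [address], same bag of ref params.

    wsa:Metadata and EPR-level extensions are deliberately not consulted.
    """
    a, b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    address = f"{{{WSA}}}Address"
    if a.findtext(address, default="") != b.findtext(address, default=""):
        return False
    return reference_parameters(a) == reference_parameters(b)
```

With this sketch, two EPRs that differ only in the order of their reference parameters compare as matching, while a changed parameter value or a different [address] does not.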

Marc Goodner
Technical Diplomat
Microsoft Corporation
Tel: (425) 703-1903
Blog: http://spaces.msn.com/mrgoodner/ 


-----Original Message-----
From: Ashok Malhotra [mailto:ashok.malhotra@oracle.com] 
Sent: Wednesday, May 24, 2006 7:01 AM
To: Jonathan Marsh; Alastair Green
Cc: Anish Karmarkar; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i119: EPR comparisons overly restrictive

Jonathan:
sorry to be picky but you also need to specify the collation used for a
char-by-char comparison.  The XQuery F&O document recommends the Unicode
Codepoint Collation which seems right to me.
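
As a rough illustration of what a codepoint comparison does and does not treat as equal, here is a minimal Python sketch. It relies on the fact that Python compares str values by Unicode code point; the function names are hypothetical and nothing here is normative text for the spec.

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """True only if both strings are the same sequence of code points."""
    return a == b


def codepoint_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1, ordering by code point value."""
    return (a > b) - (a < b)


# Two spellings of the same visible text: precomposed vs. combining accent.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# Under a codepoint comparison these are different strings, so an EPR
# comparison built on it would report a false negative here.
differ = not codepoint_equal(composed, decomposed)

# Normalizing first (an extra step, not part of the codepoint collation)
# makes them equal.
same_after_nfc = codepoint_equal(
    unicodedata.normalize("NFC", decomposed), composed
)
```

The point of the example is that the codepoint collation performs no normalization, so it is simple and deterministic at the price of some false negatives.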

All the best, Ashok
 

> -----Original Message-----
> From: Jonathan Marsh [mailto:jmarsh@microsoft.com]
> Sent: Wednesday, May 24, 2006 6:44 AM
> To: Alastair Green
> Cc: Anish Karmarkar; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i119: EPR comparisons overly restrictive
> 
> That sounds workable to me.  My preference, having been involved in
> enough over-engineered standards work that was subsequently punished
> by the industry, remains for the minimum necessary to declare victory.
> A new mechanism that optionally facilitates the use of an optional
> optimization still seems like overkill to me.
> 
> And FWIW, I consider bit-for-bit (more precisely, char-for-char) as a
> respectable choice for a comparison algorithm.  It will generate a fair
> amount of false negatives but is dead simple to implement.
> 
> > -----Original Message-----
> > From: Alastair Green [mailto:alastair.green@choreology.com]
> > Sent: Tuesday, May 23, 2006 10:00 PM
> > To: Jonathan Marsh
> > Cc: Anish Karmarkar; ws-rx@lists.oasis-open.org
> > Subject: Re: [ws-rx] i119: EPR comparisons overly restrictive
> > 
> > Jonathan,
> > 
> > Looking over the wall from my usual habitat in TX, I've been following
> > all of this for a while, because of related issues dealt with in TX. A
> > quick thought follows. Apologies in advance if it duplicates or
> > misunderstands.
> > 
> > Perhaps you could walk right round this by defining an RX WS-A
> > extension element, which is of a clearly comparable type, and globally
> > unambiguous, e.g. absolute URI. "Interlocutor identity", "endpoint
> > equivalence marker" or some such.
> > 
> > If EPR A and EPR B (or EPR A') have such an element, both with value of
> > "foo" then you can send ack to EPR A on a message to EPR B. You could
> > re-use this same id for the client identification issue for
> > MakeConnection too: reply EPR = anon/polling + interlocutor id.
> > 
> > I think you're right, that in this particular context where false
> > negatives are just deoptimizations of the attempted optimization,
> > "guessing the EPR equivalence" will work, and will, by evolution, chuck
> > out the bad guessers (in reality, push most implementations to the
> > conservative case of EPR comparison, e.g. bit-for-bit for [address] and
> > [ref params], or indeed to abandon the optimization).
> > 
> > In the client id case for GetMessage/MakeConnection I think you have
> > the problem that false negatives are dangerous. If you misidentify a
> > request to respond as coming from e.g. an unknown client then the two
> > sides will freeze up.
> > 
> > Overall, there seems to be a latent concept of "party to a
> > conversation", which might usefully be surfaced.
> > 
> > Alastair
> > 
> > Jonathan Marsh wrote:
> > 
> > 	I think you and I are pretty close on this.  We seem to agree that it
> > 	is desirable to allow a range of options from no optimization, to a
> > 	simple bit-for-bit comparison (though there may even be flavors of
> > 	this), through a canonicalized approach, through to what I'll call
> > 	semantic equivalence.  An implementation should be free to choose the
> > 	level it wishes to invest in, with the understanding that the more you
> > 	invest, the more optimization dividends you receive.
> > 
> > 	We don't seem to agree on the value of providing a non-normative
> > 	snapshot of a comparison algorithm in this spec.  IMO, such an
> > 	algorithm is likely to miss the sweet spot, provide an illusion of
> > 	interoperability when in fact it's just documentation of an
> > 	implementation detail, consume a lot of TC time to work out details,
> > 	consume a lot of reader time too, and increase the burden of ongoing
> > 	maintenance of the spec.
> > 
> > 	I don't have a problem addressing the specific issue raised during
> > 	interop that reference parameters should be considered when
> > 	determining EPR equivalence.  I think the simplest way to address this
> > 	issue is by adding a simple warning to implementers along the lines
> > 	of: "Note that reference parameters should be considered when
> > 	determining EPR equivalence."  In this case the "very least" is
> > 	perfectly sufficient.
> > 
> > 
> > 
> > 		-----Original Message-----
> > 		From: Anish Karmarkar [mailto:Anish.Karmarkar@oracle.com]
> > 		Sent: Monday, May 22, 2006 7:25 PM
> > 		To: Jonathan Marsh
> > 		Cc: ws-rx@lists.oasis-open.org
> > 		Subject: Re: [ws-rx] i119: EPR comparisons overly restrictive
> > 
> > 		These are all very good points.
> > 		But from them I don't necessarily conclude that we should not
> > 		define an EPR comparison algorithm.
> > 
> > 		1) WRT false negatives: I don't think false negatives are
> > 		themselves a problem -- they result in one not being able to
> > 		utilize the optimization. I would hope that if we define a
> > 		comparison algorithm, it would be stated in a way that would
> > 		*not* prevent anyone from defining additional mechanisms to
> > 		figure out equivalence.
> > 		For example, if 2 EPRs are bit for bit the same then they are the
> > 		same. If they are not (in absence of an equivalence algorithm),
> > 		then one cannot make any statement about the equivalence (they
> > 		may be equivalent or not, one just doesn't know). There may be
> > 		additional metadata that tells me that they are (or not). For
> > 		example, I may know that http://www.w3.org and http://www.w3c.org
> > 		are the same. One should be able to add additional criteria
> > 		(beyond the ones that we define in WSRX, if we do at all) to
> > 		figure out equivalence.
> > 
> > 		2) WRT 3a and 3b: if we define a comparison algorithm, an
> > 		implementation is not (or should not be) forced to use it. It may
> > 		choose to ignore it (and only use, say, bit-for-bit comparison)
> > 		at the cost of extra messages and a saving on processing. OR if
> > 		it is willing to put additional resources to prevent extra
> > 		messages, it may use additional criteria on top of what we
> > 		define.
> > 
> > 		One piece of feedback that I got from the interop was that some
> > 		implementations ignore refps when figuring out equivalence --
> > 		this is quite bad for interop. If we don't define an equivalence
> > 		algorithm, at the very least we should include a warning about
> > 		EPR comparisons and pitfalls.
> > 
> > 		-Anish
> > 		--
> > 
> > 		Jonathan Marsh wrote:
> > 
> > 
> > 			I don't have any religious objection to defining EPR
> > 			comparison mechanisms in specific circumstances (as
> > 			WS-Addressing allows), but in this case I fail to see why
> > 			defining such a mechanism is a good thing.  In fact it seems
> > 			harmful.
> > 
> > 
> > 
> > 			1) Piggybacking is an optimization, although an important
> > 			one.  Failure to correctly determine that two EPRs are
> > 			equivalent for the purposes of piggybacking eliminates the
> > 			possibility to benefit from the optimization.  The nature of
> > 			the optimization is in reducing the number of messages sent
> > 			on the wire.
> > 
> > 
> > 
> > 			2) EPR comparison is tricky:
> > 
> > 			  a) URI comparison is notoriously difficult, unless the URI
> > 			is simply an identifier (e.g. namespace URI) and not
> > 			primarily intended to be dereferenced, in which case simple
> > 			string comparison suffices.  I don't believe this will
> > 			suffice for [address] property values.  The current
> > 			reference to 2396 is confusing and inadequate to describe
> > 			how URIs are to be compared.  See
> > 			http://www.ietf.org/rfc/rfc3987.txt section 5 for a more
> > 			thoughtful (and applicable) explanation of why there isn't a
> > 			single right mechanism.
> > 
> > 			  b) Comparing canonicalized representations of reference
> > 			parameters is also difficult.  There might be a broad set of
> > 			allowable manipulations to reference parameters that result
> > 			in false positives, e.g. annotating a ref param with an
> > 			extension attribute (even WS-A does that).
> > 
> > 
> > 
> > 			3) Despite the choices made above, both components of the
> > 			proposed EPR comparison can result in false negatives.
> > 			Trading off complexity of the comparison algorithm versus
> > 			the amount of optimization is a choice that should be left
> > 			to implementations.
> > 
> > 			  a) Canonicalization and full IRI normalization are fairly
> > 			expensive.  Some implementations may prefer a simpler
> > 			mechanism, with correspondingly higher false negative rates.
> > 
> > 			  b) Network messaging is fairly expensive.  An
> > 			implementation wishing to expend extra computational
> > 			resources to minimize the network traffic should be free to
> > 			implement as comprehensive a comparison mechanism as they
> > 			like.
> > 
> > 
> > 
> > 			A standardized EPR comparison mechanism writes into the
> > 			standard optimization choices that should be left to
> > 			implementers, and would force some implementations to become
> > 			needlessly (from their perspective) complex, and others to
> > 			dumb-down and miss opportunities for better optimizations.
> > 			Some seem to fear incorrectly piggybacking messages not
> > 			destined to the message recipient, which is again an
> > 			implementation issue (a bug) which the market will easily
> > 			sort out.
References:
- RE: [ws-rx] i119: EPR comparisons overly restrictive
  - From: "Jonathan Marsh" <jmarsh@microsoft.com>
- RE: [ws-rx] i119: EPR comparisons overly restrictive
  - From: "Ashok Malhotra" <ashok.malhotra@oracle.com>