A somewhat different approach would be to disallow "special" characters
in the portlet-provided values, such as navigational state.
This would mean that a portlet can only use, say, URL-allowed characters
(alphanumeric+), but not /, #, :, ?, =, ;, %, etc. (the full list is documented
in the corresponding RFC).
This will make it easy for the Consumer to process (no special
handling, no dependency on the Web server), and will also make it easy
for portlets and Producers (easy rules: the portlet can always do one level
of escaping and replace the % with, say, a dash (-)).
Escaping seems to be one of the more common sources of incompatibility,
and it may just be safer to sidestep it rather than spend a lot of time
ensuring compatibility across Producers, Consumers, and Web
servers.
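To make the "one level of escaping, then replace the %" idea concrete, here is a minimal sketch in Java. The class and method names (DashEscape, escape, unescape) are made up for illustration and are not part of any WSRP API. It assumes UTF-8 and that a literal dash is itself escaped first, so that every dash in the output stands for a %. The result contains only letters, digits, '.', '_', '*' and '-'.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class DashEscape {
    // URL-encode once, then swap '%' for '-' so no URL-special character remains.
    static String escape(String raw) {
        try {
            String once = URLEncoder.encode(raw, "UTF-8")
                    .replace("+", "%20")   // URLEncoder writes a space as '+'
                    .replace("-", "%2D");  // free up '-' so it can stand in for '%'
            return once.replace('%', '-');
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Reverse the swap, then URL-decode once.
    static String unescape(String escaped) {
        try {
            return URLDecoder.decode(escaped.replace('-', '%'), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String safe = escape("a//b");
        System.out.println(safe);            // a-2F-2Fb
        System.out.println(unescape(safe));  // a//b
    }
}
```

A value escaped this way has no "/" or "%" left for a Web server to decode or collapse, which is the point of the proposal.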
My two cents,
Eilon
Rich,
Unfortunately, the collapsing of // to / also happens within
parameters, not just to the separators, e.g. the navigationalState
parameter value written into the template by the producer is "a//b"
but is modified to "a/b" following normal URL path processing
rules.
The whole path is subjected to this collapsing. The consumer can protect itself
from this for its own parameter separators as you suggest (and even have
special smarts for wsrp-url values), but the producer should not need to
special case any "//" in its URL parameter values (wsrp-navigationalState
and wsrp-interactionState are the only difficult parameters). That is why
I'm suggesting doing a double encoding. We could instead suggest no
consecutive '/'s allowed in navigationalState and interactionState but that
seems very arbitrary.
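A small Java sketch of the problem, in case it helps. The collapse method is only a stand-in for what some Web servers do to a path, and the template layout and class name are hypothetical; only the wsrp- token names come from the spec.

```java
// Hypothetical demonstration, for illustration only.
class SlashCollapse {
    // Stand-in for Web server path normalization: runs of '/' become one '/'.
    static String collapse(String path) {
        return path.replaceAll("/{2,}", "/");
    }

    // Fill a (hypothetical) Consumer template that already uses "/_" separators.
    static String fill(String navState, String interactionState) {
        String template = "/portal/_ns_{wsrp-navigationalState}/_is_{wsrp-interactionState}";
        return template
                .replace("{wsrp-navigationalState}", navState)
                .replace("{wsrp-interactionState}", interactionState);
    }

    public static void main(String[] args) {
        String url = fill("a//b", "x");
        System.out.println(url);            // /portal/_ns_a//b/_is_x
        System.out.println(collapse(url));  // /portal/_ns_a/b/_is_x
    }
}
```

The separators survive, but the producer's value "a//b" comes back as "a/b".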
regards,
Andre
It would be a good editorial clarification to comment that all
parameter values should be URL encoded since they will appear in the URL
activated by an End-User interaction. Does anyone object to adding such a
comment?
As to the particular question about collapsing // into /, the Consumer can
easily prevent this by using a construct such as /_ to separate parameters
(or "/ns_" before the navigational state if so desired). I think we can
leave it as the Consumer's responsibility to properly construct its templates.
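As a rough illustration of the "/_" construct, the Java sketch below joins two parameter values into a path, once with a plain "/" separator and once with "/_". The class, the join helper, and the collapse stand-in for Web server behaviour are all hypothetical.

```java
// Hypothetical demonstration, for illustration only.
class TemplateSeparators {
    // Stand-in for Web server path normalization: runs of '/' become one '/'.
    static String collapse(String path) {
        return path.replaceAll("/{2,}", "/");
    }

    // Prefix every value with the separator and concatenate.
    static String join(String separator, String... values) {
        return separator + String.join(separator, values);
    }

    public static void main(String[] args) {
        // An empty first value followed by a second value.
        String plain = join("/", "", "view");
        String marked = join("/_", "", "view");
        System.out.println(collapse(plain));   // /view     (the empty slot is lost)
        System.out.println(collapse(marked));  // /_/_view  (both slots survive)
    }
}
```

With "/_" an empty value never produces "//" at a parameter boundary, so collapsing leaves the separators alone.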
Rich Thompson
Andre Kramer <andre.kramer@eu.citrix.com>
03/28/2003 10:48 AM
To: wsrp@lists.oasis-open.org
cc:
Subject: [wsrp] WSRP url parameters and URL Encoding questions
In 10.2.1.1.4.1 we advise that "wsrp-url" values be URL encoded.
However, we are silent on the remaining consumer rewriting tokens and
their producer URL writing counterparts. But the obvious thing to do is
to URL encode them all (i.e. if {wsrp-navigationalState} contains an
"&" or a "/" etc., then URL encode it, or encode it anyway just to be
safe).
Having URL encoded all parameters, for producer template URL
activation, the web server may even help out and do the URL decode for
you.
Furthermore, in order to support method GET, URL templates must
avoid query strings. One strategy is to use a path ("/") instead, but I
have found that (after the above helpful URL decode) some Web Servers
will replace any "//" with a "/"!
[A valid transformation for file
paths as, e.g. file path a///b == file path a/b, but not great if one is
encoding data as a path. We should not force consumers to use "#" or ";"
instead of "?", as these also have issues.]
Obviously this corrupts any (URL template) parameters that contain
consecutive slashes, and we cannot expect producers to know what URL
structure the consumer is using for its templates. Both the decode and the
collapsing of consecutive "/"s seem valid things for the Web server to do,
but they are interacting with our method=GET workaround. What could we
do?
The simplest solution seems to be to *double* URL encode
values when replacing a {wsrp-someparameter} in templates. By double
encode I mean {wsrp-paramValue}
= HttpUtility.UrlEncode(HttpUtility.UrlEncode(RawParamValue))
or {wsrp-paramValue} =
URLEncoder.encode(URLEncoder.encode(rawParamValue));
[A single URL
encode should be enough for consumer rewriting. The consumer can apply a
second encode on re-writing (if required).]
This double encode does
seem onerous at first, but has the advantage of being always safe and
independent of template schemes and usesMethodGet (as well as constant
for a Web Server environment & avoids us inventing a new encoding
scheme).
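Here is a short Java sketch of why the second encode helps. It assumes UTF-8 and models the Web server as "one URL decode, then collapse runs of /". That server model, the class, and the helper names are my own stand-ins, not anything from the spec.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demonstration, for illustration only.
class DoubleEncode {
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String doubleEncode(String raw) {
        return enc(enc(raw));
    }

    // Assumed Web server behaviour: decode once, then collapse "//" to "/".
    static String serverProcess(String pathPart) {
        return dec(pathPart).replaceAll("/{2,}", "/");
    }

    public static void main(String[] args) {
        String raw = "a//b";
        // Single encode: the server's decode exposes "//" and it is collapsed.
        System.out.println(serverProcess(enc(raw)));                // a/b
        // Double encode: the server's decode leaves "%2F%2F", nothing to collapse.
        String afterServer = serverProcess(doubleEncode(raw));
        System.out.println(afterServer);                            // a%2F%2Fb
        System.out.println(dec(afterServer));                       // a//b
    }
}
```

The receiver does one more decode than it otherwise would, and the value arrives intact whichever template scheme the consumer picked.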
What do people
think?
regards, Andre