URL encoding is left up to the
entity. The entity should also be responsible for decoding any
client parameters. The issue was raised that some server-side
APIs (IIS, for example) always URL-decode parameters, so some
consumer implementations may find it a hassle to pass undecoded parameters to the
producer. We need to decide whether the consumer should always pass
back the same URL-encoded parameters, and whether to prohibit the consumer from
decoding them.
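To illustrate why "pass back the same URL-encoded parameter" and "decode, then re-encode" are not equivalent, here is a minimal sketch using Python's standard urllib.parse. The sample values and the helper name decode_and_reencode are hypothetical and only for illustration; a real consumer stack may behave differently.

```python
from urllib.parse import quote, unquote


def decode_and_reencode(encoded: str) -> str:
    """Simulate a consumer that URL-decodes a parameter and encodes it again."""
    # quote() leaves "/" unescaped by default and always emits uppercase hex.
    return quote(unquote(encoded))


# Hypothetical parameter values as a producer might have emitted them.
escaped_slash = "a%2Fb"      # producer escaped the slash
lowercase_hex = "%e6%97%a5"  # producer used lowercase hex digits

# Neither value survives the round trip byte-for-byte.
slash_round_trip = decode_and_reencode(escaped_slash)  # "a/b"
hex_round_trip = decode_and_reencode(lowercase_hex)    # "%E6%97%A5"
```

Both results decode to the same characters as the originals, but a producer that compares or parses the raw encoded string would see a different value.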
[Jane Dynin] Let's assume for the moment that the
consumer is prohibited from decoding the parameters; that is, it does not decode them
and then re-encode them. Consider the following
scenario:
The producer creates a URL that has some parameters,
which include Japanese characters. The producer returns the data to the
consumer and decides to return everything in the Shift-JIS (Japanese)
codepage, so the parameters are encoded in Shift-JIS.
The consumer is doing the URL rewriting. Also, for
simplicity, the consumer only wants to talk UTF-8 to the producer. All
XML parsers must support UTF-8, so it's a logical choice for the consumer to
use UTF-8. Now the consumer must send a request to the producer that contains
those parameters. What does the consumer do? There are several
options:
1. We could prohibit this situation from happening
by allowing only ASCII characters in parameters.
2. We could say that the consumer-producer
interaction has to happen in UTF-8.
3. Consumer could figure out what encoding the
producer was using, somehow store it, and transcode the parameters from
Shift-JIS to UTF-8.
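The scenario and option 3 can be sketched in Python as follows. This is only an illustration under assumptions: the sample value and the helper names transcode_param and decodes_as_utf8 are hypothetical, and the sketch assumes the consumer has somehow learned that the producer used Shift-JIS.

```python
from urllib.parse import quote, unquote

value = "日本"  # hypothetical parameter value containing Japanese characters

# The same characters yield different percent-encoded strings per codepage.
sjis_param = quote(value, safe="", encoding="shift_jis")
utf8_param = quote(value, safe="", encoding="utf-8")


def transcode_param(encoded: str, source_encoding: str) -> str:
    """Option 3: decode using the producer's encoding, re-encode as UTF-8."""
    text = unquote(encoded, encoding=source_encoding, errors="strict")
    return quote(text, safe="", encoding="utf-8")


def decodes_as_utf8(encoded: str) -> bool:
    """Return True if the percent-encoded bytes form valid UTF-8."""
    try:
        unquote(encoded, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


transcoded = transcode_param(sjis_param, "shift_jis")
```

Here the Shift-JIS bytes happen to be invalid UTF-8, so a UTF-8-only receiver can at least detect the mismatch. That is not guaranteed for every byte sequence, which is why option 3 requires the consumer to track the producer's encoding explicitly.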
I think it would be much simpler if we chose option
2, at least for the 1.0 WSRP spec. There are lots of weird
internationalization issues that can creep up, and establishing that all
communication between the producer and the consumer has to happen in UTF-8
will definitely simplify matters. In addition, there is no extra burden
for the portlet writers, since XML has built-in support for UTF-8. UTF-8
is also backward compatible with 7-bit ASCII: files and strings that contain only
7-bit ASCII characters are byte-for-byte identical under ASCII and
UTF-8.
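The ASCII compatibility claim is easy to check. The following is a small sketch; the sample strings and the helper name same_bytes_in_ascii_and_utf8 are made up for illustration.

```python
def same_bytes_in_ascii_and_utf8(text: str) -> bool:
    """Return True if text is pure 7-bit ASCII and encodes identically in both."""
    if not text.isascii():
        return False  # cannot be represented in ASCII at all
    return text.encode("ascii") == text.encode("utf-8")


ascii_param = "mode=view&id=42"  # hypothetical ASCII-only parameter string
japanese_param = "日本"           # hypothetical non-ASCII parameter value

ascii_is_unchanged = same_bytes_in_ascii_and_utf8(ascii_param)
japanese_is_unchanged = same_bytes_in_ascii_and_utf8(japanese_param)

# Non-ASCII characters take multiple bytes in UTF-8 (three each here).
japanese_utf8_length = len(japanese_param.encode("utf-8"))
```

Under option 2, existing ASCII-only parameters would therefore need no change; only parameters with non-ASCII characters are affected by the choice of UTF-8.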