OASIS Mailing List Archives

xri message


Subject: Re: [xri] Media Type parameters

Gabe Wachob wrote:
> The problem is that these are media-type parameters, and are used in
> multiple places. It's specified in RFC 2045.
> It's interesting that these libraries are treating ; as a separator - the
> query string is a sort of no-man's land for specs. There was a sort of de
> facto spec because CGI specified how to parse the querystring for the
> purpose of passing parameters.
> The note you refer to (B.2.2 of the HTML 4.01 spec) is talking about how to
> encode a parameter list in HTML (though I think it may apply to XML as
> well). The use of ';' instead of & is NOT very common - so I wonder why you
> are seeing this behavior in PHP and Ruby... Granted, it's relatively rare to
> see ; in the querystring so it may just be that people don't run across this
> issue very often.
Good catch, Victor!

This is the sentence that we're concerned with in B.2.2 of HTML 4.01:

We recommend that HTTP server implementors, and in particular, CGI 
implementors support the use of ";" in place of "&" to save authors the 
trouble of escaping "&" characters in this manner.

So even though most user agents (I don't know of any that do!) do not 
encode form data using a semicolon as a separator, it is after all a 
recommendation from the W3C. Therefore, it comes as no surprise that PHP 
treats it so. That's why I was pushing for any non-ASCII character to 
be %-escaped.

As for PHP, by default the arg_separator.input value is "&", but in some 
distros you might see ";&", which will use either character to break the 
query string or POST data up.
If you run "php -i | grep arg_separator.input" you will probably see "&;".

> While I'm deeply sympathetic to implementation issues, I'm really worried
> about making the spec change like this (we'd have to change the way we
> communicate parameters to media types) - besides breaking current
> implementations, I'm worried about the complication of carrying around these
> media type parameters separately from the media types themselves (for
> example, in the Accept: header of an HTTP request). 

I agree that it is probably too late to change it now, though this is 
the opposite case of allowing bare "+" on the URL query string in 
addition to %2B. In the "+" case, parsing it as a space character is 
legacy behavior and is not supported by any standard. While it is 
unlikely that browsers will suddenly switch to ";", God knows if 
someone may come up with a <form separate=";"> attribute to hint to the 
browser to encode parameters with ";". Or we may see more server-side 
languages start to support it, breaking our stuff.
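For comparison, the "+" case can be sketched with Python's standard library, which offers both decoding behaviors. The media type value is only an illustrative assumption.

```python
from urllib.parse import unquote, unquote_plus

# Illustrative value containing a bare "+".
bare_plus = "application/xrds+xml"
escaped_plus = "application/xrds%2Bxml"

# Form-style decoding turns a bare "+" into a space.
form_decoded = unquote_plus(bare_plus)

# Plain percent-decoding leaves a bare "+" alone.
plain_decoded = unquote(bare_plus)

# The %2B form decodes to "+" under either function.
escaped_form_decoded = unquote_plus(escaped_plus)
escaped_plain_decoded = unquote(escaped_plus)

print(repr(form_decoded), repr(plain_decoded))
print(repr(escaped_form_decoded), repr(escaped_plain_decoded))
```

Only the %2B form comes out the same regardless of which decoder the server happens to use.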

While I really hate to see the QXRI getting uglier than it already is, I 
would vote for it to be %-escaped.

This should be a backwards-compatible change.
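A minimal Python sketch of why %-escaping helps: once the value is escaped, a parser that splits on "&" only and one that splits on "&" or ";" agree on the result. The split_query helper, the _xrd_r parameter name, and the media type value are illustrative assumptions.

```python
import re
from urllib.parse import quote, unquote


def split_query(query, separators="&"):
    """Split on any character in `separators`, then percent-decode (hypothetical helper)."""
    pattern = "[" + re.escape(separators) + "]"
    pairs = []
    for field in re.split(pattern, query):
        if not field:
            continue  # ignore empty fields
        name, _, value = field.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs


# Illustrative media type with one parameter.
media_type = "application/xrds+xml;sep=true"

# safe="" escapes every reserved character, including "/", "+", ";" and "=".
escaped_value = quote(media_type, safe="")
escaped_query = "_xrd_r=" + escaped_value

# Both separator configurations now recover the same single parameter.
strict_result = split_query(escaped_query, "&")
lenient_result = split_query(escaped_query, "&;")

print(escaped_query)
print(strict_result == lenient_result)
```

Since a conforming parser percent-decodes values anyway, the escaped form decodes to the same media type string as before, which is the sense in which the change is backwards compatible.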

