Subject: Re: [wsrp-wsia] [change request #143] Properly encode '&' in examples and BNF
I think I've got it, you are saying that "Query strings in well-formed
documents must use &amp; to separate name/value pairs; the use of & is
not correct and it may break some XML clients". Correct?

Michael Freedman wrote:

> What if the content being returned to the consumer is an XML document
> that contains the link in your sample? Won't the resulting XML
> document be invalid -- i.e. things would fail if/when you ran an XML
> processor over it -- for example to apply a stylesheet? I.e. in your
> example if the consumer/user-agent treats the markup you show as an
> XML document vs. an HTML document then the content is invalid because
> you haven't used &amp; vs. the &.
> -Mike-
>
> Alejandro Abdelnur wrote:
>
>> I don't get it, are you saying that the consumer rewriting happens
>> before you get the content from the WS stack? That's odd.
>>
>> Doing a step by step I don't see a problem:
>>
>> A portlet creates content using XML escaping rules only if it wants
>> to use a special XML character (<, >, &) and it wants it to be
>> displayed [instead of being interpreted as an XML special character]
>> by the user agent. If the stack needs to encode this to send it over
>> the wire (producer to consumer), the stack will decode it.
>>
>> The portlet creates the following content:
>>
>> <B>Hello, <A
>> HREF="wsrp-rewrite?wsrp-urlType=action&message=greetings/wsrp-rewrite">click
>> here!</A></B>
>>
>> The producer WS stack XML escapes it:
>>
>> &lt;B&gt;Hello, &lt;A
>> HREF="wsrp-rewrite?wsrp-urlType=action&amp;message=greetings/wsrp-rewrite"&gt;click
>> here!&lt;/A&gt;&lt;/B&gt;
>>
>> The producer sends the content to the consumer. The consumer WS stack
>> decodes it back to what the portlet originally created. The consumer
>> looks for templates to rewrite 'wsrp-rewrite? .... /wsrp-rewrite',
>> and replaces the template with a well formed URL:
>>
>> <B>Hello, <A
>> HREF="http://foo.com?target=myPortlet&wsrp-urlType=action&message=greetings">click
>> here!</A></B>
>>
>> Consumer creates portal page with this content and sends it back to
>> user-agent without further escaping/encoding.
>>
>> The portlet is responsible for doing URL escaping (using the %HH) for
>> characters that have special meaning or are not valid in the
>> querystring. For example, if you want the user agent and the server
>> receiving the request to see & as a regular character instead of
>> interpreting it as a name/value separator you use %26 instead of &.
>> But this is not affected by the XML encoding done by the WS stack.
>>
>> On the way back the URL is processed correctly by the consumer and it
>> may undergo an XML encoding when going from consumer to producer but
>> this is done by the WS stack.
>>
>> Alejandro
>>
>> Michael Freedman wrote:
>>
>>> What you are missing is the difference between what is transported
>>> over the wire and what the consumer/client sees. Yes, the & is
>>> transformed by the underlying stack to &amp; to carry in the soap
>>> message -- however on the other side the consumer/client sees the
>>> value as merely &. Though we could claim that consumer rewriting
>>> occurs before other processing and hence could replace the & with an
>>> &amp;, what do we do in the template case? Once we account for
>>> &amp; in producer templates shouldn't we be consistent with consumer
>>> rewriting? Finally note, though I suggested we merely require &amp;
>>> use rather than both we will need to think this through carefully.
>>> There is a [slight] performance impact on supporting both but we
>>> need to ensure that &amp; is valid for all document types/browsers
>>> whether the document type be XML based/related or not. If we can't
>>> convince ourselves of this then we will probably need to support
>>> both forms -- the semantics are easy for the consumer -- merely pass
>>> on what you receive [assume the producer did the right thing].
>>> Producer templates however get ugly as it would seem we would need
>>> to pass an XML friendly form and regular form, doubling the number
>>> of templates we carry.
>>> -Mike-
>>>
>>> Alejandro Abdelnur wrote:
>>>
>>>> Wouldn't this be taken care of by the XML encoding that happens
>>>> when you put the content into the SOAP response? Same as the < and
>>>> > ? What am I missing?
>>>>
>>>> Rich Thompson wrote:
>>>>
>>>>> Document: Spec
>>>>> Section: 10.2.1 and 10.2.2
>>>>> Page/Line: 58/15 and 61/24
>>>>> Requested by: Mike Freedman
>>>>> Old text:
>>>>> wsrp-rewrite?wsrp-urlType=value&name1=value1&name2=value2.../wsrp-rewrite
>>>>>
>>>>> New text:
>>>>> wsrp-rewrite?wsrp-urlType&amp;name1=value1&amp;name2=value2 .../wsrp-rewrite
>>>>>
>>>>> Reasoning: If the content containing the wsrp-rewrite is XML then
>>>>> the use of & makes it an invalid document. Rather the & must be
>>>>> expressed as &amp;. We should just make this form the standard use
>>>>> rather than supporting both flavors as supporting both has a
>>>>> negative impact on the implementation/performance of the consumer
>>>>> rewrite code -- something that needs to stay as efficient as
>>>>> possible. Note: we should also change the producer template URL
>>>>> to use &amp; as we don't know the content the producer will
>>>>> produce. Finally, we should change the BNF. FYI ... what follows
>>>>> is a brief paragraph from the XHTML spec explaining the above
>>>>> problem:
>>>>>
>>>>> C.12. Using Ampersands in Attribute Values (and Elsewhere)
>>>>> In both SGML and XML, the ampersand character ("&") declares the
>>>>> beginning of an entity reference (e.g., &reg; for the registered
>>>>> trademark symbol "®"). Unfortunately, many HTML user agents have
>>>>> silently ignored incorrect usage of the ampersand character in
>>>>> HTML documents - treating ampersands that do not look like entity
>>>>> references as literal ampersands. XML-based user agents will not
>>>>> tolerate this incorrect usage, and any document that uses an
>>>>> ampersand incorrectly will not be "valid", and consequently will
>>>>> not conform to this specification. In order to ensure that
>>>>> documents are compatible with historical HTML user agents and
>>>>> XML-based user agents, ampersands used in a document that are to
>>>>> be treated as literal characters must be expressed themselves as
>>>>> an entity reference (e.g. "&amp;"). For example, when the href
>>>>> attribute of the a element refers to a CGI script that takes
>>>>> parameters, it must be expressed as
>>>>> http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user
>>>>> rather than as
>>>>> http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user.
>>>>>
>>>>> ----------------------------------------------------------------
>>>>> To subscribe or unsubscribe from this elist use the subscription
>>>>> manager: <http://lists.oasis-open.org/ob/adm.pl>
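
The step-by-step flow discussed in this thread can be sketched in Python. This is only an illustration under stated assumptions, not the algorithm from the WSRP specification: BASE_URL, soap_round_trip, consumer_rewrite and is_well_formed are hypothetical names, the regular expression is a simplification of the wsrp-rewrite BNF, and the tags are lowercase so that the XML parser sees matching element names. It suggests that the escaping done by the WS stack is undone on the consumer side, so a bare & written by the portlet is still bare when the consumer assembles the page.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Hypothetical consumer URL, modeled on the example in the thread.
BASE_URL = "http://foo.com?target=myPortlet"

# Simplified matcher for 'wsrp-rewrite? ... /wsrp-rewrite' (not the full BNF).
TEMPLATE = re.compile(r"wsrp-rewrite\?(.*?)/wsrp-rewrite")


def soap_round_trip(markup: str) -> str:
    """Escape for the wire, then decode on the other side, as a WS stack would."""
    return unescape(escape(markup))


def consumer_rewrite(markup: str, sep: str) -> str:
    """Replace each template with a URL, joining consumer parameters with sep."""
    return TEMPLATE.sub(lambda m: BASE_URL + sep + m.group(1), markup)


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Portlet markup with a bare & and with &amp; as the parameter separator.
raw = ('<b>Hello, <a href="wsrp-rewrite?wsrp-urlType=action'
       '&message=greetings/wsrp-rewrite">click here!</a></b>')
escaped = raw.replace("&message", "&amp;message")

for label, markup, sep in (("bare &", raw, "&"), ("&amp;", escaped, "&amp;")):
    received = soap_round_trip(markup)  # what the consumer sees after decoding
    page = consumer_rewrite(received, sep)
    print(label, "->", "well-formed" if is_well_formed(page) else "rejected")
```

In this sketch the round trip through the stack returns each input unchanged, which matches Mike's point that the consumer sees "merely &". Only the variant in which the portlet itself writes &amp; yields a page that the XML parser accepts.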