Subject: Xref escaping rules
Here's the escaping proposal that I brought up on the last TC call.

The issue is that an XRI or URI used as a cross-reference may contain characters that must be escaped in order to conform with RFC 2396. Specifically:

* "?" may not appear until the start of the one-and-only query segment allowed by 2396.
* The same is true of "#" for fragments.
* Lastly, an UNMATCHED opening or closing paren must be escaped or else the cross-reference will not parse correctly. (Note that MATCHED opening and closing parens inside the cross-reference do not need escaping, because they represent either a second-level nested cross-reference or a parenthetically enclosed string in a native URI, and either way they are not ambiguous.)

However, we can't use just the 2396 escaping rules (i.e., "%xx", where xx is the hex value of the character) for these four characters, because if the embedded XRI or URI contained those same escape sequences natively, it would be ambiguous which ones to unescape when the cross-reference is extracted. For example, if a URI already contained the escape sequence %3F (the "?" character) before it was turned into a cross-reference, how would the parser know NOT to unescape it when extracting the cross-reference?

The proposed solution is to apply two special escaping rules to any URI/XRI string being embedded as a cross-reference. They must be applied in the following order:

1) Parse the string to find any instance of "%3F", "%23", "%28", or "%29" and enclose each one in parens. Example: "(%3F)".
2) Escape any "?", "#", or UNMATCHED paren as per 2396 escaping. Example: a URI that contained a query would have the "?" turned into %3F.

To extract a cross-reference, apply the same two rules in reverse, i.e.:

1) In the extracted cross-reference, unescape any instance of %3F, %23, %28, or %29 EXCEPT those enclosed in parens.
2) For any instance of "%3F", "%23", "%28", or "%29" enclosed in parens, remove the parens.
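The embedding and extraction rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the proposal: the function names (escape_xref, unescape_xref, embed) are made up for the example, only upper-case hex sequences are recognized, the first extraction step is read as unescaping, and extraction handles both steps in a single pass so that a sequence whose parens were just removed is never unescaped by mistake.

```python
import re

# The four characters covered by the proposal and their RFC 2396 escapes.
ESCAPES = {"?": "%3F", "#": "%23", "(": "%28", ")": "%29"}

# Assumption: only upper-case hex sequences are treated as special.
NATIVE = re.compile(r"%(?:3F|23|28|29)")
# Either a paren-wrapped escape (group 1) or a bare escape (group 2).
WRAPPED_OR_BARE = re.compile(r"\((%(?:3F|23|28|29))\)|(%(?:3F|23|28|29))")


def unmatched_parens(s):
    """Return the indexes of every paren in s that has no partner."""
    stack, unmatched = [], set()
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if stack:
                stack.pop()
            else:
                unmatched.add(i)  # closing paren with nothing open
    unmatched.update(stack)       # opening parens that were never closed
    return unmatched


def escape_xref(value):
    """Apply embedding rules 1 and 2, in that order."""
    # Rule 1: wrap native escape sequences in parens, e.g. "(%3F)".
    wrapped = NATIVE.sub(lambda m: "(" + m.group(0) + ")", value)
    # Rule 2: escape "?", "#", and any unmatched paren.
    bad = unmatched_parens(wrapped)
    return "".join(
        ESCAPES[ch] if ch in "?#" or i in bad else ch
        for i, ch in enumerate(wrapped)
    )


def unescape_xref(value):
    """Reverse the embedding rules in one left-to-right pass."""
    def restore(m):
        if m.group(1):                       # wrapped: just drop the parens
            return m.group(1)
        return chr(int(m.group(2)[1:], 16))  # bare: unescape the character
    return WRAPPED_OR_BARE.sub(restore, value)


def embed(parent, value):
    """Append value to parent as a parenthesized cross-reference."""
    return parent + "(" + escape_xref(value) + ")"
```

Calling embed("xri:@foo/baz/", "http://foo.com/?id=21-%28widget#bar") with this sketch yields "xri:@foo/baz/(http://foo.com/%3Fid=21-(%28)widget%23bar)". One case the sketch does not resolve: an input that already contains a literal "(?)" embeds to "(%3F)", which cannot be told apart from a wrapped native escape on extraction.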
Example:

Embed: URI "http://foo.com/?id=21#bar" in XRI "xri:@foo/baz/"
Result: "xri:@foo/baz/(http://foo.com/%3Fid=21%23bar)"

Embed: URI "http://foo.com/?id=21-%28widget#bar" in XRI "xri:@foo/baz/"
Result: "xri:@foo/baz/(http://foo.com/%3Fid=21-(%28)widget%23bar)"

Does this work for everyone? Any other proposals?

=Drummond