OASIS Mailing List Archives

xri-editors message



Subject: Xref escaping rules


Here's the escaping proposal that I brought up on the last TC call.
 
The issue is that an XRI or URI used as a cross-reference may contain
characters that must be escaped in order to conform to RFC 2396.
Specifically:
 
* "?" may not appear until the start of the one-and-only query segment
allowed by 2396.
* The same is true with "#" for fragments
* Lastly, an UNMATCHED opening or closing paren must be escaped or else
the cross-reference will not parse correctly. (Note that MATCHED opening
and closing parens inside do not need escaping because they represent
either a second-level nested cross-reference or a parenthetically
enclosed string in a native URI but either way will not be ambiguous.)
 
However, we can't use just the 2396 escaping rules (i.e., "%xx", where
xx is the hex value of the character) for these four characters ("?",
"#", "(", and ")"), because if the embedded XRI or URI contained those
same escape sequences natively, it would be ambiguous which ones to
unescape and which to leave alone when the cross-reference is
extracted. For example, if a URI already contained the escape sequence
%3F (the "?" character) before it was turned into a cross-reference,
how would the parser know NOT to unescape it when extracting the
cross-reference?
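
To make the ambiguity concrete, here is a small Python sketch. The
helper names (naive_escape, naive_unescape) and the sample URI are
made up for illustration, and only "?" and "#" are handled; it simply
shows that plain "%xx" escaping does not round-trip when the URI
already contains %3F.

```python
# Hypothetical helpers: plain RFC 2396-style escaping of "?" and "#" only.
SPECIAL = {"?": "%3F", "#": "%23"}

def naive_escape(uri: str) -> str:
    """Escape every "?" and "#" as %xx, with no extra protection."""
    for ch, esc in SPECIAL.items():
        uri = uri.replace(ch, esc)
    return uri

def naive_unescape(xref: str) -> str:
    """Reverse naive_escape by unescaping every %3F and %23."""
    for ch, esc in SPECIAL.items():
        xref = xref.replace(esc, ch)
    return xref

# Illustrative URI that already contains a native %3F before the query.
original = "http://foo.com/a%3Fb?c"
embedded = naive_escape(original)      # native and new %3F now look alike
recovered = naive_unescape(embedded)   # both get unescaped on extraction

print(embedded)
print(recovered)
print(recovered == original)
```

After escaping, the native %3F and the one produced from "?" are
indistinguishable, so the extracted string differs from the original.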
 
The proposed solution is to apply two special escaping rules to any
URI/XRI string being embedded as a cross-reference. They must be applied
in the following order:
 
1) Parse the string to find any instance of "%3F", "%23", "%28", or
"%29" and enclose these in parens. Example: "(%3F)".
2) Escape any "?", "#", or UNMATCHED paren as per 2396 escaping.
Example: a URI that contained a query would have the "?" turned into
%3F.
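
The two embedding rules above can be sketched in Python as follows.
This is only an illustration, not a reference implementation: the
function names are hypothetical, matching the escape sequences
case-insensitively is my assumption (the proposal does not say), and
"unmatched" is decided with a simple stack-based paren match.

```python
import re

# Rule 1 targets: native escapes of "?", "#", "(" and ")".
# Assumption: hex digits are matched case-insensitively.
NATIVE_ESCAPE = re.compile(r"%(?:3F|23|28|29)", re.IGNORECASE)

def unmatched_paren_indices(s: str) -> set:
    """Return the positions of parens in s that have no partner."""
    stack, unmatched = [], set()
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if stack:
                stack.pop()          # closes the most recent open paren
            else:
                unmatched.add(i)     # closing paren with nothing to close
    unmatched.update(stack)          # open parens that were never closed
    return unmatched

def escape_for_xref(uri: str) -> str:
    """Apply rule 1, then rule 2, to a string about to be embedded."""
    # Rule 1: wrap native %3F, %23, %28, %29 in parens, e.g. "(%3F)".
    protected = NATIVE_ESCAPE.sub(lambda m: "(" + m.group(0) + ")", uri)
    # Rule 2: escape "?", "#", and any unmatched paren.
    bad = unmatched_paren_indices(protected)
    out = []
    for i, ch in enumerate(protected):
        if ch == "?":
            out.append("%3F")
        elif ch == "#":
            out.append("%23")
        elif i in bad:
            out.append("%28" if ch == "(" else "%29")
        else:
            out.append(ch)
    return "".join(out)

def embed(xri_prefix: str, uri: str) -> str:
    """Append uri to xri_prefix as a parenthesized cross-reference."""
    return xri_prefix + "(" + escape_for_xref(uri) + ")"

print(embed("xri:@foo/baz/", "http://foo.com/?id=21-%28widget#bar"))
```

Because the parens added by rule 1 always form an adjacent matched
pair, they are never mistaken for unmatched parens in rule 2.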
 
To extract a cross-reference, apply the same two rules in reverse, i.e.:
 
1) From the extracted cross-reference, unescape any instance of %3F,
%23, %28, or %29 EXCEPT those enclosed in parens.
2) For any instance of "%3F", "%23", "%28", or "%29" enclosed in parens,
remove the parens.
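
A possible Python sketch of the extraction side, reading rule 1 as
unescaping (the reverse of the embedding step). The function name is
hypothetical, case-insensitive matching is my assumption, and the
input is taken to be the text between the outer parens of the
cross-reference. Both rules are applied in a single left-to-right
pass, which is an implementation choice rather than part of the
proposal.

```python
import re

# One pattern covers both rules; the parenthesized form is tried first.
#   group 1: a protected native escape such as "(%3F)"  -> rule 2
#   group 2: a bare escape such as "%3F"                -> rule 1
TOKEN = re.compile(r"\((%(?:3F|23|28|29))\)|%(3F|23|28|29)", re.IGNORECASE)
DECODE = {"3F": "?", "23": "#", "28": "(", "29": ")"}

def unescape_from_xref(body: str) -> str:
    """Recover the original URI/XRI from a cross-reference body."""
    def restore(m):
        if m.group(1) is not None:
            return m.group(1)              # rule 2: drop the parens only
        return DECODE[m.group(2).upper()]  # rule 1: unescape bare sequence
    return TOKEN.sub(restore, body)

print(unescape_from_xref("http://foo.com/%3Fid=21-(%28)widget%23bar"))
```

One caveat: a source string that natively contains something like
"(?)" would be escaped to "(%3F)", which this sketch cannot tell apart
from a protected native %3F. The rules as written do not cover that
case, so the sketch does not attempt to either.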
 
Example: 
 
Embed: URI "http://foo.com/?id=21#bar" in XRI "xri:@foo/baz/"
Result: "xri:@foo/baz/(http://foo.com/%3Fid=21%23bar)"
 
 
Embed: URI "http://foo.com/?id=21-%28widget#bar" in XRI "xri:@foo/baz/"
Result: "xri:@foo/baz/(http://foo.com/%3Fid=21-(%28)widget%23bar)"
 
Does this work for everyone? Any other proposals?
 
=Drummond 
 
 
 
 

