
xri message



Subject: RE: [xri] XRI Escaping Tool


OK, couple of things.

First of all, this escaping mechanism expects absolute XRIs. It will add the xri:// if you omit it.
 
Second of all, you have an *illegal* space in the authority.
 
Third of all, I fixed the underlying problem. I am not working from a spec for how Unicode is encoded in HTML form submissions (mostly because I don't know where one would exist), so I am working by trial and error. I think I have it now. I'd be *REALLY* curious to see if anyone disagrees with me on Drummond's example: a Unicode character that is in the lower 256 code points but not in the lower 128, so browsers encode it with a different, single-byte scheme that extends ASCII rather than with UTF-8. In short, is "U+00E7" supposed to be encoded as "%C3%A7" (as I understand it according to Unicode and UTF-8), or as "%E7", which is what browsers produce when escaping (and which is NOT UTF-8)?
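
For reference, the two candidate encodings of U+00E7 can be reproduced in a few lines. This is only a sketch in present-day Python 3 (not the Python 2.3.3 the tool runs on), using the standard library's urllib.parse.quote. The variable names are illustrative, and ISO-8859-1 is assumed as the single-byte encoding the browser used.

```python
from urllib.parse import quote

ch = "\u00e7"  # LATIN SMALL LETTER C WITH CEDILLA

# Percent-encode the UTF-8 bytes of the character (two bytes: C3 A7).
utf8_form = quote(ch, encoding="utf-8")

# Percent-encode the ISO-8859-1 (Latin-1) byte of the character (one byte: E7).
# Assumption: Latin-1 stands in for whatever single-byte encoding the browser chose.
latin1_form = quote(ch, encoding="latin-1")

print(utf8_form)    # %C3%A7
print(latin1_form)  # %E7
```

The same character yields two different escaped forms depending on which byte encoding is applied before percent-escaping, which is exactly the ambiguity in question.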
 
Once again, THANKS.
 
    -Gabe


From: Drummond Reed [mailto:drummond.reed@cordance.net]
Sent: Friday, February 18, 2005 6:24 PM
To: Wachob, Gabe; xri@lists.oasis-open.org
Subject: RE: [xri] XRI Escaping Tool

Cool!

 

I tried it on the string "@ a-la-française" and got the error below.

 

=Drummond

 

 


 
UnicodeDecodeError

Python 2.3.3: /usr/local/bin/python
Fri Feb 18 21:21:55 2005

A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred.

 /usr/www/users/gwachob/xriescape/escape.cgi

   24         punycode=False
   25 
   26 xri=unicode(form["xri"].value, 'utf-8')
   27 xri=re.sub("&#x?\d+;", convertref,xri)
   28 print "Content-Type: text/html;charset=utf-8\n\n"

xri undefined, builtin unicode = <type 'unicode'>, form = FieldStorage(None, None, [MiniFieldStorage('xri'...e'), MiniFieldStorage('Submit', 'Submit Query')]), ].value = [MiniFieldStorage('xri', '@a-la-fran\xe7aise'), MiniFieldStorage('Submit', 'Submit Query')]

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-12: invalid data
      args = ('utf8', '@a-la-fran\xe7aise', 10, 13, 'invalid data')
      encoding = 'utf8'
      end = 13
      object = '@a-la-fran\xe7aise'
      reason = 'invalid data'
      start = 10
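
The failure in the traceback comes from decoding a single 0xE7 byte as UTF-8. One possible way to tolerate such input, sketched here in present-day Python 3, is to try UTF-8 first and fall back to ISO-8859-1. This is not necessarily how the tool was fixed; decode_form_value is a hypothetical helper, and the choice of Latin-1 as the fallback is an assumption.

```python
def decode_form_value(raw: bytes) -> str:
    """Decode a submitted form value, trying UTF-8 first (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: input that is not valid UTF-8 was sent as ISO-8859-1.
        # Latin-1 maps every byte to a code point, so this branch cannot fail.
        return raw.decode("latin-1")


raw = b"@a-la-fran\xe7aise"  # the bytes shown in the traceback
text = decode_form_value(raw)
print(text)
```

A fallback like this is a guess: a Latin-1 byte sequence that happens to be valid UTF-8 would be decoded as UTF-8, so declaring the form's character encoding explicitly is the more reliable approach where possible.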

 

 


From: Wachob, Gabe [mailto:gwachob@visa.com]
Sent: Friday, February 18, 2005 4:15 PM
To: xri@lists.oasis-open.org
Subject: [xri] XRI Escaping Tool

 

For my own understanding, and as a service to the XRI community, I have written a tool to implement what I understand to be the XRI to URI transformation rules. If you input an XRI with Unicode (or just ASCII) characters, it will display the IRI-normal and URI-normal forms. This obviously requires a UTF-8-capable OS and browser, which is most of them these days. Tested on Firefox 1.0 and IE6.
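
As a rough illustration of the last step only (IRI-normal to URI-normal), the sketch below percent-encodes the UTF-8 bytes of every non-ASCII character, which is the general IRI-to-URI mapping. It is written in present-day Python 3, iri_to_uri is a hypothetical name, and it deliberately ignores the XRI-specific escaping rules (for example inside cross-references) that the tool itself has to handle.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 bytes of each non-ASCII character (sketch only)."""
    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII characters are passed through unchanged in this sketch.
            out.append(ch)
        else:
            # Each UTF-8 byte becomes one %HH triplet with uppercase hex digits.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


print(iri_to_uri("xri://@a-la-fran\u00e7aise"))  # xri://@a-la-fran%C3%A7aise
```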

 

 

Please do try it. I've implemented this using the Python Unicode facilities and the escaping rules as I interpret them. I would like everyone to try this out themselves and please report back a) if the results are what you expect and b) if you see any hiccups.

 

I think one of the most complicated facets of XRI syntax is the escaping rules, but they are critical for interoperability, so please do hammer away.

 

    -Gabe

 

P.S. I will, of course, publish source code once I get a feeling that it's producing results that seem to be agreeable to everyone.


__________________________________________________
gwachob@visa.com
Chief Systems Architect
Technology Strategies and Standards
Visa International
Phone: +1.650.432.3696   Fax: +1.650.554.6817

 


