OASIS Mailing List Archives

uddi-spec message


Subject: [no subject]



I would like to add a discussion of the following issue to the meeting
agenda.

In implementing the V3 registry, we have found that discoveryURL and
overviewURL could cause interoperability issues, because implementations
currently differ in how they perform schema assessment on the XML Schema
datatype anyURI.

The XML Schema specification says that a value is lexically valid after
applying the algorithm from XLink, which could mean that certain
characters outside of US-ASCII are valid as long as they can be escaped
later per XLink Section 5.4.

3.2.17.1 Lexical representation
The ·lexical space· of anyURI is finite-length character sequences which,
when the algorithm defined in Section 5.4 of [XML Linking Language] is
applied to them, result in strings which are legal URIs according to
[RFC 2396], as amended by [RFC 2732].

from XLink:

Some characters are disallowed in URI references, even if they are allowed
in XML; the disallowed characters include all non-ASCII characters, plus
the excluded characters listed in Section 2.4 of [IETF RFC 2396], except
for the number sign (#) and percent sign (%) and the square bracket
characters re-allowed in [IETF RFC 2732]. Disallowed characters must be
escaped as follows:
1. Each disallowed character is converted to UTF-8 [IETF RFC 2279]
   as one or more bytes.
2. Any bytes corresponding to a disallowed character are escaped
   with the URI escaping mechanism (that is, converted to %HH, where HH is
   the hexadecimal notation of the byte value).
3. The original character is replaced by the resulting character
   sequence.

The result of the indirection in this definition is that some
implementations of anyURI accept only the characters allowed by RFC 2396
as amended by RFC 2732, whereas others also accept Unicode characters
that would yield valid RFC 2396 URIs once processed by the algorithm in
XLink.

The XML Schema group has acknowledged that, for I18N reasons, the schema
allows Unicode characters in anyURI and that for now clients should
transform them before accessing or invoking the resource. I would like to
know whether the UDDI TC desires this flexibility as well. If the UDDI TC
wants the client to be able to specify Unicode without escaping the
non-ASCII characters, it may be beneficial for short-term interoperability
of UDDI implementations to change to the string datatype, as is already
the case with access points. Another option would be to place a
post-schema-assessment restriction requiring the publisher to escape the
URIs per RFC 2396 prior to publication.

Hopefully we will have time to discuss this in the meeting.

Thanks.

Andrew Hately
IBM Austin
UDDI Development, Emerging Technologies
Lotus Notes: Andrew Hately/Austin/IBM@IBMUS
Internet: hately@us.ibm.com
(512) 838-2866,  t/l 678-2866
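For reference, the three XLink escaping steps quoted in the message can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names (is_disallowed, xlink_escape, is_already_escaped) are hypothetical and come from neither UDDI nor XLink, the excluded ASCII set is my reading of RFC 2396 Section 2.4.3, and the use of uppercase hex digits is an arbitrary choice. The last helper shows what a post-schema-assessment check for "already escaped" might look like.

```python
# Sketch of the XLink Section 5.4 escaping quoted in the message.
# All function names here are hypothetical, chosen for illustration.

# ASCII characters excluded by RFC 2396 Section 2.4.3 (space, delims,
# unwise), minus '#', '%', '[' and ']', which the XLink text leaves alone.
_EXCLUDED_ASCII = set('<>"{}|\\^` ')


def is_disallowed(ch: str) -> bool:
    """Return True if the quoted XLink text requires escaping this character."""
    code = ord(ch)
    return (
        code > 0x7F          # all non-ASCII characters
        or code <= 0x1F      # ASCII control characters
        or code == 0x7F      # DEL
        or ch in _EXCLUDED_ASCII
    )


def xlink_escape(value: str) -> str:
    """Steps 1-3: UTF-8 encode each disallowed character, write every
    resulting byte as %HH, and substitute that for the original character."""
    out = []
    for ch in value:
        if is_disallowed(ch):
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def is_already_escaped(value: str) -> bool:
    """True if the value has no disallowed characters, so escaping it
    would leave it unchanged (one possible publisher-side check)."""
    return not any(is_disallowed(ch) for ch in value)


if __name__ == "__main__":
    raw = "http://example.org/café menu"
    print(xlink_escape(raw))          # http://example.org/caf%C3%A9%20menu
    print(is_already_escaped(raw))    # False
```

A registry that kept anyURI but added the publication-time restriction could reject values for which is_already_escaped returns False, while a lenient client could call xlink_escape before dereferencing a discoveryURL or overviewURL. Note that this sketch checks only for disallowed characters; it does not verify that the result is a well-formed URI.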

