Subject: Re: [relax-ng] Re: RFC2518 (WebDAV) / RFC2396 (URI) inconsistency
> That means the number of people who are actually using broken URIs can
> be less than what XMLSpy claims. So this was an argument for not
> removing the check for namespace URIs. Right?

Right. I believe the things you can check about URIs are:

1. It uses % properly, i.e. every % character is followed by two hex digits.

2. Either
   (a) it's a relative URI (i.e. if there's a colon, then there must be a / or ? or # somewhere before the first colon), or
   (b) (i) it starts with a legal URI scheme name (i.e. [A-Za-z][A-Za-z0-9+\.\-]*:) and
       (ii) the scheme-specific part is non-empty.

3. It contains at most one # character.

We can express each of these in terms of regexps:

1. ([^%]|%[a-fA-F0-9][a-fA-F0-9])*

2. (a) [^:]*([#/?].*)?
   (b) [a-zA-Z][\-+\.a-zA-Z0-9]*:.+

3. [^#]*(#[^#]*)?

Combining these is a little tricky. We can combine 2(a) and 3 into:

   [^:#]*([/?][^#]*)?(#[^#]*)?

(the leading character class has to exclude # as well as :, otherwise it could swallow more than one #). We can then combine that with 1 to get:

   ([^%:#]|%[a-fA-F0-9][a-fA-F0-9])*([/?]([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?(#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?

Similarly, we can combine 2(b) and 3 to get:

   [a-zA-Z][\-+\.a-zA-Z0-9]*:((#[^#]*)|[^#]+(#[^#]*)?)

We can combine that with 1 to get:

   [a-zA-Z][\-+\.a-zA-Z0-9]*:((#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)|([^%#]|%[a-fA-F0-9][a-fA-F0-9])+(#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?)

So putting the two alternatives together we get:

   (([^%:#]|%[a-fA-F0-9][a-fA-F0-9])*([/?]([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?(#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?)|([a-zA-Z][\-+\.a-zA-Z0-9]*:((#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)|([^%#]|%[a-fA-F0-9][a-fA-F0-9])+(#([^%#]|%[a-fA-F0-9][a-fA-F0-9])*)?))

I haven't tested this, so it may well be buggy, but it has very little in common with the XMLSpy regexp. The only thing they seem to agree on is that you can have at most one # character.

One thing this does illustrate is that it is hard for an implementor to figure out from the specs what you are supposed to check. It has taken several years for people to realize that DAV is using a syntactically incorrect URI.
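As a rough way to try out the three checks listed above, here is a minimal Python sketch that applies each of the individual regexps (1, 2(a), 2(b) and 3) separately instead of using the combined expression. The function name looks_like_uri and the sample strings are assumptions made for illustration only; they do not come from any spec or tool mentioned in this thread.

```python
import re

# Check 1: every % character is followed by two hex digits.
PERCENT_OK = re.compile(r"([^%]|%[a-fA-F0-9][a-fA-F0-9])*")
# Check 2(a): relative URI; any colon must come after a /, ? or #.
RELATIVE = re.compile(r"[^:]*([#/?].*)?", re.DOTALL)
# Check 2(b): legal scheme name, a colon, then a non-empty remainder.
ABSOLUTE = re.compile(r"[a-zA-Z][\-+\.a-zA-Z0-9]*:.+", re.DOTALL)
# Check 3: at most one # character.
ONE_HASH = re.compile(r"[^#]*(#[^#]*)?")


def looks_like_uri(s: str) -> bool:
    """Return True if s passes checks 1, 2 (either branch) and 3.

    Hypothetical helper: it only applies the syntactic checks from
    the message, not full URI validation.
    """
    percent_ok = PERCENT_OK.fullmatch(s) is not None
    relative = RELATIVE.fullmatch(s) is not None
    absolute = ABSOLUTE.fullmatch(s) is not None
    one_hash = ONE_HASH.fullmatch(s) is not None
    return percent_ok and (relative or absolute) and one_hash


if __name__ == "__main__":
    # Sample inputs chosen for illustration.
    for candidate in ["http://example.com/a%20b#frag", "./a:b", "DAV:", "a%2", "a#b#c"]:
        print(repr(candidate), looks_like_uri(candidate))
```

Keeping the checks separate makes it easy to see which rule rejects a given string; for example "DAV:" fails both branches of check 2, because it is not relative and nothing follows the colon.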
James