Subject: minutes from ER TC 20010611
Present: David, Lauren, Paul, Norm, John, Tony

Normalizing system IDs: the open question is what to do about normalization of characters; some are not allowed in URIs by RFC 2396. There are reserved characters and unwise characters.

What happens if I put a reserved character (e.g., $) in a URI? It can appear in an attribute value in an XML document. In the system ID in the DOCTYPE, the author may or may not %-escape the $. We may have a $ in the catalog; the user may decide to %-escape it in one place or another. If they don't %-escape it in both places, should it match anyway? If we say no, there is a potential interoperability issue if one parser does the escaping for you. The catalog then needs at least 2 different entries to match both the escaped and unescaped versions. The unwise characters are even worse. What should we do?

Norm's first proposal: Catalog processors must reduce the %-escaped characters to the equivalent octet before comparison. This is better than turning octets into %-escaped characters. These characters are just for comparison, not for sending over the wire. We are only looking for the first match. One consequence would be that there are problems if there is a % in there that does not introduce a %-escaped character. There may also be 2 URIs which differ only in that one has an escaped character (e.g. for a slash) which should not be turned back into the decoded character, because the difference in semantics means they are different URIs. So this proposal doesn't work.

Characters which are not allowed to appear in URIs must be %-escaped. Unreserved characters can be escaped but don't have to be. When presented with a system ID or URI reference, we should %-escape any character which is not unreserved. XML 1.0 2nd ed. does specify some of this; we need to decide whether the catalog processor is part of the XML processor or the application. If this escaping has been done once, what happens if it's done again?
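
As a rough illustration of the escape-then-compare idea under discussion, here is a minimal Python sketch. The function names and the exact set of characters treated as disallowed are assumptions made for this example (ASCII controls, space, the RFC 2396 delimiter and unwise characters other than %, #, [ and ], plus all non-ASCII characters); they are not taken from the draft or from the XML Recommendation text. The % character is deliberately left alone.

```python
# Characters treated as disallowed in this sketch (an assumption for
# illustration): ASCII controls and space (0x00-0x20), DEL, and the
# RFC 2396 delimiter/unwise characters other than '%', '#', '[' and ']'.
# Non-ASCII characters are escaped as well.
DISALLOWED_ASCII = set('<>"{}|\\^`') | {chr(c) for c in range(0x21)} | {"\x7f"}


def escape_system_id(system_id: str) -> str:
    """Percent-escape disallowed characters, leaving '%' itself untouched."""
    out = []
    for ch in system_id:
        if ch in DISALLOWED_ASCII or ord(ch) > 0x7F:
            # Convert the character to UTF-8, then escape each octet as %HH.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def same_system_id(a: str, b: str) -> bool:
    """Compare two identifiers as plain strings after escaping (hypothetical helper)."""
    return escape_system_id(a) == escape_system_id(b)
```

Because % is never escaped and nothing is ever decoded, applying escape_system_id to its own output changes nothing, and "a%2Fb" stays distinct from "a/b". Note that in this sketch a literal $ and %24 still compare as different strings, since $ is a reserved character rather than a disallowed one.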
The % character is not %-escaped in section 4.2.2 in XML 1.0 2nd ed. So escaping a second time would basically be a no-op. The XML processor must escape disallowed characters. The XML processor doesn't know which attribute values are URIs, so it won't %-escape everything anyway. So one proposal is that the catalog processor should apply the same %-escaping as in XML to all URIs. After the %-encoding, we treat things as strings.

A system ID can't have a fragment ID. In our URI matching of IDs, we currently say they are URI references, which may have fragment identifiers. So we really just take them as strings. If there are a lot of URI references to the same document, then you need an entry in the catalog for each fragment ID in use.

Consensus that the catalog processor does the %-escaping, as per XML 1.0 2nd ed.

Do we need a mechanism to match URIs starting with foo to URIs starting with bar? Or a mechanism to make the processor aware of the #? Proposal: a new type of catalog entry which is used during URI lookup. It maps URIs beginning with prefix1 to URIs beginning with prefix2, keeping the suffixes the same. Useful for mirror sites, fragment IDs, mapping absolute URIs to relative URIs. Useful for system IDs; nobody wants it for public IDs. We need a name for this: rewrite (comes from Apache). Precedence should be after direct match and before delegate. No objections to the name or the precedence.

The June 8th spec has the fixes for URI and URI reference. Is a catalog allowed by XML 1.0 2nd ed? We think so.

Norm will get the draft out this week. We will try next week to flush out the last remaining issues. Pay particular attention to the URI/URI reference wordings. Do we ever talk of URIs? Only in baseURI. Other than that, they are URI references.

Lauren
-----------
Lauren Wood, Director of Product Technology, SoftQuad Software
Chair, XML 2001 - Call for presentations now open at www.xmlconference.org