Subject: [relax-ng] Compact syntax encoding declaration
Here are some possible approaches to issue 3.

A. In XML, the encoding declaration adds a non-trivial amount of complexity to both the implementation and the specification. Is there some way we can avoid this complexity in RNC? What if we said something like this:

1. The encoding of an RNC entity may be externally specified (e.g. by a MIME header or a command line option). In this case, an RNC entity can use any encoding and it may but need not start with a BOM.

2. If the encoding of an RNC entity is not externally specified, then the entity must use either UTF-16 or UTF-8. If it uses UTF-16, it must start with a BOM. If it uses UTF-8, it may but need not start with a BOM.

B. If A's support for legacy encodings is deemed inadequate, could we piggyback on top of XML's support by defining a simple one-element wrapper for RNC? That is, if you don't want to use UTF-8 or UTF-16, then you must wrap your RNC in a single XML element and use an XML encoding declaration, e.g.

    <?xml version="1.0" encoding="iso-8859-1"?>
    <rnc ns="http://relaxng.org/ns/compact/1.0"><![CDATA[ ... ]]></rnc>

C. If B is still not enough, then we need our own encoding declaration. If we do this, I would like to use an autodetection algorithm similar to XML's. This requires that the entity start with a known sequence of characters. For this reason, I would prefer a syntax that has a different "feel" from the "namespace" and "datatypes" declarations. If the encoding declaration feels like the "namespace" and "datatypes" declarations, people will assume they can put comments and whitespace before it. I would suggest a syntax along the lines of

    !rnc encoding="iso-8859-1"

The encoding declaration would be optional, as in XML.

James
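To make the two rules of approach A concrete, here is a minimal Python sketch of how a processor might decode an RNC entity under them. It is only an illustration of the proposal, not part of any specification; the function name `decode_rnc` and its `external_encoding` parameter are hypothetical.

```python
import codecs
from typing import Optional


def decode_rnc(data: bytes, external_encoding: Optional[str] = None) -> str:
    """Hypothetical helper: decode an RNC entity following approach A."""
    if external_encoding is not None:
        # Rule 1: the encoding was supplied externally (MIME header,
        # command line option, ...). Any encoding is allowed, BOM optional.
        text = data.decode(external_encoding)
        # Some codecs leave a leading BOM in the result as U+FEFF; drop it.
        return text[1:] if text.startswith("\ufeff") else text

    # Rule 2: no external information, so only UTF-16 or UTF-8 is allowed.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # UTF-16 must start with a BOM; the "utf-16" codec consumes it
        # and takes the byte order from it.
        return data.decode("utf-16")

    # Everything else is treated as UTF-8; "utf-8-sig" accepts the input
    # with or without a BOM and strips the BOM when present.
    return data.decode("utf-8-sig")
```

Note that this sketch never needs to look past the first few bytes to choose a decoder, which is the simplification approach A is after. UTF-16 without a BOM is not permitted by rule 2, so the sketch makes no attempt to detect it.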