Subject: RE: [relax-ng] Compact syntax encoding declaration
I like option C best. But couldn't we just do something like:

    !xml encoding="iso-8859-1"

or

    ?xml encoding="iso-8859-1"

which would produce a text declaration like

    <?xml encoding="iso-8859-1"?>

or perhaps an XML declaration could be derived from

    !xml version="1.0" encoding="iso-8859-1"

giving us

    <?xml version="1.0" encoding="iso-8859-1"?>

Of course it would be optional, but if it appears, it must be at line 1, column 1. Doing an XML declaration would allow us to control the version of the serialized XML form. Would we need to support a standalone declaration if we did this?

Mike

-----Original Message-----
From: James Clark [mailto:jjc@jclark.com]
Sent: Monday, April 08, 2002 5:36 AM
To: relax-ng@lists.oasis-open.org
Subject: [relax-ng] Compact syntax encoding declaration

Here are some possible approaches to issue 3.

A. In XML, the encoding declaration adds a non-trivial amount of complexity for both the implementation and the specification. Is there some way we can avoid this complexity in RNC? What if we said something like this:

1. The encoding of an RNC entity may be externally specified (e.g. by a MIME header or a command line option). In this case, an RNC entity can use any encoding and it may but need not start with a BOM.

2. If the encoding of an RNC entity is not externally specified, then the entity must use either UTF-16 or UTF-8. If it uses UTF-16, it must start with a BOM. If it uses UTF-8, it may but need not start with a BOM.

B. If A's support for legacy encodings is deemed inadequate, could we piggyback on top of XML's support by defining a simple one-element wrapper for RNC? That is, if you don't want to use UTF-8 or UTF-16, then you must wrap your RNC in a single XML element and use an XML encoding declaration, e.g.

    <?xml version="1.0" encoding="iso-8859-1"?>
    <rnc ns="http://relaxng.org/ns/compact/1.0"><![CDATA[
    ...
    ]]></rnc>

C. If B is still not enough, then we need our own encoding declaration. If we do this, I would like to use an autodetection algorithm similar to XML's. This requires that the entity start with a known sequence of characters. For this reason, I would prefer a syntax that has a different "feel" from the "namespace" and "datatypes" declarations. If the encoding declaration feels like the "namespace" and "datatypes" declarations, people will assume they can put comments and whitespace before it. I would suggest a syntax along the lines of

    !rnc encoding="iso-8859-1"

The encoding declaration would be optional, as in XML.

James
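
For illustration, the two rules in option A could be sketched roughly as below. This is only a sketch under assumptions: the function name decode_rnc_entity and its external_encoding parameter are hypothetical and do not come from any RELAX NG specification or implementation. It treats a BOM as an optional leading U+FEFF character and falls back to UTF-8 when no UTF-16 BOM is present.

```python
import codecs
from typing import Optional


def decode_rnc_entity(data: bytes, external_encoding: Optional[str] = None) -> str:
    """Decode an RNC entity following the two rules sketched in option A.

    Hypothetical helper for illustration only.
    """
    if external_encoding is not None:
        # Rule 1: an externally specified encoding (MIME header,
        # command line option) wins; any encoding is allowed.
        text = data.decode(external_encoding)
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Rule 2: UTF-16 must start with a BOM; the "utf-16" codec
        # uses the BOM to pick the byte order and consumes it.
        text = data.decode("utf-16")
    else:
        # Rule 2: otherwise the entity must be UTF-8, with or without
        # a BOM. Invalid UTF-8 raises UnicodeDecodeError.
        text = data.decode("utf-8")
    # A BOM is optional in the cases where it may appear, so drop a
    # leading U+FEFF if one survived decoding.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text
```

With this sketch, a legacy-encoded entity such as b"caf\xe9" is only accepted when the caller supplies the encoding externally, e.g. decode_rnc_entity(b"caf\xe9", "iso-8859-1"); without it, the bytes are rejected as invalid UTF-8, which is the gap options B and C are meant to close.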