Subject: [relax-ng] Compact syntax encoding declaration

Here are some possible approaches to issue 3.

A. In XML, the encoding declaration adds a non-trivial amount of complexity 
to both the implementation and the specification. Is there some way we can 
avoid this complexity in RNC? What if we said something like this:

1. The encoding of an RNC entity may be externally specified (e.g. by a 
MIME header or a command line option).  In this case, an RNC entity can use 
any encoding and it may but need not start with a BOM.

2. If the encoding of an RNC entity is not externally specified, then the 
entity must use either UTF-16 or UTF-8.  If it uses UTF-16, it must start 
with a BOM. If it uses UTF-8, it may but need not start with a BOM.
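To make the two rules concrete, here is a rough Python sketch of how a 
processor might decode an entity under approach A. The function name 
decode_rnc and its external_encoding parameter are my own illustrative 
choices, not part of any proposal or existing tool.

```python
import codecs
from typing import Optional


def decode_rnc(data: bytes, external_encoding: Optional[str] = None) -> str:
    """Decode an RNC entity per rules 1 and 2 of approach A (a sketch)."""
    if external_encoding is not None:
        # Rule 1: an externally specified encoding wins; a BOM is optional.
        text = data.decode(external_encoding)
        return text[1:] if text.startswith("\ufeff") else text
    # Rule 2: nothing external, so only UTF-16 (BOM required) or UTF-8.
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        # The "utf-16" codec reads the BOM to pick the byte order.
        return data.decode("utf-16")
    # "utf-8-sig" strips a leading BOM if one is present.
    return data.decode("utf-8-sig")
```

Note that the processor never has to look inside the entity for a 
declaration: the only bytes it inspects are a possible leading BOM.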

B. If A's support for legacy encodings is deemed inadequate, could we 
piggyback on top of XML's support by defining a simple one-element wrapper 
for RNC?  That is, if you don't want to use UTF-8 or UTF-16, then you must 
wrap your RNC in a single XML element and use an XML encoding declaration, 
for example:

<?xml version="1.0" encoding="iso-8859-1"?>
<rnc ns="http://relaxng.org/ns/compact/1.0"><![CDATA[
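
As a rough illustration of how little code approach B would need on the 
processor side, here is a Python sketch that lets a stock XML parser handle 
the encoding and then pulls out the RNC text. The function name unwrap_rnc 
is hypothetical, and the sketch assumes the wrapper is closed with 
"]]></rnc>", which the example above does not show.

```python
import xml.etree.ElementTree as ET


def unwrap_rnc(data: bytes) -> str:
    """Return the RNC text from a one-element XML wrapper (a sketch)."""
    # The XML parser reads the encoding declaration and decodes for us.
    root = ET.fromstring(data)
    if root.tag != "rnc":
        raise ValueError("expected a single <rnc> wrapper element")
    # The CDATA section arrives as the element's ordinary text content.
    return root.text or ""
```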

C. If B is still not enough, then we need our own encoding declaration. If 
we do this, I would like to use an autodetection algorithm similar to XML's. 
This requires that the entity start with a known sequence of characters. 
For this reason, I would prefer a syntax that has a different "feel" from 
the "namespace" and "datatypes" declarations.  If the encoding declaration 
feels like the "namespace" and "datatypes" declarations, people will assume 
they can put comments and whitespace before it.  I would suggest a syntax 
along the lines of

!rnc encoding="iso-8859-1"

The encoding declaration would be optional, as in XML.
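
To show why the fixed leading sequence matters for autodetection, here is a 
rough Python sketch. Everything in it is an assumption for illustration: the 
name detect_encoding, the exact declaration grammar matched by the regular 
expression, and the fallback to UTF-8 when no declaration is present. It 
only handles ASCII-compatible encodings plus BOM-marked UTF-8 and UTF-16.

```python
import codecs
import re

# The declaration must be the very first bytes of the entity.
DECL = re.compile(rb'^!rnc encoding="([A-Za-z][A-Za-z0-9._-]*)"')


def detect_encoding(data: bytes) -> str:
    """Guess the encoding of an RNC entity from its first bytes (a sketch)."""
    # A BOM identifies the Unicode encodings without any declaration.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # Otherwise read the declaration as plain ASCII bytes.
    match = DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No declaration at offset 0: assume UTF-8 (an assumption of this sketch).
    return "utf-8"
```

Because the match is anchored at the first byte, a declaration preceded by a 
comment or whitespace is simply not seen, which is why the syntax should not 
invite people to put anything before it.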


