relax-ng message

Subject: RE: [relax-ng] Compact syntax encoding declaration

I like option C best. But couldn't we just do something like:

!xml encoding="iso-8859-1"

or

?xml encoding="iso-8859-1"

which would produce a text declaration like

<?xml encoding="iso-8859-1"?>

or perhaps an XML declaration could be derived from

!xml version="1.0" encoding="iso-8859-1"

giving us

<?xml version="1.0" encoding="iso-8859-1"?>

Of course it would be optional but if it appears, it must be at line 1,
column 1.
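
To make the mapping concrete, here is a rough Python sketch of turning such a
first line into the corresponding text or XML declaration. The exact grammar
is not settled in this thread, so the regular expression and the function
name to_xml_declaration are assumptions for illustration only. Anchoring the
match at the start of the line models the "line 1, column 1" rule.

```python
import re

# Hypothetical grammar for the proposed first-line declaration. The thread
# does not fix the syntax, so this pattern is an assumption.
_DECL = re.compile(
    r'^[!?]xml'
    r'(?:\s+version="(?P<version>[^"]+)")?'
    r'\s+encoding="(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)"'
    r'\s*$'
)


def to_xml_declaration(first_line):
    """Return the XML form of a compact-syntax declaration, or None.

    The pattern is anchored at the start of the string, so a declaration
    preceded by whitespace (not at column 1) is not recognised.
    """
    m = _DECL.match(first_line)
    if m is None:
        return None
    if m.group("version"):
        # A version is present: produce a full XML declaration.
        return '<?xml version="%s" encoding="%s"?>' % (
            m.group("version"), m.group("encoding"))
    # No version: produce a text declaration.
    return '<?xml encoding="%s"?>' % m.group("encoding")
```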

Doing an XML decl would allow us to control the version of the serialized
XML form.

Would we need to support a standalone declaration if we did this?


-----Original Message-----
From: James Clark [mailto:jjc@jclark.com]
Sent: Monday, April 08, 2002 5:36 AM
To: relax-ng@lists.oasis-open.org
Subject: [relax-ng] Compact syntax encoding declaration

Here are some possible approaches to issue 3.

A. In XML, the encoding declaration adds a non-trivial amount of complexity
both for the implementation and specification. Is there some way we can
avoid this complexity in RNC? What if we said something like this:

1. The encoding of an RNC entity may be externally specified (e.g. by a
MIME header or a command line option).  In this case, an RNC entity can use
any encoding and it may but need not start with a BOM.

2. If the encoding of an RNC entity is not externally specified, then the
entity must use either UTF-16 or UTF-8.  If it uses UTF-16, it must start
with a BOM. If it uses UTF-8, it may but need not start with a BOM.
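
As a rough illustration of how little code rules 1 and 2 would need, here is
a Python sketch. The function name decode_rnc and its external_encoding
parameter are hypothetical; they stand in for whatever a MIME header or
command-line option would supply.

```python
import codecs


def decode_rnc(data, external_encoding=None):
    """Decode the bytes of an RNC entity following rules 1 and 2 above."""
    if external_encoding is not None:
        # Rule 1: externally specified encoding; a leading BOM is optional.
        text = data.decode(external_encoding)
        return text[1:] if text.startswith("\ufeff") else text
    # Rule 2: without external information, only UTF-16 (BOM required)
    # or UTF-8 (BOM optional) is allowed.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Python's "utf-16" codec reads the BOM to pick the byte order
        # and drops it from the result.
        return data.decode("utf-16")
    # "utf-8-sig" strips a UTF-8 BOM if one is present.
    return data.decode("utf-8-sig")
```

Input that is neither valid UTF-8 nor BOM-prefixed UTF-16 raises
UnicodeDecodeError here, which corresponds to an entity that breaks rule 2.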

B. If A's support for legacy encodings is deemed inadequate, could we
piggyback on top of XML's support by defining a simple one-element wrapper
for RNC?  That is, if you don't want to use UTF-8 or UTF-16, then you must
wrap your RNC in a single XML element and use an XML encoding declaration:

<?xml version="1.0" encoding="iso-8859-1"?>
<rnc ns="http://relaxng.org/ns/compact/1.0"><![CDATA[
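
With such a wrapper, a processor could hand decoding to any XML parser. The
Python sketch below assumes the wrapper is closed with "]]></rnc>" after the
RNC text (the example above shows only the opening lines) and that "ns" is an
ordinary attribute rather than a namespace declaration. The function name
unwrap_rnc is hypothetical.

```python
import xml.etree.ElementTree as ET


def unwrap_rnc(data):
    """Extract the RNC text from a one-element XML wrapper given as bytes."""
    # Passing bytes lets the XML parser honour the encoding declaration.
    root = ET.fromstring(data)
    if root.tag != "rnc":
        raise ValueError("expected an <rnc> wrapper element, got %r" % root.tag)
    # The content of the CDATA section is exposed as the element's text.
    return root.text or ""
```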

C. If B is still not enough, then we need our own encoding declaration. If
we do this, I would like to use a similar autodetection algorithm to XML.
This requires that the entity start with a known sequence of characters.
For this reason, I would prefer a syntax that has a different "feel" to the
"namespace" and "datatypes" declarations.  If the encoding declaration feels
like the "namespace" and "datatypes" declarations, people will assume they
can put comments and whitespace before it.  I would suggest a syntax along
the lines of

!rnc encoding="iso-8859-1"

The encoding declaration would be optional, as in XML.
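
To show why a fixed starting sequence matters, here is a rough Python sketch
of XML-style autodetection keyed on "!rnc". It is an illustration under
assumptions, not a proposed algorithm: the table of byte prefixes, the UTF-8
default, and the name detect_encoding are all hypothetical.

```python
import codecs
import re

# Assumed form of the declaration, anchored at the very start of the entity.
_DECL = re.compile(r'^!rnc encoding="([A-Za-z][A-Za-z0-9._-]*)"')

# (leading bytes, codec used to read the first line, result if no declaration)
_FAMILIES = [
    (codecs.BOM_UTF8, "utf-8-sig", "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16", "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16", "utf-16"),
    ("!rnc".encode("utf-16-be"), "utf-16-be", "utf-16-be"),
    ("!rnc".encode("utf-16-le"), "utf-16-le", "utf-16-le"),
    (b"!rnc", "ascii", "utf-8"),  # any ASCII-compatible encoding
]


def detect_encoding(data, default="utf-8"):
    """Guess the encoding of an RNC entity from its first bytes."""
    for prefix, family, fallback in _FAMILIES:
        if data.startswith(prefix):
            break
    else:
        # No BOM and no "!rnc" prefix: no declaration can be present.
        return default
    # The prefix tells us enough to decode the first line and read the name.
    head = data[:200].decode(family, errors="replace")
    m = _DECL.match(head.split("\n", 1)[0])
    return m.group(1).lower() if m else fallback
```

If comments or whitespace were allowed before the declaration, the prefix
test in the loop would have nothing fixed to match against.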

