Subject: Re: [relax-ng] Encoding declaration, MIME type
> I certainly agree that we should use the media type when one is provided,
> but what about something like a "file:" or "ftp:" URL, where there is no
> media type?

Some people believe that existing OSs should be revised so that they can
provide the charset parameter. Gavin Nicol said that he wrote an I-D for
this purpose a long time ago. I have recently learned that an old OS from
IBM does associate every file with character encoding information.

> Here's a strawman proposal:
>
> 1. If you get the RNC as a MIME entity including information about the
> charset, then use that charset. Note that text/plain without a charset
> parameter is equivalent to "text/plain; charset=us-ascii".

I am happy with this. By the way, if we stick to the HTTP RFC, the default
is ISO-8859-1. I certainly think that this default is ridiculous.

> 2. Otherwise, the RNC is in UTF-8 or UTF-16. If it has a UTF-16 BOM, it's
> UTF-16. Otherwise it's UTF-8.

I can live with this. By the way, which UTF-8? With or without the Unicode
signature? Or both? (Probably both.)

> 3. A system may provide a way to allow a user to specify an alternative
> encoding for local files.

Again, I can live with this.

> 4. After converting the sequence of bytes to a sequence of characters, any
> initial BOM is discarded.

Including the Unicode signature for UTF-8? Probably, yes.

Non-ASCII users will probably say that we should provide some in-band
encoding declarations. But I'm reluctant to do so.

If we need a specialized media type for our compact syntax, I think that
application/vnd.oasis-open.rng with the charset parameter is probably
acceptable.

Cheers,

Makoto
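
For illustration only, the four steps of the strawman could be sketched
roughly as below. This is not taken from any RELAX NG implementation. The
function name decode_rnc and its parameters content_type and user_encoding
are hypothetical, and the sketch assumes two things the thread leaves open:
that a user-specified encoding (step 3) is consulted before BOM sniffing
(step 2), and that the UTF-8 signature is accepted and discarded like any
other initial BOM (step 4).

```python
import codecs
from email.message import Message
from typing import Optional


def decode_rnc(data: bytes,
               content_type: Optional[str] = None,
               user_encoding: Optional[str] = None) -> str:
    """Decode the bytes of a compact-syntax schema (hypothetical helper)."""
    encoding = None

    # Step 1: a MIME entity that carries charset information wins.
    if content_type is not None:
        header = Message()
        header["Content-Type"] = content_type
        encoding = header.get_content_charset()  # None when no charset given
        if encoding is None and header.get_content_type() == "text/plain":
            # text/plain without a charset parameter means us-ascii.
            encoding = "us-ascii"

    # Step 3 (assumed to take priority over step 2): a user-supplied
    # override for local files that have no media type.
    if encoding is None and user_encoding is not None:
        encoding = user_encoding

    # Step 2: otherwise a UTF-16 BOM means UTF-16, anything else is UTF-8.
    if encoding is None:
        if data.startswith(codecs.BOM_UTF16_LE):
            encoding = "utf-16-le"
        elif data.startswith(codecs.BOM_UTF16_BE):
            encoding = "utf-16-be"
        else:
            encoding = "utf-8"

    text = data.decode(encoding)

    # Step 4: drop one leading U+FEFF if the codec left it in place.
    # This also covers the UTF-8 signature (assumption: "probably, yes").
    if text.startswith("\ufeff"):
        text = text[1:]
    return text
```

With this sketch, decode_rnc(data) on a local file with no media type falls
through to the BOM check, while decode_rnc(data, "text/plain") treats the
bytes as us-ascii and raises UnicodeDecodeError on any non-ASCII byte.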