Subject: Re: DOCBOOK-APPS: Bad Continuation of Multi-Byte UTF-8 Sequence
At 01:56 PM 6/24/01, Michael Westbay wrote:
>To Walsh's comment:
>
>> >Encoding can be specified this way for external parsed entities;
>> >the version pseudoattribute is optional - moreover, some XML processors are
>> >unable to process an external entity if it contains version information in
>> >its declaration.
>
>Pawson-san wrote:
>
>> Surely this is a weakness in the XML spec then? I'm stuffed if I need
>> an external parsed entity in a different encoding?
>
>While the encoding is part of the specification, it's optional to support
>multiple encodings. Saxon, for example, only supports UTF-8, US-ASCII, and
>ISO-8859-1 (all of which are exact subsets of UTF-8).

Agreed, though i18n suggests the net is moving away from speaking only Western encodings. I was referring to just such a case as yours, i.e. taking file b, in encoding X, into file a, in encoding Y.

>You must not deal with languages that have multiple encodings.

Rather a sweeping statement? You state a reasonable use case below, using individual files. I can see the day when the encoding will need to change within a single file.

>The reason I
>prefer to use Xalan/Xerces over Saxon is this very issue: the Apache XML/XSL
>tools allow the encoding to be specified on a per-document basis. The loss
>in speed is made up for in versatility.
>What this function allows me to do is take a document produced by one
>engineer on a Windows box in Shift_JIS, then process it with an XSL(T) on my
>FreeBSD box that is encoded in EUC-JP. (For HTML, I often have the output
>encoding set in the XSL to be ISO-2022-JP.)

Fair judgement, with the case you state. I'm presuming that fragments in multiple encodings will become the norm rather than the exception. I guess processors will gradually align as code becomes more available.

>Where i18n and l10n is concerned, this is a strength in the XML spec, not a
>weakness.

I was referring to the statement that Norm corrected.

Regards, DaveP
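
For anyone reading the archive later, the Shift_JIS to EUC-JP case Michael describes can be sketched outside any XSLT processor. This is only an illustration in Python, not what Xalan/Xerces or Saxon do internally; the function name `transcode` and the sample string are made up for the example. It also shows why a parser that assumes UTF-8 rejects such input, which is presumably the kind of failure behind the thread's subject line.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the src encoding and re-encode them in dst.

    Hypothetical helper for illustration; codec names are Python's own.
    """
    return data.decode(src).encode(dst)


# Sample text (an assumption for the example): "Japanese document".
sample = "日本語の文書"

# A file as it might arrive from the Windows box, in Shift_JIS.
sjis_bytes = sample.encode("shift_jis")

# Re-encode for the EUC-JP side, and for ISO-2022-JP output.
eucjp_bytes = transcode(sjis_bytes, "shift_jis", "euc_jp")
iso2022_bytes = transcode(sjis_bytes, "shift_jis", "iso2022_jp")

# Reading the Shift_JIS bytes as if they were UTF-8 fails, because the
# byte sequences are not valid UTF-8.
try:
    sjis_bytes.decode("utf-8")
    utf8_misread_failed = False
except UnicodeDecodeError:
    utf8_misread_failed = True
```

Note that for a real XML file the encoding declaration in the XML or text declaration would have to be updated to match the new byte encoding as well; the sketch only converts the bytes.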