docbook-apps message

Subject: Re: [docbook-apps] Different encoding in XML and XSL.

In general, you can mix and match encodings without problems, because XML
processors convert whatever the original encoding was to Unicode internally.
Each file (your source document, the stylesheet, en.xml) is decoded according
to its own declaration, so they do not have to agree. That's why it is
critically important that every XML document declare its encoding (without a
declaration or byte-order mark, it is taken to be UTF-8). Once the document is
loaded as Unicode in memory, the processor can write it out in whichever
encoding you ask for.
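
To make that concrete, here is a minimal sketch using Python's standard
xml.etree.ElementTree module rather than Saxon (that substitution is my
assumption, chosen only because it runs anywhere). The sample <para> document
and the variable names are made up for illustration. It parses the same content
from two different encodings and then writes it out in two others:

```python
import xml.etree.ElementTree as ET

# The same one-element document, serialized in two different encodings.
# "\u00e9" is e-acute: one byte in ISO-8859-1, two bytes in UTF-8.
TEXT = "caf\u00e9"
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><para>' + TEXT + "</para>"
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><para>' + TEXT + "</para>"
).encode("utf-8")

# The parser reads each declaration and decodes the bytes to Unicode,
# so the two trees hold identical text even though the bytes differed.
root_latin1 = ET.fromstring(latin1_doc)
root_utf8 = ET.fromstring(utf8_doc)
same_in_memory = root_latin1.text == root_utf8.text

# Write the in-memory Unicode back out in encodings of our choosing.
ascii_out = ET.tostring(root_latin1, encoding="us-ascii")
utf16_out = ET.tostring(root_utf8, encoding="utf-16")
```

With this particular library, a character the output encoding cannot represent
is written as a numeric character reference, so the US-ASCII output carries
&#233; in place of the accented letter. Other processors may behave differently
on that point.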

The big caveat is that not all processors support conversion of all
encodings. For example, according to its doc, the built-in AElfred parser
in Saxon 6.5 supports these incoming encodings: ISO-8859-1, 8859_1,
ISO8859_1, US-ASCII, ASCII, UTF-8, UTF8, ISO-10646-UCS-2, UTF-16, UTF-16BE,

and it supports these outgoing encodings:

ascii, us-ascii, utf-8, utf8, utf-16, utf16, iso-8859-1, iso-8859-2,
koi8-r, cp852, cp1250, windows-1250, cp1251, windows-1251
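
If you want to probe what a given runtime recognizes before relying on it, a
quick check along these lines can help. This is only a sketch against Python's
codec registry, not Saxon's own encoding tables, so the results say nothing
about what Saxon accepts. The helper name split_by_support and the bogus
encoding name are hypothetical:

```python
import codecs


def split_by_support(names):
    """Partition encoding names into those this Python runtime knows and
    those it does not. (Hypothetical helper, for illustration only.)"""
    supported, unsupported = [], []
    for name in names:
        try:
            codecs.lookup(name)  # raises LookupError for unknown names
        except LookupError:
            unsupported.append(name)
        else:
            supported.append(name)
    return supported, unsupported


# A few names from the list above, plus one deliberately invalid name.
candidates = ["utf-8", "iso-8859-2", "cp852", "windows-1251", "x-made-up-charset"]
supported, unsupported = split_by_support(candidates)
```

The same idea applies to any processor: try the encoding you need up front and
handle the failure, rather than finding out halfway through a build.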

However, if you substitute the Xerces parser in Saxon, you get a much
longer list of encodings:

The following link will give you some general background on encodings with
regard to DocBook:

Bob Stayton
Sagehill Enterprises
DocBook Consulting

----- Original Message ----- 
From: "Rajal Shah" <rajal@meshsoftware.com>
To: "Docbook-Apps" <docbook-apps@lists.oasis-open.org>
Sent: Tuesday, March 09, 2004 3:10 PM
Subject: [docbook-apps] Different encoding in XML and XSL.

> This may be a generic XSL question.. But I've hit upon it when evaluating
> docbook xsls.. So I'm posting it here..
> I'm evaluating if docbook can fit our needs here.. We probably will have
> our custom XSL which would include/import docbook xsls. The input XML to
> xsl can have varying encodings (charset).. So the question is:
> 1. How does the docbook xsl behave if the XML encoding is different from
> XSL..
> 2. I also see the localization xml files (en.xml) in the docbook-xsl
> distribution.. The encoding is set to US-ASCII.. So in effect, I could have
> my XML document coming in as "windows-1252", the en.xml file would have
> encoding set to "US-ASCII" and my xsl will most likely be "UTF-8". How is
> the behavior determined in this case..
> The general question is, if someone could point me to something that
> explains the XML/XSL processor behavior in handling various encodings, that
> would be immensely appreciated.
> Regards.
> --
> Rajal