OASIS Mailing List Archives

 



docbook-apps message

Subject: Re: [docbook-apps] Strange space before title text in HTML


Well, this is interesting.  With the output encoding set to iso-8859-1, 
xsltproc does not output a non-breaking space character (#160, or hex #xA0) 
as such.  Instead it outputs the Unicode character #xFFFD, which is 
described in the Unicode standard as "Replacement character, used to replace 
incoming characters whose values are unknown or unrepresentable in Unicode".

This happens with older versions of xsltproc and older versions of the 
stylesheets too.  I can also make it happen with Saxon 6.5.3 if I use 
method="xml" and the Saxon output character representation extension to 
output native characters instead of entities:

<xsl:output encoding="ISO-8859-1"
            method="xml"
            saxon:character-representation="native;decimal"/>

If the output encoding is changed to utf-8 in <xsl:output>, then xsltproc 
and Saxon output #xA0.

But #xA0 is a native character in iso-8859-1, isn't it?  My iso-8859-1 
reference says so. Why do both processors not output it?
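
The byte-level facts can be checked with a minimal Python sketch like the 
one below.  It only exercises Python's own codecs, not xsltproc or Saxon, 
and the helper name describe_nbsp is hypothetical.  It shows that #xA0 is a 
single native byte in iso-8859-1, that utf-8 encodes it as two bytes, and 
that a lone A0 byte read as utf-8 decodes to #xFFFD.  That last point is 
only one possible reason a viewer might show the replacement character; it 
is an assumption, not something confirmed in this thread.

```python
NBSP = "\u00a0"  # non-breaking space, decimal 160, hex A0


def describe_nbsp():
    """Return byte-level facts about U+00A0 (hypothetical helper)."""
    # In iso-8859-1 the character maps to the single byte A0.
    latin1_bytes = NBSP.encode("iso-8859-1")
    # In utf-8 the same character needs two bytes, C2 A0.
    utf8_bytes = NBSP.encode("utf-8")
    # A lone A0 byte is not valid utf-8; with errors="replace" the
    # decoder substitutes U+FFFD, the replacement character.
    misread = latin1_bytes.decode("utf-8", errors="replace")
    return latin1_bytes, utf8_bytes, misread


if __name__ == "__main__":
    latin1_bytes, utf8_bytes, misread = describe_nbsp()
    print("iso-8859-1 bytes:", latin1_bytes.hex())
    print("utf-8 bytes:", utf8_bytes.hex())
    print("A0 read as utf-8: U+%04X" % ord(misread))
```

If the generated HTML file contains a bare A0 byte, comparing it against 
the encoding the browser actually assumes would be a natural next check.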

Bob Stayton
Sagehill Enterprises
DocBook Consulting
bobs@sagehill.net


----- Original Message ----- 
From: "Steinar Bang" <sb@dod.no>
To: <docbook-apps@lists.oasis-open.org>
Sent: Thursday, June 29, 2006 10:30 PM
Subject: [docbook-apps] Strange space before title text in HTML


> Platform: Intel Pentium M, Ubuntu 6.06 Dapper Drake,
>   docbook-xml 4.4-4
>   docbook-xsl 1.70.1
>   xsltproc 1.1.15-1ubuntu1
>
> When I create HTML files from DocBook using the above configuration,
> Opera 9 on Linux displays a question mark in a diamond between
> section numbers and the section title.  IE 6 on XP Pro displays a
> question mark.
>
> When I do `C-x =' on that character in Emacs 21.4, it says:
> Char: (04240, 2208, 0x8a0, file A0)
> which I don't quite know how to interpret.
>
> This looks like a Unicode character, but the coding system of the
> buffer is iso-latin-1-unix.  In ISO-8859-1, A0 would be non-breaking
> space.  But apparently (according to Opera), this isn't it...?
>
> Doing `C-x =' on the space between the section number and the title in
> the TOC, gives:
> Char: SPC (040, 32, 0x20)
>
> Does anyone know how to fix this?
>
> I've attached an HTML file displaying the problem, as well as the
> original XML file it was generated from.  I hope they survive.  They
> are gzipped individually.  I haven't zip'ed or tar'ed them to avoid
> being stripped on the way.
>
> Thanx!
>
>
> - Steinar
>
>
>

