OASIS Mailing List Archives

 




Subject: Re: [docbook] Question about prettyprinting Docbook documents and character entities


Taro Ikai <tikai@ABINITIO.COM> writes:

> I am having a few problems prettyprinting my Docbook documents. I am using 
> Cygwin distribution of Tidy.
> 
> 1) Tidy seems to translate the character entities:
> 
>  &ensp; into a two-byte sequence of 0x20, 0x02, and 
>  &emsp; into a two-byte sequence of 0x20, 0x03
> 
> Is this expected? I want to keep the &entityname; notations in the output. 
> How can I do this?

I don't think you can. For XML output, I think Tidy is hard-coded to
translate the entity names into numeric ones.
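If Tidy itself cannot be told to keep the names, one possible workaround is to put them back in a post-processing pass. The sketch below is only an illustration under stated assumptions: `restore_entities` and the `NAMED` table are hypothetical names, not part of Tidy or DocBook, and the table covers only the two entities mentioned in the question. It also shows that the bytes 0x20, 0x02 reported above are what U+2002 (the character behind `&ensp;`) looks like in big-endian UTF-16.

```python
import html
import re

# Hypothetical table: code point -> entity name. Only the two entities from
# the question are listed; extend it for whatever your documents use.
NAMED = {0x2002: "ensp", 0x2003: "emsp"}

# Matches decimal (&#8194;) and hexadecimal (&#x2002;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def restore_entities(text):
    """Replace known numeric references and raw characters with entity names."""

    def from_numeric(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        name = NAMED.get(code_point)
        # Leave references we have no name for exactly as they were.
        return "&" + name + ";" if name else match.group(0)

    text = _NUMERIC_REF.sub(from_numeric, text)
    for code_point, name in NAMED.items():
        text = text.replace(chr(code_point), "&" + name + ";")
    return text


# &ensp; is U+2002; in big-endian UTF-16 that is the byte pair 0x20, 0x02.
en_space = html.unescape("&ensp;")
assert en_space == "\u2002"
assert en_space.encode("utf-16-be") == b"\x20\x02"
```

This is a plain text substitution, so it would also rewrite matching characters inside comments or CDATA sections, and the result only parses if the document's DTD declares the entity names.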

> 2) Tidy fails to produce any output with Japanese UTF-8 encoded documents.

I've seen the same thing with Cygwin Tidy -- no output for any UTF-8
encoded documents. I think its UTF-8 handling is just broken. But it
does seem to handle UTF-16 and Shift-JIS correctly.
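Since UTF-16 input reportedly works, one option is to transcode a UTF-8 document to UTF-16 before handing it to Tidy. The following is a minimal sketch, not a tested fix for the Cygwin build: `utf8_to_utf16` is a hypothetical helper, and it assumes the encoding named in the XML declaration (if there is one) should be rewritten to match the new byte encoding.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document, with either quote style.
_DECL_ENCODING = re.compile(r"""^(<\?xml[^>]*?encoding=)(["'])[^"']*\2""")


def utf8_to_utf16(data):
    """Transcode UTF-8 bytes to UTF-16 bytes (with BOM), fixing the XML declaration."""
    # "utf-8-sig" also accepts input that starts with a UTF-8 byte order mark.
    text = data.decode("utf-8-sig")
    text = _DECL_ENCODING.sub(r"\1\2UTF-16\2", text, count=1)
    # Python's "utf-16" codec writes a byte order mark followed by the text.
    return text.encode("utf-16")


source = '<?xml version="1.0" encoding="UTF-8"?>\n<para>日本語</para>'
converted = utf8_to_utf16(source.encode("utf-8"))
print(converted.decode("utf-16"))
```

The same function can be run in reverse order of codecs afterwards if the final output needs to be UTF-8 again.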
