
docbook-apps message


Subject: Re: [docbook-apps] Converting Symbol Fonts to UTF-8

Bob Stayton wrote:
> Thanks, that clarifies the situation.  This seems to be a two-byte 
> encoding, perhaps specific to Microsoft's Symbol font?  I thought the 
> Symbol font was single byte, so I'm not understanding those numbers.  
> Anybody else recognize this?

There is something of an explanation here:
To further complicate the picture, there are two different ways to 
encode 8-bit fonts: as normal text fonts, called UGL, or as symbol 
fonts. Most fonts containing alphabetic characters (e.g., Times New 
Roman, Arial) are encoded as UGL fonts. Fonts containing symbols (e.g. 
Wingdings) are typically encoded as symbol fonts. Word 97/2000 uses two 
different translation schemes between Unicode values and 8-bit values, 
depending on whether the font used for the text in question is a UGL 
font or a symbol font. If the font is a UGL font, Word 97/2000 converts 
the characters between the standard 8-bit and Unicode values defined by 
the active codepage. No such standard conversion exists for symbols, 
however, so if the font is a symbol font, Word 97/2000 converts the 
characters to a different set of Unicode values in what is called the 
“Private Use Area” (PUA) of Unicode.
There's a PDF link at the bottom of the page which goes into more detail.
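
If that is what is going on, the "two-byte" numbers may simply be the original single-byte Symbol codes moved into the PUA. Here is a minimal Python sketch of undoing that. It assumes the PUA value is 0xF000 plus the 8-bit code (the quoted passage does not state the offset), and the names `PUA_BASE`, `SYMBOL_TO_UNICODE`, `pua_to_byte` and `convert_symbol_text` are made up for illustration. The mapping table is a deliberately tiny subset of the Symbol layout, not a complete one.

```python
# Assumption: a symbol-font character is stored at U+F000 + its 8-bit code,
# so the usable range is U+F020..U+F0FF.
PUA_BASE = 0xF000

# Hypothetical, partial table: 8-bit Symbol font code -> real Unicode character.
SYMBOL_TO_UNICODE = {
    0x61: "\u03b1",  # 'a' position -> Greek small alpha
    0x62: "\u03b2",  # 'b' position -> Greek small beta
    0x67: "\u03b3",  # 'g' position -> Greek small gamma
    0x64: "\u03b4",  # 'd' position -> Greek small delta
    0x70: "\u03c0",  # 'p' position -> Greek small pi
}


def pua_to_byte(ch):
    """Return the 8-bit code behind a PUA character, or None if not in range."""
    cp = ord(ch)
    if 0xF020 <= cp <= 0xF0FF:
        return cp - PUA_BASE
    return None


def convert_symbol_text(text):
    """Replace PUA symbol characters with real Unicode where the table knows them."""
    out = []
    for ch in text:
        byte = pua_to_byte(ch)
        if byte is None:
            out.append(ch)  # ordinary character, leave untouched
        else:
            # Unknown codes are left as-is rather than guessed at.
            out.append(SYMBOL_TO_UNICODE.get(byte, ch))
    return "".join(out)
```

For example, `convert_symbol_text("x\uf061")` would turn the PUA character into a real alpha and leave the `x` alone. A real conversion would need a full table for each symbol font involved (Symbol and Wingdings do not share a layout).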
    Mike Maxwell
    What good is a universe without somebody around to look at it?
    --Robert Dicke, Princeton physicist
