
docbook-apps message


Subject: Re: [docbook-apps] character maps

Hi Hinrich,

> @2007-07-30 13:56 +0200:
> We have to replace " (ascii quotes) with unicode quotes.
> The documents are big and already written. Also using ascii for this is a
> lot easier for authoring. You don't want to copy and paste Unicode
> characters in the xml source all the time, while you have the ascii quotes on
> your keyboard.

An XSLT character map is not a good solution to that problem. What
you describe isn't a one-to-one character substitution; the
straight quote will need to be replaced by a left double quotation
mark in some contexts, a right one in other contexts, and be kept
as a straight quotation mark in still others (in markup, for example).

And it's not just that XSLT character maps aren't a good solution
for that problem; it's that XSLT isn't. You'd probably be a lot
better off running your source through some kind of pre-processor
first (a script written in Perl or sed or whatever) to do the
quote substitution, before running any XSLT transformation on it.
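
As a rough illustration of what such a pre-processor might look like, here is a minimal sketch in Python (rather than Perl or sed). The names `curl_quotes` and `preprocess` are hypothetical, and the rule used (a quote after whitespace or an opening bracket is a left quote, any other quote is a right quote) is only an assumed heuristic. Quotes inside tags, such as attribute delimiters, are left alone.

```python
import re

# Anything between "<" and ">" is treated as markup and left untouched.
# Assumption: no CDATA sections and no ">" inside comments or attribute values.
MARKUP = re.compile(r"(<[^>]*>)")

def curl_quotes(text: str) -> str:
    """Replace straight double quotes in character data with curly quotes."""
    out = []
    prev = " "  # treat the start of a text run as if it followed whitespace
    for ch in text:
        if ch == '"':
            # Heuristic: opening quote after whitespace or an opening bracket,
            # closing quote everywhere else.
            ch = "\u201c" if prev.isspace() or prev in "([{" else "\u201d"
        out.append(ch)
        prev = ch
    return "".join(out)

def preprocess(xml_source: str) -> str:
    """Curl quotes in text only; tags (and their attributes) pass through."""
    parts = MARKUP.split(xml_source)
    # Because MARKUP has a capturing group, odd-indexed parts are the tags.
    return "".join(p if i % 2 else curl_quotes(p) for i, p in enumerate(parts))
```

This does not know about elements such as programlisting, where straight quotes should normally be kept, and it guesses wrong for a quote that directly follows a closing tag. A real script would need to handle those cases.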

> Also we want to exchange the ascii - (dash) with a longer Unicode dash.

Unless you really want to replace such dashes in all contexts,
that's another case where you are probably not going to be doing a
one-to-one character replacement and thus probably need to do some
context-aware string replacement of the kind much more easily done
in Perl or something than in XSLT.
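
For example, a context-aware dash replacement could be sketched in Python along these lines. The function name `lengthen_dashes` is hypothetical, and the rule (only a hyphen with whitespace on both sides is replaced, by default with an en dash, U+2013) is an assumption about what you want. Hyphens in compound words, in option names, and inside tags are kept.

```python
import re

# Anything between "<" and ">" is treated as markup and left untouched.
# Assumption: no CDATA sections and no ">" inside comments or attribute values.
MARKUP = re.compile(r"(<[^>]*>)")

# A single hyphen with whitespace on both sides, as in "this - that".
SPACED_HYPHEN = re.compile(r"(?<=\s)-(?=\s)")

def lengthen_dashes(xml_source: str, dash: str = "\u2013") -> str:
    """Replace free-standing hyphens in text with a longer Unicode dash."""
    parts = MARKUP.split(xml_source)
    # Because MARKUP has a capturing group, odd-indexed parts are the tags.
    return "".join(
        p if i % 2 else SPACED_HYPHEN.sub(dash, p)
        for i, p in enumerate(parts)
    )
```

As written, this would also change a free-standing hyphen inside verbatim elements such as programlisting, so a real script would have to skip those.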


> -----Original Message-----
> From: Michael(tm) Smith [mailto:smith@sideshowbarker.net] 
> Sent: Monday, July 30, 2007 1:22 PM
> To: Hinrich Aue
> Cc: docbook-apps@lists.oasis-open.org
> Subject: Re: [docbook-apps] character maps
> Hinrich Aue <hinrich.aue@lci-software.com>, 2007-07-30 12:52 +0200:
> > Can somebody explain to me what a character map is?
> In relation to the DocBook XSL stylesheets, a character map is a
> feature of the manpages stylesheet that's used for converting
> Unicode characters to their nroff/groff equivalents.
> For more information, see the following:
>   http://docbook.sourceforge.net/release/xsl/current/doc/manpages/man.charmap.enabled.html
> The character map used in the manpages stylesheet follows the
> format specified in the XSLT 2.0 recommendation:
>   http://www.w3.org/TR/xslt20/#character-maps
> > I'm trying to replace some ascii characters with Unicode characters,
> Why? Would it not be easier to use the Unicode characters in your
> source?
> If that's really what you need to do, then a character map is
> probably not what you really need. The character-map feature is
> intended for doing the opposite of what you describe; that is, for
> converting Unicode characters to some corresponding strings of
> ASCII characters (generally, for use in output formats that don't
> support Unicode).
> > and I think I should use a character map.
> > 
> > Can somebody give me an example?
> I might be able to help, if you'd describe exactly what it
> is you need to do -- what output format you need to do this for,
> what characters you want to replace and what you want to replace
> them with.
> > Do I use man.string.subst.map for that?
> You could, if you need to do it for manpages output.
> > Btw, I use xsltproc and fop 0.9
> The fact that you mention fop makes me suspect what you are trying
> to do relates to FO output from the stylesheets, not manpages
> output. If that's the case, I remain confused about why you would
> want to replace ASCII characters with Unicode characters during
> XSLT processing, rather than just using the Unicode characters in
> your source to begin with.
>   --Mike
> -- 
> Michael(tm) Smith
> http://people.w3.org/mike/
> http://sideshowbarker.net/

Michael(tm) Smith

