
Subject: Re: [docbook-apps] Chinese Index

Hi Maxime,

Regarding xml:lang: in its current form, the Java code requires that it match the value of the <national_language> element in the botb_index_rules.xml file under i18n_support, as I documented in http://www.sagehill.net/docbookxsl/IndexIntl.html#KimberIndexMethod .  I never figured out how to make that more flexible.  In the DocBook gentext templates, the @xml:lang is always converted with translate() to lowercase with underscore before comparing to the @language attribute in the gentext files.  That would be preferable for this toolkit too.
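
For readers who have not seen the translate() idiom, here is a minimal XSLT 1.0 sketch of that kind of normalization. The template name normalize.lang is hypothetical and this is not the actual DocBook gentext code; it only illustrates folding a language code to lowercase and turning hyphens into underscores before a comparison.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical helper (not part of DocBook XSL or i18n_support):
       fold a language code to lowercase and replace hyphens with
       underscores, so that "ZH-TW" becomes "zh_tw". -->
  <xsl:template name="normalize.lang">
    <xsl:param name="lang"/>
    <xsl:value-of select="translate($lang,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ-',
        'abcdefghijklmnopqrstuvwxyz_')"/>
  </xsl:template>

  <!-- Demonstration only: print the normalized xml:lang of the root element. -->
  <xsl:template match="/">
    <xsl:call-template name="normalize.lang">
      <xsl:with-param name="lang" select="/*/@xml:lang"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

The toolkit's Java code would need an equivalent step on its side, or the rules file would have to be matched case-insensitively, for a document's xml:lang to be accepted in any case.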

I also noticed that the ZH_CN language is commented out in botb_index_rules.xml, and I don't know why.  I am in correspondence with Eliot Kimber who authored this toolkit.  I'll post an update when I get it, so we don't have to keep using this older version.

Eliot is no longer with Innodata Isogen, so they no longer host that content.  That's why I wanted it in a more permanent place like the DocBook Wiki.

Bob Stayton
Sagehill Enterprises
On 6/4/2018 2:24 AM, Maxime Bégnis wrote:

Hello Bob,

I made it work with your latest post of i18n_support.zip.

I noticed that the xml:lang of the document must be declared uppercase. "ZH_TW" will work, "zh_tw" will not.

I had the same classpath order issue, as I also use xercesImpl.jar. The cause is that i18n_support.jar declares a classpath in its manifest that includes "lib/xercesImpl.jar", which creates a conflict. Removing the xercesImpl reference from the manifest made it work here.

I also noticed that http://www.innodata-isogen.com/ is inaccessible.

Many thanks,

Maxime Bégnis

On 2 June 2018 at 22:32, Bob Stayton wrote:

OK, I managed to test the version of i18n_support.zip that I posted yesterday to the wiki, and it works. 

I did discover one quirk, though.  I usually use the Xerces parser with Saxon, but if I put xercesImpl.jar ahead of the i18n_support.jar, it does not work.  Putting it after does work.  Not sure why.

Bob Stayton
Sagehill Enterprises
On 6/1/2018 10:50 AM, Bob Stayton wrote:


Indeed, the .jar file in the zip file I originally posted appears to be incomplete.  I just posted to that link an older version of i18n_support whose jar file has that class.  Download and give that a try.  Again, I haven't had time to set this version up and test it, but I'm pretty sure it worked when I used it before.

Bob Stayton
Sagehill Enterprises
On 6/1/2018 5:28 AM, Maxime Bégnis wrote:

Hi Bob,

I could not make it work:

in autoidx-kimber.xsl there is a reference to the Java class com.isogen.saxoni18n.Saxoni18nService:


but that class is missing from the archive given in the page from the DocBook wiki. i18n_support.jar does not contain it. I found some references to it in the sources but it's not there.

So the stylesheet terminates with the xsl:message on line 69 (docbook-xsl-ns-1.75.2).

On 31 May 2018 at 23:03, Bob Stayton wrote:

I'm pleased to announce the availability on the DocBook Wiki of Eliot Kimber's open-source Java toolkit for internationalized back-of-book indexes.  It can handle sorting and collation of all languages, including Asian languages like Chinese (both alphabets), Japanese, and Korean.  You can download it from this page:


He includes complete documentation in the zip file, and my XSL book has a quick start guide for using it with DocBook (the links in my book are out of date, but the Wiki page is up to date):


I have used this toolkit in the past and it works.  You may need to do some configuring to get it to work, but the results are worth it if you are doing Asian language indexes.
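
As a rough illustration of the stylesheet side of that configuring, a customization layer for the Kimber method might look like the sketch below. It mirrors the kosek layer quoted further down in this thread and assumes that autoidx-kimber.xsl sits in the same fo directory of the namespaced distribution; treat it as a starting point, not a tested setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Base FO stylesheet, the same one used in the kosek layer
       quoted in this thread. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/fo/profile-docbook.xsl"/>

  <!-- Kimber index templates. Assumption: the file lives next to
       autoidx-kosek.xsl in the xsl-ns distribution. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/fo/autoidx-kimber.xsl"/>

  <!-- Select the Kimber indexing method instead of kosek. -->
  <xsl:param name="index.method">kimber</xsl:param>

</xsl:stylesheet>
```

The stylesheet alone is not enough: the layer has to be run with Saxon, with i18n_support.jar on the Java classpath, and the document's xml:lang has to match a language defined in botb_index_rules.xml.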

Bob Stayton
Sagehill Enterprises
On 5/30/2018 2:13 AM, Maxime Bégnis wrote:


I'm using in my customization layer (docbook-xsl-ns-1.75.2):

<xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/fo/profile-docbook.xsl" />
<xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/fo/autoidx-kosek.xsl" />

<xsl:param name="index.method">kosek</xsl:param>

For a Russian language document, the index sorting by letter works very well. For Chinese there is no sorting.

If I modify my custom l10n for Chinese to put in the letters list a few  random Chinese symbols, it works fine for those:

<l:l10n language="zh" english-language-name="Chinese">
      <l:l i="-1"/>
      <l:l i="0">符号</l:l>
      <l:l i="1">不</l:l>
      <l:l i="1">不</l:l>
      <l:l i="2">危</l:l>
      <l:l i="2">危</l:l>
      <l:l i="3">C</l:l>
      <l:l i="3">c</l:l>
      <l:l i="4">D</l:l>
      <l:l i="4">d</l:l>


Does someone have a pointer to a list of the Chinese symbols that should be used in an index?

Thank you very much for any help,

Maxime Bégnis
Tel: +33 (0)
789 Rue de La Gare
13770 Venelles
