docbook-apps message


Subject: Re: [docbook-apps] Generating indices for Japanese and Chinese


You did not mention whether you are after Traditional or Simplified
Chinese (they use different collation rules for back-of-the-book
indexes).  However:

Asian languages pose difficult indexing problems because of the number
of characters involved (tens of thousands, versus a few dozen letters
for English and other European languages).  The problems are exacerbated
by the fact that not all Asian languages accept the Unicode sort order
as the ordering principle for their indexes.
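
To make the grouping problem concrete, here is a minimal Python sketch.
It is not taken from either paper below and is not how the DocBook
stylesheets work; it only illustrates one common Japanese convention,
grouping entries under the head kana of each gojuon row.  It assumes
every term comes with a kana reading (in DocBook that could plausibly
be carried in the sortas attribute), and the names GOJUON_ROWS,
base_kana, group_heading and build_index are made up for this example.
The sort is a rough code-point approximation, not a real Japanese
collation.

```python
import unicodedata

# Illustrative row table: head kana -> the unvoiced kana in that row.
# A real index would follow the conventions of the target locale.
GOJUON_ROWS = [
    ("あ", "あいうえお"),
    ("か", "かきくけこ"),
    ("さ", "さしすせそ"),
    ("た", "たちつてと"),
    ("な", "なにぬねの"),
    ("は", "はひふへほ"),
    ("ま", "まみむめも"),
    ("や", "やゆよ"),
    ("ら", "らりるれろ"),
    ("わ", "わをん"),
]


def base_kana(ch):
    """Fold katakana to hiragana and drop voicing marks (ガ -> が -> か)."""
    # Katakana U+30A1..U+30F6 sit exactly 0x60 above their hiragana twins.
    if "\u30a1" <= ch <= "\u30f6":
        ch = chr(ord(ch) - 0x60)
    # NFD splits a voiced kana into base letter + combining mark.
    return unicodedata.normalize("NFD", ch)[0]


def group_heading(reading):
    """Return the row head for a reading, or "Symbols" if it is not kana."""
    first = base_kana(reading[0])
    for head, members in GOJUON_ROWS:
        if first in members:
            return head
    return "Symbols"


def build_index(entries):
    """Group (term, reading) pairs by row head, sorted by folded reading."""
    groups = {}
    ordered = sorted(entries, key=lambda e: [base_kana(c) for c in e[1]])
    for term, reading in ordered:
        groups.setdefault(group_heading(reading), []).append(term)
    return groups


entries = [("索引", "さくいん"), ("漢字", "かんじ"), ("データ", "でーた")]

# Sorting the written terms by code point puts the katakana word first,
# while sorting by reading puts it last.
by_code_point = sorted(term for term, _ in entries)
by_reading = build_index(entries)
```

The point of the sketch is that neither the order nor the group heading
can be derived from the written term alone: the heading for 漢字 is か,
a character that does not appear in the term at all.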

The best resources I have seen on the approaches to the problems of
indexing Asian languages are:

  http://www.idealliance.org/proceedings/xml04/papers/77/xslindex.html

and

http://www.mulberrytech.com/Extreme/Proceedings/html/2002/Kimber01/EML2002Kimber01.html

The problem is non-trivial but tractable.  I produced an experimental
version of the second technique using XSLT 2.0 over a Christmas break a
few years ago, but did not have the knowledge of Asian collation to
produce anything but a Japanese index (I had a local resource familiar
enough with Japanese to help with verification).

My company purchased XSLT index collation technology later that year, so
I did not pursue it beyond proof of concept.

I hope that the work Jirka did for the first paper will end up in the
DocBook transforms at some time in the future.  (Great paper, Jirka).

Larry Rowland


On Tue, 2006-05-09 at 16:14 +0300, Tuomas Rinta wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> I'm trying to generate an index automatically for Japanese and Chinese
> documents using docbook-xsl-1.69.1. It seems to sort the index entries
> in the proper order (although I'm just guessing here, as I don't know
> either of the languages), but everything is grouped under the index
> title "Symbols", which, to my understanding, is incorrect. For
> Japanese, this of course is a problem, as the index title might be a
> totally different character than expected (it might not even appear in
> the indexterm).
> 
> I'm using autoidx-ng.xsl to create the index. Is there any way to
> generate the proper index titles for the index with DocBook?
> 
> Best regards,
> Tuomas Rinta
> 
> - --
> Tuomas Rinta
> AAC Global Oy / AAC Solutions
> Tel: +358 40 826 0381
> tuomas.rinta@aacglobal.com
> http://www.aacglobal.com/
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2 (MingW32)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
> 
> iD8DBQFEYJWbXxFSEN1+g5kRAnzMAJ9iQTXPcRuw7ypYKX8lo3iJdTtvHgCfRSs1
> 6wIlBqRGPzpdaTkdhe/IN2Y=
> =ELkw
> -----END PGP SIGNATURE-----

