Subject: Webhelp stemming and search indexer
I am using the DocBook XSL stylesheets (version 1.78.1) to produce Webhelp, and my documents are being translated into French, Japanese, Korean, and Simplified Chinese.
I have a couple of questions about configuring the Webhelp search that are not obvious to me, even after looking through the Webhelp docs.
(2) The Java indexer command used with the Webhelp build has the properties webhelp.indexer.language and enable.stemming.
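To make the questions below concrete, here is roughly how I would expect to set these two properties for a French build. This is only a sketch: the value "fr" is my assumption about the expected language code (which is exactly what Question 3 asks about), and I am assuming both properties are set in build.properties.

```properties
# Assumed language code for French; the accepted values are what I am trying to confirm.
webhelp.indexer.language=fr
# Whether the indexer should apply a stemmer; see Questions 2 and 4.
enable.stemming=true
```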
In trying to establish a list of languages that have Java stemmer support, the Webhelp docs have this:
- In the section "Adding support for other (non-CJKV) languages" there is a list of non-CJKV languages that have stemmer support, but no language codes.
- In the section "Search indexing" it says to look in the build.properties file for the language code, but the build.properties file says to look in the docs.
- In the section "New Stemmers" (in the developer docs) there appears to be a different list of languages with stemmers, this time with language codes (including "cn" for Chinese?).
Question 2: If the enable.stemming property is set to true, is the value of webhelp.indexer.language used to determine whether a Java stemmer is used?
Question 3: Is there a definitive list of language codes that the Java indexer accepts for webhelp.indexer.language?
Question 4: If a language has no Java stemmer, is it best to set the enable.stemming property to "false", or does it not really matter?