
docbook-apps message


Subject: Re: [docbook-apps] Help needed testing CJK search support in webhelp

Cramer, David W (David) wrote:

> Hi there,
> Kasun Gajasinghe has been hard at work this summer on the webhelp GSoC project and has implemented all the required features: stemming for English and German, highlighting of search results, tokenization for Asian (Chinese, Japanese, Korean) languages, freeing the output from the frameset, and automatic TOC syncing.
> You can see a demo of the results of his efforts and download it to try things out on your own content from here: http://www.thingbag.net/docbook/gsoc2010/doc/content/ch02s01.html
> The instructions provide links to a version of the package for 4.x and 5.x documents.
> Feedback is welcome. Please let us know what bugs you find. In particular, we need to test the CJK search support. If you have some demo content in Chinese, Japanese, or Korean that you can share with us for testing, we'd appreciate it. I had planned to use the Chinese version of DocBook, The Definitive Guide, but have had some trouble getting my environment set up so it will build.
> We plan to provide instructions for adding stemming support for other non-CJK languages. For a number of languages (http://www.thingbag.net/docbook/gsoc2010/doc/content/ch02s04.html), all that is required is to port the stemmer from Java to JavaScript so that it can be used on the client side.
> Thanks,
> David

Hi David,

First of all, thank you both for your work, it looks very promising!
I have a few questions about how search and stemming work:
- Is it possible to add partial matches to the search results? For example, if
you currently search for install, installing, or installed, the same results are
returned (correctly), because these words all come from install. But if you
type only part of the word (say, 'inst'), no results are returned.
- Am I right that the search engine does prefix-only matches? (nstall, *nstall,
etc. do not work)
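
To make the two questions concrete, here is a minimal sketch in Python of the three behaviours I have in mind: exact lookup of the stemmed query, prefix matching, and substring matching. It is not how webhelp is implemented. The page texts, the toy_stem function (a crude stand-in for a real stemmer), and all other names are assumptions made up for illustration.

```python
# Hypothetical page contents; not taken from the webhelp demo.
PAGES = {
    "ch02s01.html": "installing the package",
    "ch02s04.html": "searching installed content",
}


def toy_stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(pages):
    """Map each stemmed term to the set of pages that contain it."""
    index = {}
    for page, text in pages.items():
        for word in text.split():
            index.setdefault(toy_stem(word), set()).add(page)
    return index


def exact_lookup(index, query):
    """Return pages whose index holds exactly the stemmed query."""
    return sorted(index.get(toy_stem(query), set()))


def prefix_lookup(index, query):
    """Return pages holding any indexed term that starts with the query."""
    q = query.lower()
    hits = set()
    for term, pages in index.items():
        if term.startswith(q):
            hits |= pages
    return sorted(hits)


def substring_lookup(index, query):
    """Return pages holding any indexed term that contains the query."""
    q = query.lower()
    hits = set()
    for term, pages in index.items():
        if q in term:
            hits |= pages
    return sorted(hits)


index = build_index(PAGES)
print(exact_lookup(index, "installed"))    # stemmed query matches an indexed term
print(exact_lookup(index, "inst"))         # partial word: no exact match
print(prefix_lookup(index, "inst"))        # partial word found by prefix matching
print(substring_lookup(index, "nstall"))   # found only by substring matching
```

In this sketch the prefix and substring lookups use the raw query, since a partial word cannot be stemmed reliably. A combined search could try the exact stemmed lookup first and fall back to prefix matching only when it returns nothing.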



