docbook-apps message



Subject: Re: [docbook-apps] Search behavior in webhelp output


Hi,

On Fri, Mar 18, 2011 at 1:21 AM, Peter Desjardins <peter.desjardins.us@gmail.com> wrote:
Hi. I am fielding some questions about the search behavior in the
webhelp output. Is there an explanation of the behavior available
somewhere?

Specifically, I need to understand:

* How substrings are handled. Why does "locale" match "localeString"
but "crea" doesn't match "create"?

The stemmed root of create/created/creating/creat is "creat", so all of these words produce the same stem. The stem of "crea" is "crea" itself, which is not actually a word, so it does not reduce to the same stem as "create". Likewise, "localeString" stems to "localeStr" (stemmers tend to remove suffixes such as -ing and -ed), and "locale" stems to "local". Do you get the same output?

You can check how it behaves by running the JavaScript function stemmer(string) in Google Chrome's console or in Firebug for Firefox.
Results:

stemmer("create")
"creat"
stemmer("crea")
"crea"
stemmer("localeString")
"localeStr"
stemmer("locale")
"local"
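
To make the comparison concrete, here is a minimal sketch of exact-stem matching. It does not reproduce the real stemmer(); the STEMS table just records the console results quoted above, and stemOf() and sameStem() are hypothetical helper names used only for illustration.

```javascript
// Stems copied from the console results quoted in this message.
// This table is a stand-in for the real stemmer(), which is not reproduced here.
const STEMS = {
  create: "creat",
  crea: "crea",
  localeString: "localeStr",
  locale: "local",
};

// Hypothetical helper: return the recorded stem, or the word itself if none is recorded.
function stemOf(word) {
  return Object.prototype.hasOwnProperty.call(STEMS, word) ? STEMS[word] : word;
}

// Exact-stem comparison: two words match only if they reduce to the same stem.
function sameStem(query, indexedWord) {
  return stemOf(query) === stemOf(indexedWord);
}

console.log(sameStem("crea", "create"));         // false: "crea" vs "creat"
console.log(sameStem("creat", "create"));        // true: both are "creat"
console.log(sameStem("locale", "localeString")); // false: "local" vs "localeStr"
```

Note that under exact-stem comparison alone, "locale" would not match "localeString" either, so whatever produces that hit in the real search is outside what this sketch covers.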
 

* Is there a way to search for strings that contain special characters
like periods? Can I search for "foo.bar" by escaping the period? Can I
remove the period from the list of special characters?

As David said, this is something we need to fix. Stemming plays no part here, i.e. stemmer("foo.bar") == "foo.bar". The problem is that "foo" and "bar" get indexed separately.
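
A rough sketch of why a query for "foo.bar" finds nothing when the two halves are indexed separately. The tokenize() and buildIndex() functions are hypothetical, and the split-on-every-non-alphanumeric rule is an assumption made for illustration; the actual webhelp indexer may tokenize differently.

```javascript
// Assumption: the indexer splits text on every run of non-alphanumeric characters,
// so a period acts as a word separator.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical index: just the set of distinct tokens found in the text.
function buildIndex(text) {
  return new Set(tokenize(text));
}

const index = buildIndex("Call foo.bar to start.");

console.log([...index].join(", "));                 // call, foo, bar, to, start
console.log(index.has("foo.bar"));                  // false: the full string was never indexed
console.log(index.has("foo") && index.has("bar"));  // true: each half was indexed on its own
```

With an index like this, escaping the period in the query would not help, because the token "foo.bar" does not exist in the index at all.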

Regards,
--Kasun


Thanks for your help. I have turned off stemming in case that matters.

Peter Desjardins





--
~~~*******'''''''''''''*******~~~
Kasun Gajasinghe,
University of Moratuwa,
Sri Lanka.
Blog: http://kasunbg.blogspot.com
Twitter: http://twitter.com/kasunbg

