OASIS Mailing List Archives

docbook-apps message



Subject: WebHelp, English stemmer, problems with specific words


Hi,

I found the conversation about problems with the English stemmer at http://lists.oasis-open.org/archives/docbook-apps/201103/msg00040.html very helpful in tracking down a similar problem I'm having. In my case, the word that isn't stemmed correctly is "relay", which comes out as "relai". This breaks searches: searching for "relay" in a document that should have six matches returns the error "Your search returned no results for relai".
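
The "relai" result is consistent with step 1c of the Porter algorithm, which rewrites a trailing "y" to "i" when the rest of the word contains a vowel. Here is a minimal sketch of that one rule in isolation. It is not the code from en_stemmer.js: the function name step1c is made up for illustration, and the vowel test is simplified to a, e, i, o, u.

```javascript
// Illustrative only: a stand-alone version of Porter step 1c.
// Simplified vowel test (a, e, i, o, u); the full algorithm also treats
// "y" as a vowel when it follows a consonant.
function step1c(word) {
    var match = /^(.+?)y$/.exec(word);
    // Rewrite the trailing "y" only if the part before it contains a vowel.
    if (match && /[aeiou]/.test(match[1])) {
        return match[1] + "i";
    }
    return word;
}

console.log(step1c("relay")); // "relai"
console.log(step1c("sky"));   // "sky" (no vowel before the "y")
```

None of the later steps in the algorithm change "relai", so that is what gets looked up.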

The solution that I've implemented locally, and offer below for your consideration, is a list of words to be stemmed manually. I've tried to follow your coding style, but I'm not a serious JavaScript hacker, so I may have stepped on some toes inadvertently.
 
Regards,
Paul Bort
Systems Engineer
TMW Systems, Inc.
pbort@tmwsystems.com
 
----------------------------------

--- en_stemmer.js
+++ en_stemmer.js
@@ -54,6 +54,14 @@
         meq1 = "^(" + C + ")?" + V + C + "(" + V + ")?$",  // [C]VC[V] is m=1
         mgr1 = "^(" + C + ")?" + V + C + V + C,       // [C]VCVC... is m>1
         s_v = "^(" + C + ")?" + v;                   // vowel in stem
+   
+    var exceptionWords = {
+            "relay":"relay",
+            "relaying":"relay",
+            "relays":"relay",
+            "nucleus":"nucleus",
+            "zeus":"zeus"
+        };
 
     return function (w) {
         var     stem,
@@ -67,6 +75,8 @@
 
         if (w.length < 3) { return w; }
 
+        if (exceptionWords.hasOwnProperty(w)) { return exceptionWords[w]; }
+       
         firstch = w.substr(0,1);
         if (firstch == "y") {
             w = firstch.toUpperCase() + w.substr(1);
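
The same idea can be tried outside the stemmer. The sketch below wraps an arbitrary stemming function with an exception table. The names withExceptions and toyStemmer are hypothetical, and toyStemmer is only a stand-in that strips a trailing "s"; it is not the real Porter implementation. The lookup uses hasOwnProperty rather than the in operator so that inherited keys such as "constructor" are not mistaken for exception words.

```javascript
// Hypothetical helper: wraps any stemming function with an exception table.
function withExceptions(stemFn, exceptions) {
    return function (w) {
        // Own-property check, so inherited keys like "constructor" fall through.
        if (Object.prototype.hasOwnProperty.call(exceptions, w)) {
            return exceptions[w];
        }
        return stemFn(w);
    };
}

// Stand-in for the real stemmer (assumption): only strips a trailing "s".
function toyStemmer(w) {
    return w.replace(/s$/, "");
}

var stem = withExceptions(toyStemmer, {
    "relay": "relay",
    "relaying": "relay",
    "relays": "relay",
    "nucleus": "nucleus",
    "zeus": "zeus"
});

console.log(stem("relays"));      // "relay"
console.log(stem("nucleus"));     // "nucleus"
console.log(stem("servers"));     // "server"
console.log(stem("constructor")); // "constructor"
```

Because the table is consulted before any stemming rule runs, every inflected form that should share a stem ("relaying", "relays") has to be listed explicitly.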


