Subject: OASIS/SRU: Eliminate Scan operation (merge searchRetrieve and Scan functions)
This is the seventh of a number of topics initiated by the OASIS Search Web Services Technical Committee, for discussion within the SRU list.

Issue: Eliminate Scan Operation
Issues List: http://wiki.oasis-open.org/search-ws/issues
Issue Id: ray-1
Thread: http://lists.oasis-open.org/archives/search-ws/200711/msg00042.html and http://lists.oasis-open.org/archives/search-ws/200711/msg00073.html

There is a suggestion within the OASIS Search Web Services TC to eliminate the Scan operation, and instead represent the scan function as part of the searchRetrieve operation. Some committee members support the idea of merging the searchRetrieve and Scan functions, others oppose it, and some are undecided. We would like it to be discussed on the SRU listserv.

Consider the typical scan example:

http://myserver.com/sru?operation=scan&version=1.2&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

To simplify discussion, discard the operation and version parameters. (There are proposals to eliminate these; those discussions have not completed, but for purposes of this discussion assume they are discarded.) This reduces to:

http://myserver.com/sru?scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

Next:

1. Model all the terms from all indexes for a database as database records in that database. The record syntax for retrieval is the fragment of XCQL that scan uses.
2. Define a query type scanClause (come up with a better name later). (We have talked about introducing multiple query types. That discussion hasn't completed, but let's accept it for the sake of this discussion.)
3. Change responsePosition to startRecord.
4. Change maximumTerms to maximumRecords.

And you get:

http://myserver.com/sru?scanClause=dc.title=frog&startRecord=1&maximumRecords=25

essentially, a searchRetrieve request.
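To make the parameter mapping in the steps above concrete, here is a minimal sketch in Python. It only rewrites the request URL: it drops operation and version, and renames responsePosition and maximumTerms. The function name scan_to_search_retrieve and the two lookup tables are made up for illustration and are not part of SRU or of any proposal text; the sketch says nothing about how a server would actually model terms as records.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters assumed discarded for the purposes of this discussion.
DROPPED = {"operation", "version"}

# Steps 3 and 4: scan parameter names mapped to searchRetrieve names.
RENAMED = {
    "responsePosition": "startRecord",
    "maximumTerms": "maximumRecords",
}


def scan_to_search_retrieve(url: str) -> str:
    """Rewrite a scan request URL into the proposed searchRetrieve form.

    Hypothetical helper for illustration only; scanClause is kept as the
    name of the query-type parameter, as in step 2.
    """
    parts = urlsplit(url)
    params = [
        (RENAMED.get(name, name), value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in DROPPED
    ]
    # safe="=" keeps the '=' inside the clause (dc.title=frog) readable.
    query = urlencode(params, safe="=")
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    original = (
        "http://myserver.com/sru?operation=scan&version=1.2"
        "&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25"
    )
    print(scan_to_search_retrieve(original))
    # http://myserver.com/sru?scanClause=dc.title=frog&startRecord=1&maximumRecords=25
```

The order of the remaining parameters is preserved, so the output matches the URL given above character for character.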
(Alternatively, instead of different parameter names for different query types, introduce an explicit query type parameter, qtype:

http://myserver.com/sru?qtype=scan&query=dc.title=frog&responsePosition=1&maximumTerms=25)

For the query 'dc.title=frog' the result set would be the term records where 'frog' is in record position 1 (or zero, subject to debate), and of course we need to extend the definition of startRecord to allow non-positive integers. (And add a sortby clause to the query to ensure the terms/"records" are ordered correctly.)

Alternative approach:

Define a scan context set, with an index named 'index' whose term values are the names of the indexes. So a query clause could be:

scan.index=dc.creator

The result set would be all of the dc.creator terms. (Again, include a sortby clause to make sure you get terms in order.) With this approach you could present from the start of the index, if you wanted to (can't do that with Scan now).

If you want to include a seed term, define another index in the scan context set, 'term':

scan.index=dc.creator AND scan.term >= <seed term>

The result set would be all dc.creator terms beginning with <seed term>.

The OASIS Search Web Services TC solicits feedback on this proposal and on the various possible approaches.

--Ray Denenberg