search-ws message



Subject: OASIS/SRU: Eliminate Scan operation (merge searchRetrieve and Scan functions)


This is the seventh of a number of topics initiated by the OASIS Search Web
Services Technical Committee, for discussion within the SRU list.

 Issue: Eliminate Scan Operation
 Issues List: http://wiki.oasis-open.org/search-ws/issues
 Issue Id: ray-1
 Thread: http://lists.oasis-open.org/archives/search-ws/200711/msg00042.html
 and http://lists.oasis-open.org/archives/search-ws/200711/msg00073.html


 There is a suggestion within the OASIS Search Web Services TC to eliminate
the Scan operation and instead represent the scan function as part of the
searchRetrieve operation.

Some committee members support the idea of merging the searchRetrieve and
Scan functions, others oppose it, and some are undecided. We would like
it to be discussed on the SRU listserv.

 Consider the typical scan example:
http://myserver.com/sru?operation=scan&version=1.2&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

 To simplify discussion, discard the operation and version parameters. (There
are proposals to eliminate these; those discussions have not concluded, but
for purposes of this discussion assume they are discarded.)

 This reduces to:
http://myserver.com/sru?scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

 Next:
 1. Model all the terms from all indexes for a database as database
records in that database. The record syntax for retrieval is the fragment
of XCQL that scan uses.
 2. Define a query type scanClause (come up with a better name later). (We
have talked about introducing multiple query types. That discussion hasn't
completed but let's accept it for the sake of this discussion.)
 3. Change responsePosition to startRecord.
 4. Change maximumTerms to maximumRecords.

 And you get:
http://myserver.com/sru?scanClause=dc.title=frog&startRecord=1&maximumRecords=25
which is essentially a searchRetrieve request.
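
The rewriting steps above can be sketched in Python as follows. This is only an illustration of the parameter mapping, not part of any SRU specification: the function name scan_to_search_retrieve and the two lookup tables are hypothetical, and scanClause is kept as the parameter name for the new query type, as in the text.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter renames from steps 3 and 4 of the proposal.
RENAMES = {"responsePosition": "startRecord", "maximumTerms": "maximumRecords"}
# Parameters assumed to be discarded for the purposes of the discussion.
DROPPED = {"operation", "version"}


def scan_to_search_retrieve(url: str) -> str:
    """Rewrite a Scan request URL into the searchRetrieve-style form."""
    parts = urlsplit(url)
    pairs = []
    # parse_qsl splits each pair at the first "=", so "dc.title=frog" survives intact.
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name in DROPPED:
            continue
        pairs.append((RENAMES.get(name, name), value))
    # safe="=" keeps the "=" inside the scan clause readable instead of "%3D".
    return urlunsplit(parts._replace(query=urlencode(pairs, safe="=")))


if __name__ == "__main__":
    print(scan_to_search_retrieve(
        "http://myserver.com/sru?operation=scan&version=1.2"
        "&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25"
    ))
```
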

 (Alternatively, instead of different parameter names for different query
types, introduce an explicit query type parameter, qtype:
http://myserver.com/sru?qtype=scan&query=dc.title=frog&responsePosition=1&maximumTerms=25)
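
To compare the two conventions, here is a rough Python sketch of how a server might decide which query type it has been sent. Everything in it is an assumption made for illustration: the function name detect_query_type is hypothetical, and "cql" as the default type for a bare query parameter is not something the proposal specifies.

```python
def detect_query_type(params: dict) -> tuple:
    """Return (query_type, query_text) for a dict of request parameters."""
    # Convention B: an explicit qtype parameter accompanies a generic query parameter.
    if "qtype" in params:
        return params["qtype"], params["query"]
    # Convention A: the parameter name itself identifies the query type.
    if "scanClause" in params:
        return "scan", params["scanClause"]
    # Assumed default: a bare query parameter is an ordinary CQL search.
    if "query" in params:
        return "cql", params["query"]
    raise ValueError("request carries no query")


if __name__ == "__main__":
    print(detect_query_type({"scanClause": "dc.title=frog"}))
    print(detect_query_type({"qtype": "scan", "query": "dc.title=frog"}))
```

Under either convention the two example requests resolve to the same query type and query text; the difference lies only in how many parameter names the protocol has to define.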

 For the query 'dc.title=frog', the result set would be the term records
in which 'frog' is at record position 1 (or zero, subject to debate). We
would of course need to extend the definition of startRecord to allow
non-positive integers, so that terms preceding the seed term can be
requested. (And add a sortby clause to the query to ensure the
terms/"records" are ordered correctly.)
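
One possible reading of the extended startRecord can be sketched in Python. This is a sketch under stated assumptions, not a definition: it takes the seed term to be record 1 (not zero), treats the first term at or after the seed as position 1 when the seed itself is absent, and compares terms as plain strings. The function name term_window is hypothetical.

```python
from bisect import bisect_left


def term_window(sorted_terms, seed, start_record=1, maximum_records=25):
    """Return the slice of terms selected by startRecord and maximumRecords.

    Record 1 is the seed term (or the first term after it), record 0 is the
    term immediately before it, record -1 the one before that, and so on.
    """
    seed_index = bisect_left(sorted_terms, seed)
    begin = seed_index + (start_record - 1)
    end = begin + maximum_records
    # Positions that fall before the start of the index are simply dropped.
    return sorted_terms[max(begin, 0):max(end, 0)]


if __name__ == "__main__":
    terms = ["cat", "dog", "frog", "goat", "hen"]
    print(term_window(terms, "frog", start_record=1, maximum_records=2))
    print(term_window(terms, "frog", start_record=0, maximum_records=3))
```

With startRecord=0 and maximumRecords=3 the window holds the term before the seed, the seed itself, and the term after it, which corresponds to a Scan request with responsePosition=2.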

 Alternative approach:

 Define a scan context set with an index named 'index', whose term
values are the names of the indexes.

 So a query clause could be 'scan.index=dc.creator'

 The result set would be all of the dc.creator terms. (Again, include a
sortby clause to make sure you get the terms in order.) With this approach you
could present from the start of the index if you wanted to (you can't do that
with Scan now).

 To include a seed term, define another index in the scan
context set, 'term':

 scan.index=dc.creator AND scan.term >= <seed term>

 The result set would be all dc.creator terms from <seed term> onward.
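
The scan context set approach can be sketched in Python over a toy in-memory index. The names scan_query and scan_terms and the sample data are hypothetical. Two further assumptions are made: the seed term is double-quoted in the query string, and terms are compared as plain strings where a real server would apply its own collation.

```python
def scan_query(index_name, seed=None):
    """Build the query clause for the proposed 'scan' context set."""
    clause = f"scan.index={index_name}"
    if seed is not None:
        # Quoting the seed term is an assumption; it keeps multi-word terms intact.
        clause += f' AND scan.term >= "{seed}"'
    return clause


def scan_terms(indexes, index_name, seed=None):
    """Return the ordered terms of one index, optionally starting at a seed term."""
    terms = sorted(set(indexes.get(index_name, [])))
    if seed is None:
        # No seed: present from the very start of the index.
        return terms
    return [term for term in terms if term >= seed]


if __name__ == "__main__":
    indexes = {"dc.creator": ["smith", "adams", "jones", "baker"]}
    print(scan_query("dc.creator", "b"))
    print(scan_terms(indexes, "dc.creator", "b"))
```

Sorting inside scan_terms stands in for the sortby clause mentioned above; without it the order of the returned terms would depend on how the index happens to be stored.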


  The OASIS Search Web Services TC solicits feedback on this proposal and on
the various possible approaches.


--Ray Denenberg



