search-ws message

Subject: OASIS/SRU: Represent Scan within SearchRetrieve operation - - draft message

From: "Ray Denenberg" <raydenenberg@starpower.net>
To: <search-ws@lists.oasis-open.org>
Date: Sun, 18 Nov 2007 13:11:57 -0500

First, two points:

1. On the issue of breaking up the request into three (or whatever) "meta"
parameters:  I think this is worth exploring. But I suggest we put it off,
and at least get the scan discussion underway, using the "conventional"
approach.  Then, after that discussion gets off the ground, introduce the
meta parameter proposal if we want to, after we've kicked it around here.

2. On the issue of a query type parameter versus different parameter names
for different query types: the query-type approach has the advantage of
determinism, which I think makes it the winner, even though I like the
"implicit" approach because it's more elegant and easier to type. I have
represented both approaches in the draft message below:

DRAFT MESSAGE

---------------------------------------------------------

There is a suggestion to eliminate the Scan operation, and instead represent
the scan function as part of the searchRetrieve operation. 

Consider the typical scan example:

 
http://myserver.com/sru?operation=scan&version=1.2&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

To simplify discussion, discard the operation and version parameters. (There
are proposals to eliminate these; those discussions have not completed, but
for purposes of this discussion assume they are discarded.)

This reduces to:

http://myserver.com/sru?scanClause=dc.title=frog&responsePosition=1&maximumTerms=25

Next:
- Model all the terms from all indexes of a database as records in that
database. The record syntax for retrieval would be the fragment of XCQL
that scan uses.
- Define a query type scanClause (come up with a better name later). (We
have talked about introducing multiple query types. That discussion hasn't
completed but let's accept it for the sake of this discussion.)
- Change responsePosition to startRecord.
- Change maximumTerms to maximumRecords.


And you get:

http://myserver.com/sru?scanClause=dc.title=frog&startRecord=1&maximumRecords=25

Then you essentially have a searchRetrieve request. 
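
The steps above amount to a mechanical rewrite of the request URL. Here is a
rough sketch in Python, assuming the operation and version parameters are
simply dropped and the two scan parameters are renamed as proposed. The
function name and the rename table are made up for illustration; nothing
here is defined by SRU.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Renames proposed in the draft. This table and the function name are
# illustrative assumptions, not part of any SRU specification.
RENAMES = {"responsePosition": "startRecord", "maximumTerms": "maximumRecords"}
DROPPED = {"operation", "version"}


def scan_to_search_retrieve(url: str) -> str:
    """Rewrite a scan URL into the searchRetrieve-style form."""
    parts = urlsplit(url)
    params = [
        (RENAMES.get(name, name), value)
        for name, value in parse_qsl(parts.query)
        if name not in DROPPED
    ]
    # Keep "=" literal inside values so dc.title=frog stays readable.
    query = urlencode(params, safe="=")
    return urlunsplit(parts._replace(query=query))


print(scan_to_search_retrieve(
    "http://myserver.com/sru?operation=scan&version=1.2"
    "&scanClause=dc.title=frog&responsePosition=1&maximumTerms=25"
))
```

The scanClause parameter is passed through untouched, so it acts as the
implicit query type.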

Alternatively, introduce an explicit query type parameter, qtype:

http://myserver.com/sru?qtype=scan&query=dc.title=frog&responsePosition=1&maximumTerms=25


For the query "dc.title=frog" the result set would be the term records where
'frog' is in record position 1 (or zero, subject to debate), and of course
we need to extend the definition of startRecord to allow non-positive
integers. (And add a sortby clause to the query to ensure the
terms/"records" are ordered correctly.)
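
To make the positioning concrete, here is a small Python sketch of how a
server might pick the window of term records. It assumes the terms are
already sorted, that startRecord gives the position the seed term should
occupy in the response (1 = first, 0 = just before the first record), and
that a seed missing from the index is treated as sitting where it would
sort. The function name and those conventions are my assumptions for the
sake of discussion, not settled semantics.

```python
from bisect import bisect_left


def term_window(terms, seed, start_record, maximum_records):
    """Return the slice of sorted `terms` placed relative to `seed`.

    start_record is the 1-based position the seed should occupy in the
    response; 0 or negative values push the window past the seed.
    """
    idx = bisect_left(terms, seed)      # slot of the seed in the index
    start = idx - (start_record - 1)    # first term of the window
    end = start + maximum_records
    # Clip at the start of the index; slicing clips at the end for us.
    return terms[max(start, 0):max(end, 0)]


terms = ["ant", "bee", "cat", "frog", "gnu", "hen"]
print(term_window(terms, "frog", 1, 2))
print(term_window(terms, "frog", 0, 2))
print(term_window(terms, "frog", 2, 2))
```

With startRecord=1 the window starts at 'frog'; with 0 it starts at the
term after 'frog'; with 2 it starts one term before.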


Alternative approach:

Define a scan context set with an index named 'index', whose term values
are the names of the indexes.

So a query clause could be

                     scan.index=dc.creator

The result set would be all of the dc.creator terms. (Again, include a
sortby clause to make sure you get terms in order.) With this approach you
could present terms from the start of the index if you wanted to (you can't
do that with Scan now).

Say you want to include a seed term.  Define another index in the scan
context set, 'term':

scan.index=dc.creator AND scan.term >= <term>

The result set would be all dc.creator terms from <term> onward.
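
As a rough illustration of this alternative, the Python sketch below models
each index term as a record and evaluates the two clauses against those
records. The TermRecord class, the function name, and the use of plain
string comparison for ">=" are assumptions made for the example; a real
server would use whatever collation its indexes use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermRecord:
    index: str  # value matched by scan.index, e.g. "dc.creator"
    term: str   # value matched by scan.term


def scan_records(records, index, seed=None):
    """Evaluate scan.index=<index> [AND scan.term >= <seed>], sorted by term."""
    hits = [
        r.term
        for r in records
        if r.index == index and (seed is None or r.term >= seed)
    ]
    return sorted(hits)  # stands in for the sortby clause


records = [
    TermRecord("dc.creator", "Clark"),
    TermRecord("dc.creator", "Adams"),
    TermRecord("dc.title", "frog"),
    TermRecord("dc.creator", "Baker"),
]
print(scan_records(records, "dc.creator"))
print(scan_records(records, "dc.creator", seed="B"))
```

Without a seed the result starts at the beginning of the index; with a seed
it starts at the first term at or after the seed.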


The OASIS Search Web Services TC solicits feedback on this proposed
approach.
