search-ws message

Subject: Re: [search-ws] [Issue] Server should not be asked to maintain client state

From: Farrukh Najmi <farrukh@wellfleetsoftware.com>
To: "Matthew J. Dovey" <m.dovey@jisc.ac.uk>
Date: Thu, 25 Oct 2007 09:39:11 -0400

Please see inline below...

Matthew J. Dovey wrote:
> As Ashley already pointed out - this is an optionally supported feature
> not mandatory.
>   

It still violates REST principles. Those principles were designed with a 
great deal of practical experience and wisdom.

To adapt a cliche my mother used to say to me as a teenager, "would you 
jump in the well if your friends jumped in the well?": the above is like 
saying the spec should suggest that implementors jump in the well, while 
noting that doing so is optional.

> The feature here is that of a named result set which has a lot of
> history and discussion. When we first started on SRU we had a long
> debate over how stateless we could make the protocol.
>
> For a generally static database we can have true statelessness between
> requests, i.e. 
>
> I can send a query Q and ask for records 1 - 10. 
> I can send the same query Q and ask for records 11-20 knowing that these
> will consecutively follow from the previous 10
> I can send the same query Q and request sort by author and ask for
> records 1 - 5
> I can send the same query Q and request sort by author and ask for
> records 1 - 10 and know that the first five records will be the same as
> in the previous query
>
> For more dynamic databases (e.g. a database of web usage statistics) the
> above does not apply, hence a mechanism for sending a query and having
> the result set from that query cached so that it can be referenced
> later.
>
> Moreover, as Ashley points out, named result sets can actually make
> things more scalable - the scenario of sending a query and asking for
> the first n results, followed by asking for the next n results is very
> common, and the overhead of storing a result set from a given query for
> use in dealing with the second request is less than the overhead of
> reissuing the query!
>   

You make an implementation assumption above: that to fetch 2 windows 
(ranges) of records a server must fetch all the data and return a subset.
Most implementations I know use database features that allow fetching 
only the records for a specified range of indexes. In such a (typical) 
implementation there is no waste in making two separate queries in 
response to two separate requests that execute the same query but with 
different range indexes.
See for example the SQL language features LIMIT/TOP/OFFSET.
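
As a minimal sketch of the point above, assuming a SQLite database and a 
hypothetical "records" table and fetch_window helper (none of which come 
from the spec), each request can fetch just its own window:

```python
import sqlite3


def fetch_window(conn, term, start, count):
    """Return `count` rows starting at 1-based position `start`.

    Hypothetical helper: the database skips to the requested range,
    so no result set has to be kept on the server between requests.
    """
    rows = conn.execute(
        "SELECT id, title FROM records WHERE title LIKE ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (f"%{term}%", count, start - 1),
    )
    return rows.fetchall()


# Illustrative data only: 30 rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO records (title) VALUES (?)",
    [(f"report {i}",) for i in range(1, 31)],
)

# Two independent requests for the same query, with different ranges.
first = fetch_window(conn, "report", 1, 10)
second = fetch_window(conn, "report", 11, 10)
```

The explicit ORDER BY is what makes the second window follow on from the 
first; without a stable ordering the two requests could overlap.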

Pages 88-89 of the "RESTful Web Services" book describe eloquently why 
statelessness is better for client and server design. On scalability it 
says...

"
It is easier to distribute a stateless application across load-balanced 
servers....
Scaling up is as simple as plugging more servers into the load balancer.
A stateless application is also easier to cache...
"

> This may be against REST principles which is why we've always been
> careful to call SRU rest-like ;-) However, I am not convinced that this
> really is in violation of REST - in essence we are not maintaining
> session state between requests - what we are doing is changing state on
> the end server (i.e. asking it to store some information which can be
> accessed later). If changing the state of the end service were in
> violation of REST, then you could never use REST for updating a database
> (for instance).
>
>
>   

Perhaps we are mixing up "application state" and "resource state".

Page 90, "Application State Versus Resource State", of the "RESTful Web 
Services" book makes the distinction and suggests that application state 
must not be on the server. In fact, to illustrate the point it uses the 
very example we are talking about (pagination within a search):

"
There are two kinds of states.... Application state which ought to live 
on the client, and resource state which ought to live on the server....

When you use a search engine, your current query and your current page 
are bits of client state. This state is different for every client....

A web service only needs to care about your application state when you 
are actually making the request. The rest of the time, it does not even 
know you exist...

Resource state is the same for every client, and its proper place is on 
the server....
"

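A rough sketch of that split, under the assumption of a toy in-memory 
catalogue (CATALOGUE, handle_search and SearchClient are hypothetical 
names used only for illustration): the client remembers the query and 
the page position, and the server sees them only as request parameters.

```python
from dataclasses import dataclass

# Resource state: the same for every client, lives on the server.
CATALOGUE = [f"record {i}" for i in range(1, 26)]


def handle_search(query: str, start_record: int, maximum_records: int) -> list[str]:
    """Stateless handler: everything it needs arrives with the request."""
    matches = [r for r in CATALOGUE if query in r]
    begin = start_record - 1  # positions are 1-based in the request
    return matches[begin:begin + maximum_records]


@dataclass
class SearchClient:
    """Application state: the current query and the current page."""
    query: str
    page_size: int = 10
    next_start: int = 1

    def next_page(self) -> list[str]:
        page = handle_search(self.query, self.next_start, self.page_size)
        self.next_start += len(page)  # only the client tracks its position
        return page


client = SearchClient(query="record")
page1 = client.next_page()
page2 = client.next_page()
page3 = client.next_page()
```

Because handle_search keeps nothing between calls, any server behind a 
load balancer could answer any of the three requests.
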
To summarize:

    * There is a distinction between application/client state and
      resource state
    * Maintaining client state on the server is clearly against REST
    * The experience of most experts on scalability quite clearly
      suggests that a stateless application is more scalable.
    * The suggestion that stateful handling of paginated searches is
      more scalable is based on an implementation-specific assumption
      that does not apply to the many implementations that use SQL
      language features such as LIMIT/TOP/OFFSET
    * We have normative but optional features that require the server to
      maintain client state
          o In this issue I propose eliminating those features (optional
            or not) as they encourage questionable design
          o If an implementation wishes to provide the feature, let that
            be outside the spec and implementation-specific

Thanks.

>> -----Original Message-----
>> From: Farrukh Najmi [mailto:farrukh@wellfleetsoftware.com]
>> Sent: 25 October 2007 02:17
>> To: search-ws@lists.oasis-open.org
>> Subject: [search-ws] [Issue] Server should not be asked to maintain
>> client state
>>
>>
>> Section 4.1 "Request Parameters" defines a resultSetTTL parameter which
>> seems to indicate that the server is expected to maintain session state
>> for the client between requests.
>> Can someone please confirm this assumption. If this is true then this is
>> a violation of REST principles. It is also quite messy to implement in a
>> scalable manner.
>>
>> I suggest we identify and remove all features requiring server to
>> maintain session state for the client between requests. Thanks.
>>
