
search-ws message



Subject: RE: [search-ws] queryn: A proposal for SRU to facilitate forms processing



Interesting idea. You mean to use the "queryType" parameter? It's still CQL, right, albeit fragmented. So I guess one could take advantage of that param to signal a fragmented query instead of adding a new param. But then one would still want to communicate the number of search clauses (rows) being generated. Where would that fit in?

My take was that "queryn" would provide the indicator (and thus act as a proxy for the "query" parameter), as well as providing the number of search clauses.

Tony


-----Original Message-----
From: Ray Denenberg, Library of Congress [mailto:rden@loc.gov]
Sent: Wed 12/8/2010 3:41 PM
To: Hammond, Tony; 'LeVan,Ralph'; Denenberg, Ray; 'OASIS SWS TC'
Subject: RE: [search-ws] queryn: A proposal for SRU to facilitate forms processing

Tony - could this be done instead as a separate query type? --Ray



From: Hammond, Tony [mailto:t.hammond@nature.com]
Sent: Wednesday, December 08, 2010 3:45 AM
To: LeVan,Ralph; Denenberg, Ray; OASIS SWS TC
Subject: RE: [search-ws] queryn: A proposal for SRU to facilitate forms
processing



Hi:

Before we dismiss this proposal out of hand, a couple more words are in order:

> I'm not excited by adding this.  It's a very uninteresting class of forms
> clients that can't do javascript.

We offer a commercial search service on our platform based on OpenSearch. We
do use JavaScript liberally on the platform but cannot assume users will
choose to do so. We therefore *require* a failover to server-side
technologies. It's that simple.

> The only way they would appear that way to the server is if the user
> filled in all those fields in the form.

This is not the case. As I mentioned in the postscript to my earlier message,
the initial idea was to fragment the CQL string arbitrarily and then to
recombine the fragments in strict sequence order. However, if term values
were not supplied, the resulting CQL string would be invalid.

Instead I modified this proposal to use a matrix approach whereby search
clauses are numbered sequentially, as rows, and the individual components
(index, relation, term, boolean) are the columns. Hence any search clause
with an empty term value can be skipped entirely, and the terminal boolean
is always omitted. Index, relation and boolean values are supplied by the
form. Only the term values are entered by the user, and only search clauses
that have a term are ever considered.
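
As a rough illustration of the matrix approach described above (a sketch
only, not the oclcsrw code), the snippet below groups the flat form
parameters into rows and drops any row whose term is empty. The function
name "clauseRows" and the sample index names are made up for the example;
"queryn" and the "q{n}.idx", "q{n}.rel", "q{n}.trm", "q{n}.bln" names are
the ones proposed in this thread.

```javascript
// Hypothetical helper: turn the flat key/value pairs sent by a form into
// a list of search-clause rows (index, relation, term, boolean).
function clauseRows(queryString) {
  const params = new URLSearchParams(queryString);
  const n = parseInt(params.get('queryn'), 10) || 0;
  const rows = [];
  for (let i = 1; i <= n; i++) {
    const trm = (params.get(`q${i}.trm`) || '').trim();
    if (trm === '') continue; // empty term: skip the whole row
    rows.push({
      idx: params.get(`q${i}.idx`),
      rel: params.get(`q${i}.rel`),
      trm: trm,
      bln: params.get(`q${i}.bln`),
    });
  }
  return rows;
}

// Sample values only. Row 2 has no term, so rows 1 and 3 survive.
const rows = clauseRows(
  'queryn=3' +
  '&q1.idx=dc.title&q1.rel=any&q1.trm=fish&q1.bln=and' +
  '&q2.idx=dc.creator&q2.rel=%3D&q2.trm=&q2.bln=or' +
  '&q3.idx=dc.date&q3.rel=%3E&q3.trm=2009&q3.bln=and'
);
console.log(rows.map((r) => r.idx)); // the dc.title and dc.date rows
```

Note that an empty text field still arrives as a name with an empty value,
which is why the sketch tests the term value rather than the presence of
the parameter.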

We know that this approach works. This is what we currently use in the
JavaScript of our forms handler, which comes with the
"explainResponse.xsl" stylesheet that ships with the "oclcsrw" package. (Btw,
must say the developer has done us proud here. Excellent job!) The only
differences here are that I have amended the naming somewhat and also the
intent. For naming I have suggested "q{n}.idx" for "index{n}", "q{n}.rel"
for "relat{n}", etc. I have also proposed "queryn" as a user-supplied value
instead of the dynamically computed "maxItems", since this accords better
with "query". (Recall too that "query" - and "queryn" if adopted - is the
signal for a "searchRetrieve" operation.)

The intent too is different in that the current JavaScript is destined for
one form only, whereas this approach is proposed as a general method to be
used by web forms, with client-side reassembly using JavaScript or failover
to server-side reassembly. Most search web forms fit this simple type of
matrix description, however the presentation is crafted.

One of the main problems with SRU adoption is the difficulty of constructing
the CQL querystring, which must be presented intact. I cannot emphasize this
point enough. SRU currently does not allow for fragmented CQL querystrings,
yet these are what a forms interface naturally provides, and a form is the
primary means for an end user to interact with an SRU endpoint. Also, even if
fragmented query components were supported, there is still the difficulty of
reconciling the CQL triple (index, relation, term) with the basic key/value
pair. (Other query languages pay less heed to relations and so map index and
term more readily to key/value pairs.)

This proposal could be accommodated within a non-normative annex as a
general technique for dealing with web forms. However, if there is no
obligation on a server to recognize this technique then it cannot be safely
relied upon, and that necessarily limits the range of clients that SRU can
(or is willing to) support. It may be that SRU will only support
JavaScript-enabled clients.

We ought to be worried.

Tony

ps/
I had this all written out much better (in English too) but lost the whole
text and had to rewrite.




-----Original Message-----
From: LeVan,Ralph [mailto:levan@oclc.org]
Sent: Wed 12/8/2010 5:26 AM
To: Hammond, Tony; Ray Denenberg, Library of Congress; OASIS SWS TC
Subject: RE: [search-ws] queryn: A proposal for SRU to facilitate forms
processing

I'm not excited by adding this.  It's a very uninteresting class of forms
clients that can't do javascript.



But, mostly I don't think it works.  The Google example uses unrelated
parameter names for the parts of the query.  In Tony's example, the
parts are numbered sequentially.  The only way they would appear that
way to the server is if the user filled in all those fields in the form.
What if fields are omitted?  Those fields (with their sequential names)
would not be sent.  We'd have gaps in the numbering.  What if there were
more Booleans than operands?



Let's not.



Ralph



From: Hammond, Tony [mailto:t.hammond@nature.com]
Sent: Tuesday, December 07, 2010 11:30 AM
To: Ray Denenberg, Library of Congress; OASIS SWS TC
Subject: [search-ws] queryn: A proposal for SRU to facilitate forms
processing



Hi:

I wanted to put this (modest) proposal for SRU forward and get some
feedback.

One of the differences between SRU and other general search interfaces
is that the actual query (CQL string) is contained within a single
parameter and not scattered across several parameters, as e.g. this
search in Google:


http://www.google.co.uk/search?q=this+-that=en=10==i=countryAU=images=qdr:w

This is a query for "this" and not "that" in Australian sites in the
past week.

  &q=this+-that
  &cr=countryAU
  &tbs=qdr:w

Yep, it's a bit of a mess. :) Mixes together query and control params.
But still it's straightforward to map to from a forms interface. I
always think of traditional query interfaces as being 1-D and SRU as
being 2-D: one dimension for query, and the other for control. And this
separation of concerns is both a blessing and a curse. A curse
especially for implementors.
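
To make the "1-D" point concrete: the three parameters listed above decode
into one flat set of key/value pairs, with the query ("q") and the control
parameters ("cr", "tbs") sitting side by side. A small sketch, using only
those three parameters rather than the full URL:

```javascript
// Only the three parameters spelled out above are used here.
const flat = new URLSearchParams('q=this+-that&cr=countryAU&tbs=qdr:w');

for (const [key, value] of flat) {
  console.log(key + ' => ' + value);
}
// In form encoding '+' stands for a space, so "q" decodes to "this -that".
```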

Now one of the difficulties with a forms input for SRU is that the CQL
query needs to be composed before it is added to the querystring as a
single parameter, which usually means some clever stylesheet handling of
the query fields (which we are currently using from the oclcsrw package)
or some other preprocessing method.

I was wondering whether, if SRU had a new parameter (say "queryn") which
gave the integer number of search clauses across which the query was
fragmented, the query could then simply be recomposed in a predetermined
fashion.

E.g. if one had something like:

  &queryn=2
  &q1.idx=index1
  &q1.rel=relation1
  &q1.trm=term1
  &q1.bln=boolean1
  &q2.idx=index2
  &q2.rel=relation2
  &q2.trm=term2
  &q2.bln=boolean2

then the parameters could be sent direct from the form without any
handling and composed on the server side by following a simple rule,
i.e. concatenation of the (known number of) search clause components with
whitespace separators, and concatenation of search clauses with
(whitespaced) booleans. So, in the above example with queryn=2 it would
be straightforward for a querystring builder to look for params "q1.*"
through "q2.*" and build the CQL query as

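  // "params" is assumed to hold the decoded form parameters by name.
  var query = '', bln = '';
  for (var i = 1; i <= Number(params.queryn); i++) {
    var q = 'q' + i;
    if (!params[q + '.trm']) { continue; }  // skip clauses with an empty term
    // join to the previous kept clause using that clause's boolean
    if (query) { query += ' ' + bln + ' '; }
    query += params[q + '.idx'] + ' ' + params[q + '.rel'] + ' ' + params[q + '.trm'];
    bln = params[q + '.bln'];
  }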

i.e.

  query = q1.idx + ' ' + q1.rel + ' ' + q1.trm + ' ' + q1.bln + ' ' +
          q2.idx + ' ' + q2.rel + ' ' + q2.trm
        
As long as a form laid out query components in a defined (numbered)
fashion and then declared the total number of search clauses then the
query builder just needs to iterate over the known number of search
clauses.
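
A worked instance of the rule above, with concrete sample values, might
look like the sketch below. It goes slightly beyond the proposal in two
ways, both of which are assumptions on my part: terms containing
whitespace or CQL's special characters are wrapped in double quotes (with
embedded double quotes backslash-escaped), and a clause with an empty term
is skipped, with the boolean of the previous kept clause used as the
joiner. The function names and the dc.* index names are illustrative only.

```javascript
// Quote a term unless it is a plain token with no CQL special characters.
// Simplification: only embedded double quotes are escaped.
function cqlTerm(term) {
  if (/^[^\s"()=<>\/]+$/.test(term)) return term;
  return '"' + term.replace(/"/g, '\\"') + '"';
}

// Join clause rows into one CQL string. The boolean of the last kept row
// is never emitted.
function assembleCql(rows) {
  let query = '';
  let pending = 'and'; // boolean carried over from the previous kept row
  for (const row of rows) {
    if (!row.trm) continue; // empty term: skip the clause
    const clause = `${row.idx} ${row.rel} ${cqlTerm(row.trm)}`;
    query = query === '' ? clause : `${query} ${pending} ${clause}`;
    pending = row.bln || 'and'; // assumption: default to "and" if missing
  }
  return query;
}

console.log(assembleCql([
  { idx: 'dc.title', rel: 'all', trm: 'deep sea', bln: 'and' },
  { idx: 'dc.creator', rel: '=', trm: '', bln: 'or' },
  { idx: 'dc.date', rel: '>', trm: '2009', bln: 'and' },
]));
// prints: dc.title all "deep sea" and dc.date > 2009
```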

Alternatively the query could be assembled on the client using JavaScript
such as the "mungeForm" function we have on nature.com OpenSearch via
the oclcsrw package. And if a client had disabled JavaScript then the
server itself could detect the "queryn" parameter and reassemble the
query. Of course this really means that

  searchRetrieve = query | queryn (=> query = q1.* + q2.* + ...)
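
On the server side, that rule might be sketched as a small dispatcher like
the one below. The names "resolveQuery" and "demoAssemble" are
hypothetical, and letting an intact "query" take precedence when both
parameters are present is my assumption; nothing in the proposal settles
that case.

```javascript
// Hypothetical dispatcher: decide whether a request is a searchRetrieve
// and, if so, where its CQL comes from. `assemble` is whatever routine
// rebuilds the query from the q{n}.* parameters.
function resolveQuery(params, assemble) {
  if (params.has('query')) return params.get('query'); // intact CQL
  if (params.has('queryn')) return assemble(params);   // fragmented CQL
  return null;                                         // not a searchRetrieve
}

// Stand-in assembler for the demonstration only (first clause, no checks).
const demoAssemble = (p) =>
  `${p.get('q1.idx')} ${p.get('q1.rel')} ${p.get('q1.trm')}`;

console.log(resolveQuery(
  new URLSearchParams('query=dc.title+any+fish'), demoAssemble));
console.log(resolveQuery(
  new URLSearchParams('queryn=1&q1.idx=dc.title&q1.rel=any&q1.trm=fish'),
  demoAssemble));
console.log(resolveQuery(
  new URLSearchParams('operation=explain'), demoAssemble));
// The first two calls print "dc.title any fish"; the third prints null.
```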

Such an extension to SRU could certainly provide ample support for
simple forms, which is what most forms in practice are, without
requiring special JavaScript or bespoke handling. Of course, it is very
limiting in terms of query expressivity, although it does map reasonably
well to standard form inputs.

What do you think? Interested to hear your feedback on this general
approach to (re)assembling CQL queries.

Thanks,

Tony

ps/
In an earlier attempt I had considered just breaking a CQL query into an
arbitrary number of string fragments which could be resequenced into a
complete CQL string but ran into a problem concerning empty terms which
would break the validity of the CQL. Hence this revised approach which
is more of a matrix method with index, relation, term (and boolean)
correlated and identified by row order.


************************************************************************
DISCLAIMER: This e-mail is confidential and should not be used by anyone
who is not the original intended recipient. If you have received this
e-mail in error please inform the sender and delete it from your mailbox
or any other storage mechanism. Neither Macmillan Publishers Limited nor
any of its agents accept liability for any statements made which are
clearly the sender's own and not expressly made on behalf of Macmillan
Publishers Limited or one of its agents.
Please note that neither Macmillan Publishers Limited nor any of its
agents accept any responsibility for viruses that may be contained in
this e-mail or its attachments and it is your responsibility to scan the
e-mail and attachments (if any). No contracts may be concluded on behalf
of Macmillan Publishers Limited or its agents by means of e-mail
communication. Macmillan Publishers Limited Registered in England and
Wales with registered number 785998
Registered Office Brunel Road, Houndmills, Basingstoke RG21 6XS
************************************************************************

