Subject: Re: [search-ws] Groups - Contextual Query Language (cql-first-draft-april-6-2009.doc) uploaded
Thanks Nick for the comments. I've gone through them all and will issue a new draft soon. Comments are discussed below, those not covered were straightforward and have been addressed in the next draft.

--Ray

----- Original Message -----
From: "Nick Nicholas" <opoudjis@optushome.com.au>
To: <search-ws@lists.oasis-open.org>
Sent: Monday, April 13, 2009 2:36 AM
Subject: Re: [search-ws] Groups - Contextual Query Language (cql-first-draft-april-6-2009.doc) uploaded

> 76: " The empty search term [example d] has no defined semantics."

Eliminated this prose and the spec is simply silent on the matter.

> 56: Context sets are repeatedly mentioned in the document, but are not
> introduced until line 193. This is confusing, and I don't see why 2.3
> can't move as is to the start of the document.

I have completely reworked the relevant parts of the spec to address this.

> 86: "If multiple '.' characters are present, then the first should be
> treated as the prefix/base name delimiter". This means that a context set
> name cannot start with a dot?

Right. Why would you want it to?

>
> 87: "If the prefix is not supplied, it is determined by the server". Say
> explicitly it is cql.anyIndexes. Why is there a distinction between
> cql.anyIndexes and cql.serverChoice?

For discussion, next call.

> 131: "the prox operator is". Sentence incomplete. (I'm not really
> surprised... :-) Forward reference to 2.1.9?
>
> 149: "Within the CQL set they [proximity terms] are explicitly undefined,
> subject to interpretation by the server." How can I find out how the
> server has chosen to interpret them? I find the refusal to define their
> behaviour a concern. People will make the obvious orthographically-based
> assumptions about the meaning of word, sentence, paragraph, and will not
> be happy if that is not what's implemented.

I think the idea is that if you want a specific interpretation, use a context set.
If we want the server to be able to explain how it interprets them, it could be part of Explain. However, I think the idea, further, is that the server's interpretation could depend on certain aspects of the query, giving different interpretations for different queries, and so would be hard to explain. For discussion at next call.

> BNF: comparator, not comparitor

I'll leave this for whoever volunteers to work on the BNF (we had a big debate on "comparitor" vs. "comparator" last time we discussed the BNF).

> 203: "When defining a new context set, it is necessary to provide a
> description of the semantics of each item within it". No minimum
> requirements for this description are provided.

I've deleted the sentence.

> 234: "cql.resultSetId = "a" AND cql.resultSetId = "b" " I'm surprised
> this works, since the instance of the record is unique to a result set.
> Does the wording of the response set data model explicitly license such
> result set manipulation?

This assumes that CQL is being used with a protocol that declares a result set model. I've added a note to that effect.

> 244: "allIndexes". Remind readers that this is not equivalent to a full
> text search.

For discussion at next call. I'm not sure it wouldn't be reasonable for a server to treat this as full text search.

> 258: "keywords". Note that the search terms in the keywords index need
> not be present in any other defined index.

I agree; however, it says "Exactly which fields make up this index is determined by the server," which implies that the index is constructed from other indexes. For discussion at next call.

> 271. "=". How can I find out how a server has implemented "="?

From Explain (which Ralph and Janifer are working on).

>
> 284. "==". Remind readers that CQL does not strip whitespace, so the
> index better had.

For discussion at next call.

> 301. "<" etc. I would insert a textual comparison example, since textual
> comparisons are defined (subject to the locale), and comparisons are not
> limited to numbers.

I'm not sure that the CQL set is where lexical relations should be defined. We had a discussion of this a couple of years ago and there was mention of a "lexical" context set. For discussion at next call.

>
> 311. "adj". You've dodged any mention of word delimiters in the adjacency
> definition, but clearly adjacency is meaningless without the notion of a
> delimiter. The delimiter, again, is determined by the locale.

The delimiter is intended to be understood to be space. We could add a relation modifier "delimiter=xx". For discussion.

>
> 352. "stem". Being the same stem is different from being the same lemma,
> and often lemma is what you actually want (e.g. "computer" and
> "computers" but not "computing".) I'd make the distinction here ---
> especially for languages not as morphology-poor as English.

For discussion.

>
> 362. "partial". Word fragments could also usefully be searched in normal
> searches.

Example, please.

>
> 375. "locale=value". Rather than giving illustrative examples, say that
> locales are used as understood under Unix, and refer to a more canonical
> listing (in whatever gizzards of BSD or Java that might reside). It'd be
> nice if you could move away from "C" as a locale and just used ISO, but
> that's a big ask....

Someone else can write this section since I don't understand it.

>
> 495. "container=field" sits clumsily with how indexes are normally
> defined. Is a query like "author = jack prox author = jones" well-
> defined? Is "author = jack prox title = jones"? (It shouldn't be.) Is
> "author = jack prox/container=title author = jones"? (Again it shouldn't
> be).

The latter two are not well-defined, and the first is debatable. But the examples in the draft are well-defined, I think:

name=jones prox/container=author date=1950

Find the name 'jones' and date '1950' in the same author field.
The semantics of this query are clear, and it is up to the server to determine how to process it. It doesn't mean that there necessarily are name and date subfields within the author field, though there may be, or the server applies some algorithm to determine what is the date and what is the name.
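
To make the "same author field" reading concrete, here is a minimal Python sketch of one way a server might evaluate that query. Everything in it is an assumption for illustration only: the record layout (a dict of repeatable fields), the helper names tokens and prox_container, and the choice to match on plain word tokens without separate name and date subfields. It is not what the draft requires, just one interpretation a server could pick.

```python
import re

def tokens(text):
    """Split a field value into lowercase word tokens.
    Assumption: anything that is not a letter or digit is a delimiter."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def prox_container(record, container, left_term, right_term):
    """Hypothetical evaluation of: left prox/container=<container> right.
    True only if both terms occur in the SAME instance of the container."""
    for value in record.get(container, []):
        words = tokens(value)
        if left_term.lower() in words and right_term.lower() in words:
            return True
    return False

# Two author fields; 'jones' and '1950' share one of them.
record = {"author": ["Jones, Tom, 1950-", "Smith, Ann, 1962-"]}
# Both terms are present, but in different author fields.
other = {"author": ["Jones, Tom, 1948-", "Smith, Ann, 1950-"]}

print(prox_container(record, "author", "jones", "1950"))  # True
print(prox_container(other, "author", "jones", "1950"))   # False
```

The second record shows why the container matters: a plain AND of the two terms would match it, while the container-scoped proximity does not.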