cmis message



Subject: [OASIS Issue Tracker] Commented: (CMIS-144) Full text search syntax and semantics



    [ http://tools.oasis-open.org/issues/browse/CMIS-144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10700#action_10700 ] 

David Choy commented on CMIS-144:
---------------------------------

What is the precise semantics for a Boolean conjunction (AND)?

The CONTAINS() function does not specify the FT search scope, simply because a repository's FTS capability largely depends on which objects and which values of an object are FT-indexed, and on what source identities are captured in the index. For some repositories, a single document may have multiple FT-indexed values: the separate values of a multi-valued property, the values of different properties, and the content stream. This offers a finer-grained FTS capability. For others, all of these are treated as parts of a single logical FT value.

Now, a Boolean expression "proposition-A AND proposition-B" is true iff both prop-A and prop-B are true for the same value. What is "the same value" here? Does it mean the same property value or the same content stream? An FTS implementation that treats everything indexed for a document as part of a single logical value would not be able to support this fine-grained semantics. Or should it mean everything associated with a document that is FT-indexed? An FTS implementation that uses the property value id, rather than the document id, as the source id would have difficulty supporting this coarse semantics without a great deal of post-processing.
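
To make the two readings concrete, here is a minimal Python sketch. It is not taken from the CMIS draft or from any repository implementation: the Document class, the value names ("title", "content_stream"), and the naive whole-word matcher are all hypothetical, chosen only to show how the same AND expression can evaluate differently depending on what counts as "the same value".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Document:
    """Hypothetical document; each dict entry is one separately FT-indexed value."""
    doc_id: str
    values: Dict[str, str] = field(default_factory=dict)


def has_term(text: str, term: str) -> bool:
    # Naive case-insensitive whole-word match (assumption: a real FT engine
    # would tokenize, stem, and rank rather than split on whitespace).
    return term.lower() in text.lower().split()


def and_per_value(doc: Document, a: str, b: str) -> bool:
    # Fine-grained reading: both terms must occur in the SAME indexed value.
    return any(has_term(v, a) and has_term(v, b) for v in doc.values.values())


def and_per_document(doc: Document, a: str, b: str) -> bool:
    # Coarse reading: each term may occur in ANY indexed value of the document.
    found_a = any(has_term(v, a) for v in doc.values.values())
    found_b = any(has_term(v, b) for v in doc.values.values())
    return found_a and found_b


doc = Document(
    doc_id="doc-1",
    values={
        "title": "Budget report",
        "content_stream": "Quarterly forecast for the finance team",
    },
)

# "budget" is only in the title and "forecast" only in the content stream,
# so the two readings disagree for this document.
fine = and_per_value(doc, "budget", "forecast")
coarse = and_per_document(doc, "budget", "forecast")
print(fine, coarse)
```

Under these assumptions, a repository whose index is keyed by property value would naturally compute the first function, and one that folds everything into a single logical value per document would compute the second, which is exactly the portability question raised above.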

The same ambiguity applies to disjunction (OR) and negation (-), although the problem may be less severe in practice.

Should there be a fixed semantics, or would this be repository-specific?

> Full text search syntax and semantics
> -------------------------------------
>
>                 Key: CMIS-144
>                 URL: http://tools.oasis-open.org/issues/browse/CMIS-144
>             Project: OASIS Content Management Interoperability Services (CMIS) TC
>          Issue Type: Bug
>          Components: Domain Model
>    Affects Versions: Draft 0.60
>            Reporter: David Caruana
>            Assignee: Ethan Gur-esh
>
> The text search expression is defined as a <character string literal> (as defined by SQL-92).  However, the syntax and semantics of the full text search expression are repo specific.
> I remember there was some resistance to defining a 'lowest common denominator' full text search language, but I don't remember why.
> Given that we define SQL, and that query is a key use case, I think there's value in a deeper FTS definition.
> As a starting point, JCR provides a minimal definition. I'm not sure we would need to go much further than that to start with.


        

