Subject: Re: ContentBasedQuery questions/ideas.
On Thursday 06 September 2001 12:59, Len Gallagher wrote:
>
> QUESTION-1
>
> If the <IndexCreationRequest> only has two input parameters, then the
> Client doesn't have very much control over how heavily the individual input
> repository item gets indexed. The Repository would have to provide choices
> for how much indexing is done. In your Word example, there might be several
> ContentHandlers for Word available:
>
> HeavyWordHandler -- indexes all headings level 1 to 4, indexes size,
> indexes all words in text paragraphs, etc.
>
> LightweightWordHandler -- indexes level 1 headings only.
>
> MiddleweightWordHandler -- indexes level 1 headings and words in Index.
>
> Would the <IndexCreationRequest> be attached to every submission of a Word
> document, or would a SubmittingOrganization make one <IndexCreationRequest>
> that applied to all subsequent submissions by that SO? Or could some
> ResponsibleOrganization make one <IndexCreationRequest> that applied to all
> submissions where that organization was identified as the RO?

My thinking was that the MIME type of the entry content would determine
which indexing handler is used. For example, one of my Word documents might
have a MIME type of application/msword.heavyindex. Admittedly, this doesn't
seem like a very elegant solution, so maybe some index attributes could be
included at submission time to override the default handler, e.g.

  <ContentIndexParameters>
    <UseHandler>LightweightWordHandler</UseHandler>
    <ArgumentList>
      <Argument name="remove_smart_quotes" value="true" />
    </ArgumentList>
  </ContentIndexParameters>

> QUESTION-2
>
> Would it make sense to register the ContentHandlers in the Registry? If so,
> then a Client could issue a query to the Registry to find out what "Slot
> names" and "Slot name data types" to use for retrieving documents indexed
> by that handler. One advantage of what's in Appendix D of ebRS is that we
> already have a mechanism for defining and handling classifications. If we
> use Slots for storing the index fields we'll have to invent a way to convey
> that same kind of information to users.
>

Could you explain what you mean a little bit more here? I would think that
one could use SlotFilter to "prequalify" entries prior to testing them
against the content query. I humbly defer to you guys on this, as I am
slightly behind the ball on Filter Query.

-- 
Matthew MacKenzie
XML Global

<quote> I used to be an agnostic, but now I'm not so sure. </quote>
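
A rough sketch of the handler selection suggested in the reply to QUESTION-1, written in Python purely for illustration. It assumes the MIME type picks a default handler and that an optional <ContentIndexParameters> fragment supplied at submission time overrides it. The function name select_handler and the DEFAULT_HANDLERS table (including which handler plain application/msword maps to) are hypothetical; nothing here is taken from ebRS or from an existing implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults: which handler each MIME type would select.
# The mapping for plain application/msword is an assumption for this sketch.
DEFAULT_HANDLERS = {
    "application/msword": "MiddleweightWordHandler",
    "application/msword.heavyindex": "HeavyWordHandler",
}


def select_handler(mime_type, params_xml=None):
    """Return (handler_name, arguments) for one submission.

    The MIME type chooses the default handler; a <UseHandler> element in
    the optional <ContentIndexParameters> fragment overrides that default.
    """
    handler = DEFAULT_HANDLERS.get(mime_type)
    arguments = {}
    if params_xml:
        root = ET.fromstring(params_xml)
        override = root.findtext("UseHandler")
        if override:
            handler = override.strip()
        # Collect every <Argument name="..." value="..."/> as a dict entry.
        for arg in root.iter("Argument"):
            arguments[arg.get("name")] = arg.get("value")
    return handler, arguments


# The example fragment from the message above.
PARAMS = """
<ContentIndexParameters>
  <UseHandler>LightweightWordHandler</UseHandler>
  <ArgumentList>
    <Argument name="remove_smart_quotes" value="true" />
  </ArgumentList>
</ContentIndexParameters>
"""

if __name__ == "__main__":
    print(select_handler("application/msword"))
    print(select_handler("application/msword", PARAMS))
```

Keeping the override in a separate parameters fragment would leave the MIME type free to describe only the content, which is the concern raised about the application/msword.heavyindex approach.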