Dear Colleagues,
I have been working on supporting the publishing,
management, and discovery of image content (such as
JPEG, PNG, and GIF files) in an ebXML RegRep.
I would like to discuss ideas and initial thoughts
on an extension profile specification with the
proposed title "ebXML RegRep Profile for Image
Resources".
During the course of this work I found that, unlike
other profiles, the Image Profile requires a much
larger number of metadata attributes by which image
resources can be searched. These metadata
attributes are defined by specifications such as
[EXIF-2.2] and [IPTC-2008].
I also found that many of these attributes are
numeric, and search needs to support matching
numeric attributes such as imageWidth and
imageLength against a min/max range.
These differences made it rather unwieldy to define
a canonical parameterized query for discovering
image resources. The traditional approach would
simply require too many parameters.
The existing canonical Adhoc Query would be the
solution to this
problem. However, RegRep core does not define a
standard query language syntax that could allow
using the Adhoc Query in an interoperable manner
across registry implementations. One registry could
support SQL queries while another could support
XQuery, and the query schema could be quite
different even for the same query language.
This led me to consider CQL [SearchRetrievePt5]
as an implementation-neutral query language syntax
for querying image profile data as well as any other
profile's data and core ebrim RegistryObject
metadata. The basic idea is to define a CQL context
set for each such profile. The context set defines a
set of indexes that can be used in searching data
for that context. For example, in the Image Profile
you can have exif.imageWidth and exif.imageHeight
indexes that allow searching for images by their
pixel width and height. The index definition would
also specify how the index relates to
RegistryObjects for that type of data (e.g. images).
This information would include the data type of each
index as well as the set of relations that can be
used with it (e.g. "=", "<", ">", etc.).
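As a rough illustration of the context set idea, the sketch below models
a few index definitions in Python and checks a single predicate against
them. Everything in it is an assumption made for illustration: the
IndexDef class, the IMAGE_CONTEXT_SET table, the check_predicate helper,
and the particular data types and relations chosen are hypothetical and
are not taken from CQL or from either proposed profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexDef:
    """One index in a context set (hypothetical structure)."""
    name: str             # e.g. "exif.imageWidth"
    data_type: type       # Python type standing in for the index data type
    relations: frozenset  # relations permitted for this index


NUMERIC_RELATIONS = frozenset({"=", "<", ">", "<=", ">="})

# Hypothetical context set for the Image Profile; the real set would be
# defined by the profile specification.
IMAGE_CONTEXT_SET = {
    d.name: d
    for d in (
        IndexDef("exif.imageWidth", int, NUMERIC_RELATIONS),
        IndexDef("exif.imageHeight", int, NUMERIC_RELATIONS),
        IndexDef("iptc.intellectualGenre", str, frozenset({"="})),
    )
}


def check_predicate(index, relation, value):
    """Validate one predicate and return the value coerced to the index type."""
    definition = IMAGE_CONTEXT_SET.get(index)
    if definition is None:
        raise ValueError(f"unknown index: {index}")
    if relation not in definition.relations:
        raise ValueError(f"relation {relation!r} not allowed for {index}")
    try:
        return definition.data_type(value)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"bad value for {index}: {value!r}") from exc
```

With this table, check_predicate("exif.imageWidth", ">=", "300") yields
the integer 300, while a "<" relation on iptc.intellectualGenre is
rejected because that index only declares "=".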
Here are some examples of CQL queries that could be
used to search for image content. A query could
contain any number of predicates combined using
boolean operators like AND, OR.
- Find by creation date:
  exif.dateTime >= "2008-07-13T21:05:34"
- Find images where width and height are both >= 300 pixels:
  exif.imageWidth >= 300 AND exif.imageHeight >= 300
- Find by f-number:
  exif.fNumber >= 2.8
- Find images by GeoLocation:
  exif.geoLocation WITHIN POLYGON((59 22, 78 22, 78 38, 59 38, 59 22))
- Find by creator:
  iptc.creator.name = "*farrukh*najmi*"
- Find by genre:
  iptc.intellectualGenre = "wildlife"
The use of CQL queries can be specified using the
existing RegRep Adhoc Query protocol with a new CQL
query language. All of this is proposed to be defined
in an extension profile specification with the proposed
title "ebXML RegRep Profile for Contextual Query
Language (CQL)". No protocol changes or changes to
RegRep core would be needed to support CQL as an
extension profile.
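To make the intended query semantics a little more concrete, here is a
minimal Python sketch of how a registry might evaluate simple CQL-style
predicates, such as the width and height example earlier in this
message, against one image's metadata. It is an illustration under
stated assumptions, not a CQL implementation: the matches function and
its regular expressions are hypothetical, only the relations =, <, >,
<= and >= are handled, boolean operators are applied strictly left to
right, and wildcards, parentheses and WITHIN POLYGON are not supported.
Metadata values are assumed to have types comparable with the query
values.

```python
import operator
import re

RELATIONS = {
    ">=": operator.ge, "<=": operator.le,
    "=": operator.eq, ">": operator.gt, "<": operator.lt,
}

# One predicate: index name, relation, then a quoted string or bare token.
PREDICATE = re.compile(r'\s*([\w.]+)\s*(>=|<=|=|>|<)\s*("[^"]*"|\S+)\s*')
BOOLEAN = re.compile(r'(AND|OR)\b', re.IGNORECASE)


def matches(query, metadata):
    """Return True if the metadata dict satisfies the query string."""
    pos, result, pending = 0, None, None
    while True:
        m = PREDICATE.match(query, pos)
        if m is None:
            raise ValueError(f"cannot parse predicate at offset {pos}")
        index, relation, raw = m.groups()
        # Quoted tokens are strings; bare tokens are treated as numbers.
        value = raw[1:-1] if raw.startswith('"') else float(raw)
        actual = metadata.get(index)
        outcome = actual is not None and RELATIONS[relation](actual, value)
        if pending is None:
            result = outcome
        elif pending == "AND":
            result = result and outcome
        else:
            result = result or outcome
        pos = m.end()
        if pos == len(query):
            return result
        b = BOOLEAN.match(query, pos)
        if b is None:
            raise ValueError(f"expected AND or OR at offset {pos}")
        pending = b.group(1).upper()
        pos = b.end()
```

For an image with metadata {"exif.imageWidth": 640,
"exif.imageHeight": 200}, the query
exif.imageWidth >= 300 AND exif.imageHeight >= 300 does not match,
while the same predicates joined by OR do.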
The outlines for both proposed extension specs are
available in our wiki here for your review:
I would like to propose that we discuss these two
proposals at our next TC meeting on August 17, 2012
at 12 PM ET.
Please let me know (off list) if there is any chance
that you will be unable to attend. In the meantime,
please share your thoughts on this email thread.
Thank you.