Dear Colleagues,
I have been working on supporting the publishing, management,
and discovery of image content (files in formats such as
JPEG, PNG, and GIF) in an ebXML RegRep.
I would like to discuss ideas and initial thoughts on an
extension profile specification with the proposed title
"ebXML RegRep Profile for Image Resources".
During the course of this work I found that, unlike other
profiles, the Image Profile needs a much larger number of
searchable metadata attributes for image resources. These
metadata attributes are defined by specifications such as
[EXIF-2.2] and [IPTC-2008].
I also found that many of these attributes are numeric, and
search needs to support matching numeric attributes such as
imageWidth and imageLength against a min/max range. These
differences made it rather unwieldy to define a canonical
parameterized query for discovering image resources. There
would simply be too many parameters in the traditional
approach.
The existing canonical Adhoc Query would be the solution to
this problem. However, RegRep core does not define a standard
query language syntax that would allow using the Adhoc Query
in an interoperable manner across registry implementations.
One registry could support SQL queries while another could
support XQuery, and the query schema could be quite different
even for the same query language.
This led me to consider CQL [SearchRetrievePt5] as an
implementation-neutral query language syntax for querying
image profile data as well as any other profile's data and
core ebRIM RegistryObject metadata. The basic idea
is to define a CQL context set for each such profile. The
context set defines a set of indexes that can be used in
searching data for that context. For example, the Image
Profile could have exif.imageWidth and exif.imageHeight
indexes that allow searching for images by their pixel
width and height. The index definition would also specify
how the index relates to RegistryObjects for that type of
data (e.g. images). This information would include the data
type of each index as well as the set of relations that can
be used with it (e.g. "=", "<", ">").
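
To make the context set idea concrete, here is a minimal Python
sketch of how such index definitions could be represented and
checked. The index names come from this email; the IndexDef class,
the IMAGE_CONTEXT_SET table, the check_predicate helper, and the
specific data types and relation sets are assumptions made for
illustration only and are not part of any proposed specification.

```python
from dataclasses import dataclass

# Hypothetical representation of one index in a CQL context set.
@dataclass(frozen=True)
class IndexDef:
    name: str             # index name, e.g. "exif.imageWidth"
    data_type: type       # Python type standing in for the index data type
    relations: frozenset  # relations permitted for this index

ORDERED = frozenset({"=", "<", ">", "<=", ">="})

# Assumed index definitions; types and relations are illustrative only.
IMAGE_CONTEXT_SET = {
    d.name: d
    for d in (
        IndexDef("exif.imageWidth", int, ORDERED),
        IndexDef("exif.imageHeight", int, ORDERED),
        IndexDef("exif.fNumber", float, ORDERED),
        IndexDef("iptc.intellectualGenre", str, frozenset({"="})),
    )
}

def check_predicate(index, relation, value):
    """Return True if the index is known, the relation is allowed
    for it, and the value has the declared data type."""
    definition = IMAGE_CONTEXT_SET.get(index)
    if definition is None:
        return False
    return relation in definition.relations and isinstance(
        value, definition.data_type
    )
```

With these assumed definitions, check_predicate("exif.imageWidth",
">=", 300) is accepted, while a "<" relation on
iptc.intellectualGenre is rejected because only "=" is declared
for that index.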
Here are some examples of CQL queries that could be used to
search for image content. A query could contain any number
of predicates combined using boolean operators such as AND
and OR.
- Find by creation date:
  exif.dateTime >= "2008-07-13T21:05:34"
- Find images where width and height are both >= 300 pixels:
  exif.imageWidth >= 300 AND exif.imageHeight >= 300
- Find by f-number:
  exif.fNumber >= 2.8
- Find images by GeoLocation:
  exif.geoLocation WITHIN POLYGON((59 22, 78 22, 78 38, 59 38, 59 22))
- Find by creator:
  iptc.creator.name = "*farrukh*najmi*"
- Find by genre:
  iptc.intellectualGenre = "wildlife"
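
As a rough illustration of how a client could assemble queries
like the ones above, including the min/max range case, here is a
small Python sketch. The helper names (cql_term, cql_clause,
cql_and, cql_range) are hypothetical and exist only for this
example. It assumes numeric terms are written bare and string
terms are double-quoted with backslash escaping.

```python
def cql_term(value):
    """Render a value as a CQL term: numbers bare, strings quoted."""
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def cql_clause(index, relation, value):
    """Build a single search clause: index relation term."""
    return f"{index} {relation} {cql_term(value)}"

def cql_and(*clauses):
    """Combine clauses with the boolean operator AND."""
    return " AND ".join(clauses)

def cql_range(index, minimum, maximum):
    """Express an inclusive min/max range as two clauses joined by AND."""
    return cql_and(
        cql_clause(index, ">=", minimum),
        cql_clause(index, "<=", maximum),
    )

# Images at least 300 pixels wide and 300 pixels high.
query = cql_and(
    cql_clause("exif.imageWidth", ">=", 300),
    cql_clause("exif.imageHeight", ">=", 300),
)
```

The resulting string could then be carried as the query
expression of an Adhoc Query request, with the registry
translating it into whatever query language it uses internally.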
The use of CQL queries can be specified using the existing
RegRep Adhoc Query protocol with a new CQL query language.
All this is proposed to be defined as an extension profile
specification with the proposed title "ebXML RegRep Profile
for Contextual Query Language (CQL)". No protocol changes or
changes to RegRep core would be needed to support CQL as an
extension profile.
The outlines for both proposed extension specs are available
in our wiki here for your review:
I would like to propose that we discuss these two proposals
at our next TC meeting on August 17, 2012 at 12 PM ET.
Please let me know (off list) if there is any chance that
you will be unable to attend. In the meantime, please share
your thoughts on this email thread. Thank you.
--
Regards,
Farrukh Najmi
Web: http://www.wellfleetsoftware.com