

Subject: Re: [search-ws] sru:recordPacking = unpacked ?


Hi Ray:

Great. We’re on the way. :)

Let me talk about XML, but a word about JSON first. We have used a direct ATOM mapping to inform our JSON structure. There are a couple issues here. JSON does not support repeated keys so we need to gather repeated properties up into arrays keyed off a single name, e.g. “entry”: [ {...}, {...}, ...] for multiple entries. There’s also the namespaces issue. But if you look at our JSON you will see that it is just a fairly faithful JSON mapping of an ATOM document. (Note that the Google AJAX Feed API has a more bastardized version of ATOM mapping to JSON: <http://code.google.com/apis/ajaxfeeds/documentation/#JSON>.)
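
To illustrate the repeated-keys point, here is a minimal Python sketch of the gathering step. It is only an illustration under stated assumptions: the function name `children_to_dict` and the sample feed are made up, and namespaces are ignored entirely.

```python
import json
import xml.etree.ElementTree as ET

def children_to_dict(parent):
    """Map child elements to a dict, gathering repeated tags into a list."""
    result = {}
    for child in parent:
        # Leaf elements map to their text; nested elements recurse.
        value = children_to_dict(child) if len(child) else child.text
        if child.tag in result:
            # JSON has no repeated keys, so promote the value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

# Hypothetical two-entry feed, stripped of namespaces for brevity.
feed = ET.fromstring(
    "<feed>"
    "<title>Results</title>"
    "<entry><title>First</title></entry>"
    "<entry><title>Second</title></entry>"
    "</feed>"
)
mapped = children_to_dict(feed)
print(json.dumps(mapped, indent=2))
```

One limitation of this sketch: a feed with a single entry yields an object rather than a one-element array, so a real mapping would probably fix in advance which names are always arrays.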

Yes, this record data unpacking is a meaningful issue for XML formats too. The real issue is not the format as such so much as the schema constraining the format.

SRU/XML is an XML format governed by a W3C schema, and so can be considered a ‘closed’ format. This has upsides and downsides. The upside is for inventory management: elements will be present as declared in the schema, in the quantity specified by the schema, and with values typed as specified in the schema. The downside is that foreign elements are not admitted (hence the format is ‘closed’), and type specification for vanilla string-type properties does not really add very much.

By contrast, ATOM and RSS are both ‘open’ XML formats and are not governed by a restrictive schema practice.

== ATOM

RFC 4287 (“The Atom Syndication Format”) has this to say about schema in Sect 1.3 – Notational Conventions:

   http://tools.ietf.org/html/rfc4287#section-1.3

   “Some sections of this specification are illustrated with fragments of
   a non-normative RELAX NG Compact schema [RELAX-NG].  However, the
   text of this specification provides the definition of conformance.  A
   complete schema appears in Appendix B.”
 
It explicitly discusses extending Atom (see Sect. 6) and has this to say in Sect. 6.4 about extension elements:

   http://tools.ietf.org/html/rfc4287#section-6.4

  “Atom allows foreign markup anywhere in an Atom document, except where
   it is explicitly forbidden. ...”


== RSS

RSS has various strains – but these days only 1.0 and 2.0 are real players. We have always favoured 1.0 over 2.0 because of the chequered history of 2.0 which originally had no support for namespaces (though now there is recognition) and has no decent data model. By contrast 1.0 has a well known data model and the RSS 1.0 spec has this to say about schemas:

  http://web.resource.org/rss/1.0/

  “RSS has an RDF Schema <http://purl.org/rss/1.0/schema.rdf>  available. A copy is also embedded in this document.”

In other words, RSS 1.0 conforms to the RDF data model. A valid instance can be parsed by the W3C RDF Validator and can be output as triples. A valid instance is readily converted to other RDF serializations: RDF/N3, RDF/JSON, RDF/XML, etc. For instance, you could take one of our journal RSS feeds (<http://www.nature.com/nature/current_issue/rss>) and feed that to the RDF Validator (<http://www.w3.org/RDF/Validator/>).

(Try this short URL which does that for you: <http://bit.ly/HQ8Oj>. )

RDF is an open data model which allows additional properties (or foreign markup) to be added.

(RSS 2.0 has explicit support – now - for namespaces and thus admits of foreign markup. It does not have such a well known data model.)

==

So mapping SRU responses onto open XML carrier formats (such as RSS or ATOM) means that there is space for manoeuvering the data to make it more accessible.

Here’s what our ATOM feeds look like with ‘unpacked’ SRU record format – short URL here <http://bit.ly/1OhIty>.

See how we’ve just bubbled the SRU record data up to the item level. This is an OpenSearch friendly result while maintaining a complete SRU response.

<entry>
        <title>The geomicrobiology of gold</title>
        <link href="http://dx.doi.org/10.1038/ismej.2007.75"/>
        <id>doi:10.1038/ismej.2007.75</id>
        <updated>2009-09-19T09:02:07+00:00</updated>
        <content type="xhtml">...</content>

        <dc:identifier>doi:10.1038/ismej.2007.75</dc:identifier>
        <dc:title>The geomicrobiology of gold</dc:title>
        <dc:creator>Frank Reith</dc:creator>
        <dc:creator>Maggy F Lengke</dc:creator>
        <dc:creator>Donna Falconer</dc:creator>
        <dc:creator>David Craw</dc:creator>
        <dc:creator>Gordon Southam</dc:creator>
        <prism:publicationName>The ISME Journal</prism:publicationName>
        <prism:issn>1751-7362</prism:issn>
        <prism:eIssn>1751-7370</prism:eIssn>
        <prism:doi>10.1038/ismej.2007.75</prism:doi>
        <dc:publisher>Nature Publishing Group</dc:publisher>
        <prism:publicationDate>2007-09-20</prism:publicationDate>
        <prism:volume>1</prism:volume>
        <prism:number>7</prism:number>
        <prism:startingPage>567</prism:startingPage>
        <prism:url>http://dx.doi.org/10.1038/ismej.2007.75</prism:url>
        <prism:copyright>© 2007 International Society for Microbial Ecology</prism:copyright>
        <prism:alternateTitle>ismej</prism:alternateTitle>
 
        <sru:recordSchema>info:srw/schema/11/pam-v2.1</sru:recordSchema>
        <sru:recordPacking>unpacked</sru:recordPacking>
        <sru:recordData>
            <pam:message xmlns:pam="http://prismstandard.org/namespaces/pam/2.0/" xsi:schemaLocation="http://prismstandard.org/namespaces/pam/2.0/ http://www.prismstandard.org/schemas/pam/2.1/pam.xsd">
                <pam:article>
                    <xhtml:head xmlns:xhtml="http://www.w3.org/1999/xhtml"/>
                </pam:article>
            </pam:message>
        </sru:recordData>
        <sru:recordPosition>1</sru:recordPosition>
    </entry>
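
The relocation shown above can be sketched in a few lines of Python. This is a minimal sketch, not a description of how the feeds are actually generated: the function name `unpack_entry` is made up, the sample entry is abbreviated, and the `sru` namespace URI is a placeholder rather than the real SRU namespace.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pam": "http://prismstandard.org/namespaces/pam/2.0/",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "sru": "urn:example:sru",  # placeholder, NOT the real SRU namespace
}

# Abbreviated 'packed' entry: the properties sit inside sru:recordData.
PACKED = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dc="http://purl.org/dc/elements/1.1/"
       xmlns:pam="http://prismstandard.org/namespaces/pam/2.0/"
       xmlns:xhtml="http://www.w3.org/1999/xhtml"
       xmlns:sru="urn:example:sru">
  <title>The geomicrobiology of gold</title>
  <sru:recordSchema>info:srw/schema/11/pam-v2.1</sru:recordSchema>
  <sru:recordPacking>xml</sru:recordPacking>
  <sru:recordData>
    <pam:message><pam:article><xhtml:head>
      <dc:identifier>doi:10.1038/ismej.2007.75</dc:identifier>
      <dc:creator>Frank Reith</dc:creator>
      <dc:creator>Maggy F Lengke</dc:creator>
    </xhtml:head></pam:article></pam:message>
  </sru:recordData>
  <sru:recordPosition>1</sru:recordPosition>
</entry>
"""

HEAD_PATH = "sru:recordData/pam:message/pam:article/xhtml:head"

def unpack_entry(entry):
    """Move the record data properties up to entry level, in place."""
    head = entry.find(HEAD_PATH, NS)
    schema = entry.find("sru:recordSchema", NS)
    packing = entry.find("sru:recordPacking", NS)
    if head is None or schema is None or packing is None:
        return entry  # nothing to unpack
    # Insert the properties just ahead of the SRU elements.
    insert_at = list(entry).index(schema)
    for offset, prop in enumerate(list(head)):
        head.remove(prop)
        entry.insert(insert_at + offset, prop)
    head.text = None  # drop leftover whitespace in the emptied container
    packing.text = "unpacked"
    return entry

entry = unpack_entry(ET.fromstring(PACKED.strip()))
print(ET.tostring(entry, encoding="unicode"))
```

The properties never leave the entry element, and the emptied `pam:message` structure stays behind in `sru:recordData`, as in the example above.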

And likewise for our RSS feeds with ‘unpacked’ SRU record format – short URL here <http://bit.ly/15hFJn>.

<item rdf:about="http://dx.doi.org/10.1038/ismej.2007.75">
        <title>The geomicrobiology of gold</title>
        <link>http://dx.doi.org/10.1038/ismej.2007.75</link>
        <content:encoded><![CDATA[...]]></content:encoded>

        <dc:identifier>doi:10.1038/ismej.2007.75</dc:identifier>
        <dc:title>The geomicrobiology of gold</dc:title>
        <dc:creator>Frank Reith</dc:creator>
        <dc:creator>Maggy F Lengke</dc:creator>
        <dc:creator>Donna Falconer</dc:creator>
        <dc:creator>David Craw</dc:creator>
        <dc:creator>Gordon Southam</dc:creator>
        <prism:publicationName>The ISME Journal</prism:publicationName>
        <prism:issn>1751-7362</prism:issn>
        <prism:eIssn>1751-7370</prism:eIssn>
        <prism:doi>10.1038/ismej.2007.75</prism:doi>
        <dc:publisher>Nature Publishing Group</dc:publisher>
        <prism:publicationDate>2007-09-20</prism:publicationDate>
        <prism:volume>1</prism:volume>
        <prism:number>7</prism:number>
        <prism:startingPage>567</prism:startingPage>
        <prism:url>http://dx.doi.org/10.1038/ismej.2007.75</prism:url>
        <prism:copyright>© 2007 International Society for Microbial Ecology</prism:copyright>
        <prism:alternateTitle>ismej</prism:alternateTitle>

        <sru:recordSchema>info:srw/schema/11/pam-v2.1</sru:recordSchema>
        <sru:recordPacking>unpacked</sru:recordPacking>
        <sru:recordData>
            <pam:message xmlns:pam="http://prismstandard.org/namespaces/pam/2.0/" xsi:schemaLocation="http://prismstandard.org/namespaces/pam/2.0/ http://www.prismstandard.org/schemas/pam/2.1/pam.xsd">
                <pam:article>
                    <xhtml:head xmlns:xhtml="http://www.w3.org/1999/xhtml"/>
                </pam:article>
            </pam:message>
        </sru:recordData>
        <sru:recordPosition>1</sru:recordPosition>
    </item>

Hope that this begins to make more sense now. Note that we could just return OpenSearch feeds which had no SRU elements but as I have indicated before we do see merit in utilizing the SRU work and prefer to return our OpenSearch results as full SRU responses so that we can include other bits of the search/response equation: XML schema, diagnostics, etc.

Cheers,

Tony



On 18/9/09 20:44, "Ray Denenberg, Library of Congress" <rden@loc.gov> wrote:

I'm finally able to make some sense out of this, I think. The concrete example (that I am finally able to uncover from those pesky URLs) is a JSON file, and my questions are:

- Is this a meaningful issue only when the response format is other than XML?

- Or is it meaningful when the response format is XML but not the standard SRU response (e.g. ATOM)?

- This seems to assume that there is a canonical (or natural) mapping of the SRU Response format in its XML form to a JSON form. So "unpack" is a directive to use a variation of the form. First of all I'm not sure that the first assumption is valid (although I don't know enough about JSON to know for sure) but second, why not just define a concrete JSON schema that prescribes this alternative format?

- And if the answer to the second question is yes, then the same would apply to an XML format like ATOM.

 
I hope these questions make sense.

 
--Ray

----- Original Message -----
From: Hammond, Tony <t.hammond@nature.com>
To: Ray Denenberg, Library of Congress <rden@loc.gov>; search-ws@lists.oasis-open.org
Sent: Friday, September 18, 2009 12:45 PM
Subject: Re: [search-ws] sru:recordPacking = unpacked ?

Hi Ray:

Seems like we’re moving like ships in the night. Not quite sure where the difficulty lies. (Well I do. I guess it’s coming to terms with alternative formats for SRU disclosure.) The example you attached is for an SRU/XML response. An ‘unpacked’ record could have no other logical effect than to remove the data (or to ignore the request). I thought this was clear from the writeup:

    “For the standard SRU XML response, a value of 'unpacked' would have the effect of simply removing the data content from the 'recordData' field. For other open SRU host formats, a value of 'unpacked' would relocate the data from the 'recordData' field up to a higher level within the item level unit of the response.”

So the effect on the SRU/XML example you sent me would presumably be as below – entirely logical but not too useful since the data has vanished! The JSON example I sent in my mail shows packed/unpacked variations.

The real  reason for this parameter value setting is twofold: 1) to allow SRU responses  to be hosted by other carrier formats than SRU/XML, and 2) to allow the  data in an sru:recordData field to migrate to a more visible place within the  carrier format’s item container. Note the data is still managed within the  item (=record) container so there is no possibility of data wandering across  item container boundaries.
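
As a rough illustration of the second point, here is a minimal Python sketch that works on the JSON form of a response, one item at a time. The names `unpack_item` and `packed_item` are made up, the items are abbreviated, and the second DOI is a hypothetical value; the sketch also assumes every item carries the same `pam:message` nesting.

```python
import copy

HEAD_PATH = ("sru:recordData", "pam:message", "pam:article", "xhtml:head")

def unpack_item(item):
    """Return a copy of one item with its record data moved up to item level."""
    source = copy.deepcopy(item)  # leave the caller's item untouched
    head = source
    for key in HEAD_PATH:
        head = head[key]
    result = {}
    for key, value in source.items():
        if key == "sru:recordSchema":
            # Record properties go just ahead of the SRU bookkeeping keys.
            result.update(head)
        result[key] = value
    if "sru:recordSchema" not in source:
        result.update(head)
    head.clear()  # recordData keeps its shell but no longer holds the data
    result["sru:recordPacking"] = "unpacked"
    return result

def packed_item(title, doi, position):
    """Build an abbreviated 'packed' item (illustrative helper)."""
    return {
        "title": title,
        "sru:recordSchema": "info:srw/schema/11/pam-v2.1",
        "sru:recordPacking": "xml",
        "sru:recordData": {"pam:message": {"pam:article": {"xhtml:head": {
            "dc:identifier": doi,
            "dc:title": title,
        }}}},
        "sru:recordPosition": position,
    }

feed = {"entry": [
    packed_item("The role of junctional adhesion molecules in vascular inflammation",
                "doi:10.1038/nri2096", 1),
    packed_item("A second article", "doi:10.1000/example", 2),  # hypothetical DOI
]}

# Each item is unpacked on its own, so data cannot cross item boundaries.
unpacked_feed = {"entry": [unpack_item(item) for item in feed["entry"]]}
```

Because `unpack_item` only ever sees one item, the record data can move up within that item but cannot wander into a neighbouring one.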

The URLs I sent (unless broken by mail text  wrap) would show the effect in a live  environment.

Cheers,

Tony

==

<zs:searchRetrieveResponse>
<zs:version>1.1</zs:version>
<zs:numberOfRecords>2043</zs:numberOfRecords>
    <zs:records>
        <zs:record>
            <zs:recordSchema>info:srw/schema/1/dc-v1.1</zs:recordSchema>
            <zs:recordPacking>unpacked</zs:recordPacking>
            <zs:recordData>
                <srw_dc:dc  xsi:schemaLocation="info:srw/schema/1/dc-schema http://www.loc.gov/standards/sru/resources/dc-schema.xsd">
                </srw_dc:dc>
            </zs:recordData>
            <zs:recordPosition>2</zs:recordPosition>
        </zs:record>
        <zs:record>
            <zs:recordSchema>info:srw/schema/1/dc-v1.1</zs:recordSchema>
            <zs:recordPacking>unpacked</zs:recordPacking>
            <zs:recordData>
                <srw_dc:dc  xsi:schemaLocation="info:srw/schema/1/dc-schema http://www.loc.gov/standards/sru/resources/dc-schema.xsd">
                </srw_dc:dc>
            </zs:recordData>
            <zs:recordPosition>3</zs:recordPosition>
        </zs:record>
    </zs:records>
</zs:searchRetrieveResponse>

==

On 18/9/09 15:00, "Ray Denenberg, Library of Congress" <rden@loc.gov> wrote:

Thanks for this, Tony, but unfortunately it does not clear up my confusion. I am still left with the impression that this does not recognize the case where there is more than a single record in the response. What I would like to see is an example of the alternative form that you suggest as applied to a response that has, say, two records.

Please see the attached file representing an SRU response, and re-write it according to your proposed alternative format. Thanks. --Ray
 
 
 
 
----- Original Message -----
From: "Hammond, Tony" <t.hammond@nature.com>
To: "Ray Denenberg, Library of Congress" <rden@loc.gov>; <search-ws@lists.oasis-open.org>
Sent: Thursday, September 17, 2009 11:40 AM
Subject: Re: [search-ws] sru:recordPacking = unpacked ?

> Hi Ray:
>
> As requested a full writeup of recordPacking attached as a Word doc.
>
> Note that my proposal is merely a generalization of the existing
> recordPacking parameter. And I have in fact relented on my earlier view on
> 'string' and think that this is OK now and can be just regarded as a fairly
> loose data disposition.
>
> So, all I am doing is recognizing that SRU responses can be mapped to
> external host carrier formats and that the data does not need to be (and
> usually is best not) confined to a nested position in the response
> organization but can be made available immediately under the item level unit.
>
> The attached writeup has suitably opaque wording I think for a standards
> doc. :)
>
> Example URLs are listed below (on our dev-server) to illustrate this in
> practice: 1-3A are for RSS, 1-3B are for JSON (actually JSONP which is
> easier to read as of media type text/javascript). (You can anyway run new
> examples from the form page at the base URL.)
>
> Also for ready reference is an example hosted format (JSON) below which
> shows both packed and unpacked record data. (The unpacked record data
> bubbles up to the top of the item level container.)
>
> Cheers,
>
> Tony
>
>  
>
> == Example URLs ==
>
> 1A. RSS - default record packing (xml)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=application%2Frss%2Bxml
>
> 2A. RSS - explicit record packing (xml)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=application%2Frss%2Bxml&recordPacking=xml
>
> 3A. RSS - explicit record packing (unpacked)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=application%2Frss%2Bxml&recordPacking=unpacked
>
> 1B. JSON - default record packing (xml)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=text%2Fjavascript
>
> 2B. JSON - explicit record packing (xml)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=text%2Fjavascript&recordPacking=xml
>
> 3B. JSON - explicit record packing (unpacked)
>
> http://dev-www.nature.com/opensearch/request?query=vampire&httpAccept=text%2Fjavascript&recordPacking=unpacked
>
>
> == Example  Hosted Format (JSON) ==
>
> ** SRU Packed Data (JSON)
>  
> "entry": [
>             {
>                 "title":  "The role of junctional adhesion molecules in vascular inflammation",
>                 "link":  "http://dx.doi.org/10.1038/nri2096",
>                 "id":  "doi:10.1038/nri2096",
>                 "updated":  "2009-08-28T12:46:43+00:00",
>                 "content":  null,
>                 "sru:recordSchema":  "info:srw/schema/11/pam-v2.1",
>                 "sru:recordPacking":  "xml",
>                 "sru:recordData":  {
>                     "pam:message":  {
>                         "pam:article":  {
>                             "xhtml:head":  {
>                                 "dc:identifier":  "doi:10.1038/nri2096",
>                                 "dc:title":  "The role of junctional adhesion molecules in vascular inflammation",
>                                 "dc:creator":  [
>                                     "Christian Weber",
>                                     "Line Fraemohs",
>                                     "Elisabetta Dejana"
>                                 ],
>                                 ...  and prism properties ...
>                             }
>                         }
>                     }
>                 },
>                 "sru:recordPosition":  1
>             },
>             ...
>         ]
>
>
> ** SRU  Unpacked Data (JSON)
>
> "entry": [
>             {
>                 "title":  "The role of junctional adhesion molecules in vascular inflammation",
>                 "link":  "http://dx.doi.org/10.1038/nri2096",
>                 "id":  "doi:10.1038/nri2096",
>                 "updated":  "2009-08-28T12:45:42+00:00",
>                 "content":  null,
>                 "dc:identifier":  "doi:10.1038/nri2096",
>                 "dc:title":  "The role of junctional adhesion molecules in vascular inflammation",
>                 "dc:creator":  [
>                     "Christian Weber",
>                     "Line Fraemohs",
>                     "Elisabetta Dejana"
>                 ],
>                 ...  and prism properties ...
>                 "sru:recordSchema":  "info:srw/schema/11/pam-v2.1",
>                 "sru:recordPacking":  "unpacked",
>                 "sru:recordData":  {
>                     "pam:message":  {
>                         "pam:article":  {
>                             "xhtml:head":  {
>                             }
>                         }
>                     }
>                 },
>                 "sru:recordPosition":  1
>             },
>             ...
>         ]
>
> ==
>  
>
>
>
>
>
> On 15/9/09 19:15, "Ray Denenberg, Library of Congress" <rden@loc.gov> wrote:
>
>> Tony, could you please prepare a complete writeup of this issue, including
>> examples, that we can walk through at next Monday's call. I'm still having
>> trouble figuring it out. By Friday, if possible? Thanks. --Ray
>>
>>
>>
>
>  
>  ********************************************************************************    
> DISCLAIMER: This e-mail is confidential and should not  be used by anyone who is
> not the original intended recipient. If you  have received this e-mail in error
> please inform the sender and  delete it from your mailbox or any other storage
> mechanism. Neither  Macmillan Publishers Limited nor any of its agents accept
> liability  for any statements made which are clearly the sender's own and not
>  expressly made on behalf of Macmillan Publishers Limited or one of its  agents.
> Please note that neither Macmillan Publishers Limited nor  any of its agents
> accept any responsibility for viruses that may be  contained in this e-mail or
> its attachments and it is your  responsibility to scan the e-mail and
> attachments (if any). No  contracts may be concluded on behalf of Macmillan
> Publishers  Limited or its agents by means of e-mail communication. Macmillan
>  Publishers Limited Registered in England and Wales with registered number  785998
> Registered Office Brunel Road, Houndmills, Basingstoke RG21  6XS   
>  ********************************************************************************
>



