OASIS Mailing List Archives

xri message

Subject: Pre-draft of "HTTP-based Resource Descriptor Discovery"


Please do not share this outside the TC until it is published officially.

Attached is my working copy of the 'HTTP-based Resource Descriptor Discovery' proposal. I still need to clean up the last third, but please do provide feedback both editorial and content. Editorial changes at this point should be sent directly to me. Content should be discussed on the list.

I plan to spend most of Thursday cleaning it up and pushing it out as an IETF I-D. I will then move back to working on the XRD 1.0 schema document.

There is nothing new in this document except for additional details about HTTP response codes and such. Also, it uses the current published /site-meta schema and not the new text-based format. This will be changed as soon as there is something to reference.

EHL
Title: HTTP-based Resource Descriptor Discovery
Network Working Group                                    E. Hammer-Lahav
Internet-Draft                                           January 8, 2009
Intended status: Informational
Expires: July 12, 2009


HTTP-based Resource Descriptor Discovery
draft-hammer-discovery-00

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on July 12, 2009.

Abstract

This memo describes a method for locating information about a resource identified by a URI. The 'information about a resource' - a descriptor - typically provides machine-readable information that aims to assist and enhance the interaction with the resource. This memo only defines the method for locating the descriptor, but leaves the descriptor format out of scope.



Table of Contents

1.  Introduction
2.  Notational Conventions
3.  Resource Discovery
4.  Discovery Workflow
5.  Resource-Descriptor Link Relationship
6.  Method Selection
7.  Descriptor Location
    7.1.  <LINK> Element
    7.2.  Link Header
    7.3.  Site-meta Document
8.  Descriptor Retrieval
9.  Caching
10.  Security Considerations
11.  IANA Considerations
Appendix A.  Acknowledgments
12.  References
    12.1.  Normative References
    12.2.  Informative References
§  Author's Address
§  Intellectual Property and Copyright Statements





1.  Introduction

This memo aims to provide a uniform and easily implementable method for locating resource descriptors. With the development of interoperability specifications comes the need to enable compliant services and resources to declare their conformance to these specifications. There is a growing need to describe resources in a way that does not depend on their internal structure, or even the availability of an HTTP-accessible representation of these resources.

While the method described in this memo utilizes the HTTP protocol for locating descriptors, it can be used with any URI scheme and is not limited to just the 'http' and 'https' URI schemes. HTTP is an ideal framework for performing discovery activities on web resources, but it does not clearly define a mechanism for attaching a descriptor or metadata to a resource identified with a URI.

The scope of this memo is intentionally restricted to locating resource descriptors, leaving out their format. Given the wide range of use cases and information that can be provided 'about a resource', no single descriptor format can adequately accommodate all needs. However, the method in which the desired descriptor is located should be consistent across use cases and formats.




2.  Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.).




3.  Resource Discovery

Resource discovery provides a method for obtaining information about a resource identified with a URI. It allows resource-providers to describe their resources in a machine-readable format, enabling automatic interoperability by user-agents and resource-consuming applications. Discovery enables applications to utilize a wide range of web services and resources across multiple providers without the need to know about their capabilities in advance, reducing the need for manual configuration and resource-specific software.

When discussing discovery, it is important to differentiate between resource-discovery and service-discovery. Both types attempt to associate capabilities with resources, but they approach it from opposite ends.

Service-discovery centers around identifying the location of qualified resources, typically finding an endpoint capable of certain protocols and capabilities. In contrast, resource-discovery begins with a resource, trying to find which capabilities it supports.

A simple way to distinguish between the two types of discovery is to define the questions they are each trying to answer:

Resource-Discovery:
Given a resource, what are its capabilities, characteristics, and relationships to other resources?
Service-Discovery:
Given a set of attributes, which available resources match the desired set and what is their location?

While this memo deals exclusively with resource-discovery, it is important to note that the two discovery types are closely related and are usually used in tandem. In fact, a typical use case will switch between service-discovery and resource-discovery multiple times in a single workflow.

The reason is that resource descriptors usually contain not only a list of capabilities, but also relationships to other resources. Since those relationships are usually typed, the process in which an application chooses which links to use is in fact service-discovery.

Applications use resource-discovery to obtain the list of links, and service-discovery to choose the relevant links. In another common example, the application uses service-discovery to find a resource with a given capability, then uses resource-discovery to find out what other capabilities it supports.
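
The interplay between the two discovery types can be illustrated with a minimal sketch. The link list, the relationship type names ('example-auth', 'example-feed'), and the function name below are hypothetical and are not defined by this memo; the list stands in for the typed links an application might obtain from a descriptor through resource-discovery.

```python
# Hypothetical outcome of resource-discovery: the typed links listed in a
# descriptor. The relationship names are invented for illustration only.
descriptor_links = [
    {"rel": "example-auth", "href": "http://example.com/auth"},
    {"rel": "example-feed", "href": "http://example.com/feed"},
]


def choose_links(links, wanted_rel):
    """Service-discovery step: keep only links of the wanted relationship type."""
    return [link["href"] for link in links if wanted_rel in link["rel"].split()]


# The application picks the link matching the capability it is looking for.
auth_endpoints = choose_links(descriptor_links, "example-auth")
```

Here obtaining descriptor_links corresponds to resource-discovery, and the call to choose_links corresponds to the service-discovery step described above.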




4.  Discovery Workflow

Discovery can be performed before or after a resource is obtained. Performing discovery ahead of requesting the resource allows a resource-consumer to learn more about the properties of the resource. For example, a consumer can learn about the protocols supported by the resource and if understood, utilize them to interact with it. In many cases, discovery is performed after the resource has been obtained, based on the content of the resource and the way in which the user-agent interacts with it (or based on human interactions).

Most web applications today make strong assumptions about the resources they interact with, mostly due to lack of a standard discovery protocol for web resources. Such assumptions are not likely to disappear even with the introduction of a discovery workflow. In many cases, resource-discovery will be used as a secondary step for enhancing the interaction with a resource rather than the first step of determining how to interact at all.

While this memo is limited to identifying the location of resource descriptors, it is useful to put it in the context of the complete discovery workflow:




5.  Resource-Descriptor Link Relationship

The first step when performing discovery is to identify the location of the resource descriptor document for the desired resource. This can be simply described as a link between the URI of the resource and the URI of the descriptor. Links are one of the most fundamental building blocks of the web, and provide all that is necessary to define the relationship between a resource and its descriptors.

The purpose of this memo is to define and limit the methods through which this link information is obtained when performing resource-discovery. The web provides a large number of methods for defining links between resources, but in order to achieve interoperability, the selection has to be narrowed down to a much smaller set of options.

Since a single resource can have many descriptors, the descriptor link has a one-to-many structure. In the case of multiple descriptors, selecting which descriptor to use is application-specific using factors such as the descriptor document format, accessibility, and other typed relationships, and is beyond the scope of this memo.

All the methods described in this memo build directly on the typed-relationships framework defined in [I‑D.nottingham‑http‑link‑header] (Nottingham, M., “Link Relations and HTTP Header Linking,” November 2008.). The relationship type between a resource and its descriptor used for discovery is 'describedby' which was first defined by [I‑D.nottingham‑http‑link‑header] (Nottingham, M., “Link Relations and HTTP Header Linking,” November 2008.) [[NOTE: the link type 'describedby' is currently pending an IANA registration as a generic descriptor relationship]].

For example, the following HTTP response header returned with the HTTP representation of the resource http://example.com/resource/1:

  Link: <http://example.com/resource/1;descriptor>;
          rel="describedby"; type="application/xrd+xml"

defines a link between the resource and its descriptor located at http://example.com/resource/1;descriptor and is hinted to be using the XRD [XRD] (Wachob, G., Reed, D., Chasen, L., Tan, W., and S. Churchill, “Extensible Resource Identifier (XRI) Resolution V2.0,” .) document format.

The methods described in this memo all result in one or more link relationships with type 'describedby'. Two out of the three methods use existing link mechanisms as-is, by simply specifying the relationship type used. The third defines a new mechanism for dynamically constructing links using templates.




6.  Method Selection

Due to the wide range of use cases requiring resource descriptors, and the desire to reuse existing mechanisms as much as possible, no single solution has been found to sufficiently cover the requirements for linking the resource URI to the descriptor URI. A somewhat complete analysis of the potential methods considered and the reasons for their inclusion or rejection can be found in [Discovery and HTTP] (Hammer-Lahav, E., “Discovery and HTTP,” .).

Obtaining the link information between the resource URI and the descriptor URI is accomplished using one of three methods. The criteria used to determine which methods a resource-provider SHOULD support and resource-consumer SHOULD attempt to use are based on a combination of factors:

The methods are listed in order of their applicability specialization, from the most restrictive method to the catch-all method. However, this order does not imply the order in which multiple applicable methods are attempted. The methods are:

Because different methods are more appropriate in different circumstances, all three methods described are considered equal and can be attempted in any order. To ensure interoperability, the following rules MUST be observed:




7.  Descriptor Location

The link relationship (with type 'describedby') used to identify the location of the descriptor document SHALL be obtained using one of the three methods: <LINK> element, HTTP Link header, or the Site-meta document.




7.1.  <LINK> Element

Resources with an HTML [W3C.REC‑html401‑19991224] (Jacobs, I., Hors, A., and D. Raggett, “HTML 4.01 Specification,” December 1999.) or an Atom [RFC4287] (Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format,” December 2005.) representation MAY include a <LINK> element with the 'describedby' relationship type to link between the resource and its descriptor. A resource-consumer trying to obtain the location of the resource's descriptor MUST look for a <LINK> element with a 'rel' attribute value containing the 'describedby' relationship (a multiple relationship 'rel' attribute value is allowed and MUST be handled by the consumer). For example:

  <LINK href="http://example.com/descriptor"
    rel="describedby" type="application/powder+xml">

If the resource representation is obtained using HTTP, the resource-consumer MUST only use <LINK> elements if the HTTP response containing the representation carries a valid HTTP 200 response code. If any other response code is returned, the resource-consumer MUST continue as if no <LINK> elements were found.
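
A resource-consumer could apply these rules to an HTML representation roughly as follows. This is a minimal sketch using Python's standard html.parser module; the class and function names are illustrative assumptions and are not part of this memo, and Atom representations would need an XML parser instead.

```python
from html.parser import HTMLParser


class DescribedByLinkParser(HTMLParser):
    """Collects <link> elements whose 'rel' value includes 'describedby'."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, type) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK> matches too.
        if tag != "link":
            return
        attributes = dict(attrs)
        # 'rel' may carry several space-separated relationship types.
        rels = (attributes.get("rel") or "").lower().split()
        if "describedby" in rels and attributes.get("href"):
            self.links.append((attributes["href"], attributes.get("type")))


def find_link_elements(status_code, html_text):
    """Return describedby links, or [] unless the HTTP response code is 200."""
    if status_code != 200:
        # Any other response code: continue as if no <LINK> elements were found.
        return []
    parser = DescribedByLinkParser()
    parser.feed(html_text)
    return parser.links
```

Calling find_link_elements(200, page) on a page containing the example element above would yield the pair of 'http://example.com/descriptor' and 'application/powder+xml'.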




7.2.  Link Header

Resources with an accessible HTTP representation MAY include a Link header in the HTTP response header as defined by [I‑D.nottingham‑http‑link‑header] (Nottingham, M., “Link Relations and HTTP Header Linking,” November 2008.) with a 'rel' parameter value set to 'describedby'. A resource-consumer trying to obtain the location of the resource's descriptor MUST look for a Link header with a 'rel' parameter value containing the 'describedby' relationship (a multiple relationship 'rel' parameter value is allowed and MUST be handled by the consumer). For example:

  Link: <http://example.com/descriptor>; rel="describedby";
          type="application/xrd+xml"

HTTP Link headers MUST only be used for the purpose of resource-discovery if the HTTP response containing the header was returned as a result of an HTTP GET or HEAD request, and carries one of the following HTTP response codes: 200, 303, and 401. 'describedby' Link headers served with any other HTTP response code MUST be ignored and the resource-consumer MUST continue as if no valid Link header was found.
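
The request-method and response-code conditions above can be sketched as a small filter placed in front of a Link header parser. The regular expressions below are a simplified approximation of the Link header syntax, not the full grammar from [I‑D.nottingham‑http‑link‑header], and the function name is an illustrative assumption.

```python
import re

# Simplified patterns: <target> followed by ;name=value parameters.
LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^",;\s]+))*)'
)
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^",;\s]+))')

ALLOWED_METHODS = {"GET", "HEAD"}
ALLOWED_STATUS = {200, 303, 401}


def describedby_from_link_header(method, status_code, link_header):
    """Return (target URI, type hint) pairs usable for resource-discovery."""
    if method.upper() not in ALLOWED_METHODS or status_code not in ALLOWED_STATUS:
        # Continue as if no valid Link header was found.
        return []
    results = []
    for target, raw_params in LINK_VALUE.findall(link_header):
        params = {}
        for name, quoted, token in PARAM.findall(raw_params):
            # Keep the first occurrence of each parameter name.
            params.setdefault(name.lower(), quoted or token)
        # 'rel' may carry several space-separated relationship types.
        if "describedby" in params.get("rel", "").lower().split():
            results.append((target, params.get("type")))
    return results
```

With the example header above, a GET request answered with a 200 response yields the target 'http://example.com/descriptor' with the type hint 'application/xrd+xml', while the same header on a 404 response yields an empty list.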

If the HTTP response code is 303, any descriptor link is defined to be between the requested resource and the descriptor, and not between the 'See Other' resource indicated by the Location header and the descriptor. If the response code is 401, any descriptor location MUST only be used in association with obtaining access to the resource. Once access is obtained, the resource must be queried again for its descriptor location, which MAY be different from the one provided in the unauthorized response.

When attempting to obtain the descriptor location, a resource-consumer MUST follow HTTP redirects 301 and 302 and consider the Link headers of the resource identified by the redirection as authoritative for the resource requested originally.




7.3.  Site-meta Document

Resources identified with a URI which contains a DNS-resolvable authority component MAY use the Site-meta document to provide a templatized map from the resource URI to the descriptor URI.

Site-meta defines a method for locating site-wide metadata for Web sites. Its primary objective is to avoid the need for further known-location solutions by creating one last such resource which can point to other resources. It can be considered a registry for "known-location" resources that avoids further intrusion into the site's naming authority.

In the context of resource-discovery, Site-meta offers a convenient location for storing information about how to map resource URIs to their descriptor URIs. Site-meta provides a method for obtaining descriptor locations that does not depend on the availability of an HTTP representation (or 'see-other' information) for the resource. It can also, with an additional step, provide descriptor locations for URIs with schemes other than 'http' and 'https'.

The elements defined by Site-meta are meant to contain site-wide information. Unlike Link headers included in the HTTP response to the domain root resource (obtained via 'GET / HTTP/1.1' or 'HEAD / HTTP/1.1') which are specific to the root resource, links in Site-meta are between the abstract 'web site' entity and the linked resources. It is critical not to confuse the root resource of a domain authority with the abstract 'web site' entity described by Site-meta.

For this reason, any <meta> elements containing linked resources with relationship type 'describedby' identify the location of the description of the abstract 'web site' entity, which by itself cannot be described using a URI. While this is a valid application of the 'describedby' relationship type, it is beyond the scope of this memo.

In order to provide descriptor location to individual resources, this memo defines an extension to the Site-meta schema for describing link templates. Using a template, a resource URI can be deconstructed and then reconstructed to form the URI of the descriptor location. For example, the following Site-meta document defines a template in which the resource URI is converted to the descriptor URI by appending ";about" to the URI:

  <metadata>
    <link-template template="{uri};about"
                      rel="describedby"
                      type="application/xrd+xml"
                      scheme="mailto http" />
  </metadata>

The definition of the <link-template> element is identical to the definition of the Site-meta <meta> element, with the following exceptions: it cannot contain any value or child elements, the 'href' attribute is replaced by the 'template' attribute, and a 'scheme' attribute is added. The rest of the element attributes are identical and carry the same semantic meaning, but apply between the individual resource used as input to the template and the resulting descriptor URI, and do not relate to the abstract 'web site' entity.

The 'scheme' attribute serves as a filter indicating which URI schemes are meant to be transformed using the provided template. This optional attribute is meant to allow different handling of different URI schemes. The attribute value is a space-separated list of lowercase scheme names.
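
The filtering role of the 'scheme' attribute can be sketched as follows. The function name is illustrative, and the treatment of a missing attribute as "applies to every scheme" is an assumption of this sketch; the memo only states that the attribute is optional.

```python
from urllib.parse import urlsplit


def template_applies(scheme_attribute, resource_uri):
    """True if a <link-template> with this 'scheme' value covers the URI."""
    if scheme_attribute is None:
        # Assumption: an absent 'scheme' attribute means no filtering.
        return True
    # urlsplit() returns the scheme in lowercase, matching the attribute format.
    resource_scheme = urlsplit(resource_uri).scheme
    return resource_scheme in scheme_attribute.split()
```

For the attribute value "mailto http", this sketch accepts 'mailto:joe@example.com' and 'http://example.com/' but rejects 'https://example.com/'.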

The 'template' attribute defines a URI template with a very simple syntax. The attribute value is used to construct a valid URI by substituting the variable enclosed in {} with the value of the variable. In the example above, the 'uri' variable is replaced with the actual resource URI (the resource URI "http://example.com" replaces the "{uri}" string which results in "http://example.com;about"). If the variable name is prefixed by a '%' character, any character other than the unreserved characters in the variable value MUST be percent-encoded per [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.).

  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

For example, the following template when used with the resource URI "http://example.com":

  <metadata>
    <link-template template="http://example.com?describe={%uri}"
                      rel="describedby"
                      type="application/xrd+xml" />
  </metadata>

produces the descriptor URI: "http://example.com?describe=http%3A%2F%2Fexample.com".
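
The substitution rules for the single 'uri' variable can be sketched in a few lines. The function name is an illustrative assumption; the sketch relies on urllib.parse.quote with an empty 'safe' set, which leaves only the unreserved characters unencoded and, as an additional assumption, encodes non-ASCII characters as UTF-8.

```python
import re
from urllib.parse import quote


def expand_template(template, resource_uri):
    """Expand {uri} and {%uri} in a link-template 'template' value."""

    def substitute(match):
        if match.group(1):
            # '%' prefix: percent-encode everything except unreserved characters.
            return quote(resource_uri, safe="")
        return resource_uri

    # A single pass, so substituted text is never scanned again for variables.
    return re.sub(r"\{(%?)uri\}", substitute, template)


plain = expand_template("{uri};about", "http://example.com")
encoded = expand_template("http://example.com?describe={%uri}", "http://example.com")
```

Here 'plain' and 'encoded' correspond to the two descriptor URIs given in the examples above.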

[[This initial draft only defines a single 'uri' variable. However, it is expected that future revisions will define a larger template vocabulary which will be based on the URI structure definition and include: uri, scheme, authority, domain, port, path, query, fragment, and username (for 'mailto' URIs).]]

[[Site-meta is pending a major revision of its document format which is likely to replace its XML structure with a simpler text-based structure. This memo will be revised as soon as the new Site-meta draft is published to reflect these changes. However, the changes are expected to only change the syntax, not its meaning.]]




8.  Descriptor Retrieval

Once the desired descriptor URI has been obtained, the descriptor document is obtained via an HTTP GET request to the identified URI. The resource-consumer MUST obey all HTTP 301 and 302 redirects and the descriptor document is considered valid only if contained within an HTTP response with the HTTP 200 response code.
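
The retrieval rule can be sketched as a short loop. The 'fetch' callable is hypothetical: it is assumed to perform a single HTTP GET without following redirects and to return a (status, headers, body) tuple. The redirect limit is also an assumption of this sketch, since the memo does not define one.

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # assumption: the memo sets no limit; this guards against loops


def retrieve_descriptor(descriptor_uri, fetch):
    """Return the descriptor body, or None if no HTTP 200 response is reached.

    `fetch` is a hypothetical callable: fetch(uri) -> (status, headers, body).
    """
    uri = descriptor_uri
    for _ in range(MAX_REDIRECTS + 1):
        status, headers, body = fetch(uri)
        if status == 200:
            return body
        if status in (301, 302) and "Location" in headers:
            # Location may be relative; resolve it against the current URI.
            uri = urljoin(uri, headers["Location"])
            continue
        # Any other response code: the descriptor is not considered valid.
        return None
    return None
```

Passing the HTTP layer in as a parameter keeps the sketch independent of any particular client library and lets the redirect handling be exercised with canned responses.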




9.  Caching

Resource-consumers MUST obey all HTTP caching headers and directives and discard any cached descriptor location as defined by the resource-provider. The ability to cache descriptor locations was a key requirement in selecting which methods to include in the resource-discovery workflow. It is critical that such information is cached as defined by HTTP.
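
As a rough illustration only, a resource-consumer might derive how long a cached descriptor location can be reused from the Cache-Control response header. The sketch below handles just the 'max-age', 'no-store', and 'no-cache' directives and is deliberately conservative; it is not a complete implementation of HTTP caching as defined in [RFC2616] (for example, it ignores the Expires header and revalidation). The function name is an illustrative assumption.

```python
import re


def cache_lifetime_seconds(cache_control):
    """Seconds a descriptor location may be reused; 0 means do not reuse it."""
    directives = [d.strip().lower() for d in (cache_control or "").split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    # Assumption: without explicit freshness information, do not reuse.
    return 0
```

A consumer would record the time the location was obtained together with this lifetime, and discard the cached location once the lifetime has elapsed.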




10.  Security Considerations

[[Add security consideration with regard to performing discovery, and address any security issues if future revisions will include a method for trusted discovery (signed).]]




11.  IANA Considerations

This memo includes no request to IANA. The relationship type 'describedby' used by this memo is pending approval by the IANA and must be fully registered before this memo can become final. If for any reason the 'describedby' relationship type fails to register with the IANA, it is expected that this memo will define a new relationship type.




Appendix A.  Acknowledgments

[[Credit to XRDS-Simple, XRI, Yadis, Site-meta, etc contributors to be added before official draft publication.]]




12.  References




12.1. Normative References

[I-D.nottingham-http-link-header] Nottingham, M., “Link Relations and HTTP Header Linking,” draft-nottingham-http-link-header-03 (work in progress), November 2008.
[I-D.nottingham-site-meta] Nottingham, M. and E. Hammer-Lahav, “draft-nottingham-site-meta-00,” draft-nottingham-site-meta-00 (work in progress), October 2008.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” STD 66, RFC 3986, January 2005.
[RFC4287] Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format,” RFC 4287, December 2005.
[W3C.REC-html401-19991224] Jacobs, I., Hors, A., and D. Raggett, “HTML 4.01 Specification,” World Wide Web Consortium Recommendation REC-html401-19991224, December 1999.



12.2. Informative References

[Discovery and HTTP] Hammer-Lahav, E., “Discovery and HTTP.”
[POWDER] Archer, P., Ed., Smith, K., Ed., and A. Perego, Ed., “POWDER: Protocol for Web Description Resources.”
[XRD] Wachob, G., Reed, D., Chasen, L., Tan, W., and S. Churchill, “Extensible Resource Identifier (XRI) Resolution V2.0.”



Author's Address

  Eran Hammer-Lahav
Email:  eran@hueniverse.com
URI:  http://hueniverse.com



Full Copyright Statement

Intellectual Property


