Subject: composition of AssertionID (Issue: DS-4-04: URIs for Assertion IDs)
From: Jeff Hodges <jhodges@oblix.com>
To: security-services@lists.oasis-open.org
Date: Tue, 05 Jun 2001 09:32:51 -0700
I took an action as noted in "focus group" minutes by Eve..
http://lists.oasis-open.org/archives/security-services/200105/msg00139.html

> NEW ACTION: Jeff to send out email about possible URI constraints and 
> identity definitions we should consider imposing in the case of SAML's 
> unique identifiers.

Below's what I came up with to kick off this discussion. It's organized like 
so..

  Background...
  Analysis...
  Thoughts...
  Further thoughts...
  Notes and References...

There are many subtleties to all of this, and I've undoubtedly overlooked 
some, made too much of others, or both. It's taken a while to find and wade 
thru (or skim) applicable docs, and then write this up -- much longer than I'd 
originally thought, so apologies for its tardiness. It's time to have other 
eyes'n'minds take a look at this.

JeffH
-----


Background...

From the focus group minutes [1]:
> >- URIsForAssertionIDs: What are the pros and cons?  What other
> >  methods are there?
> 
> DS-4-04: URIs for Assertion IDs: (still open after today)
> 
> Eve, with help from Dave, gave a short tutorial on the problems with
> URI  identity in XML namespace names.

There followed a brief discussion in which we touched upon various aspects of 
this problem space. We terminated the discussion upon issuing the above "new 
action". (the discussion as-documented in the aforementioned minutes is 
attached below for reference [1])

Further background, in the form of the specs for AssertionID and Issuer from 
draft-sstc-core-07 are excerpted at [2].

Relevant, recent discussion on security-services@lists.oasis-open.org...

Hal said in 
  http://lists.oasis-open.org/archives/security-services/200105/msg00146.html

> 5. In 1.3.1 I don't understand the intended purpose of AssertionID. 


PHB replied in
  http://lists.oasis-open.org/archives/security-services/200105/msg00159.html

> The AssertionID provides a unique reference for the assertion. ...

> Within SAML 1.0 the principle use of an AssertionID would be to allow
> one assertion to reference another (see previous Tim discussion) thus
> allowing statements of the form `this assertion was constructed from
> that assertion'.

> The principle use of the AssertionID however would be in systems built
> around SAML, they provide the basis for audit and accountability for
> example. If a system is built that allows for second order logic
> (assertions may be true or false and other assertions may make
> statements about validity (c.f. TASS meta-assertions)), then an
> assertionID is essential. 



Analysis...

The stated purpose of the AssertionID element is as an "assertion unique 
identifier" [2]. The stated syntax of this identifier is a URI [3]. Implicit 
in this line of thinking is a notion that URIs may be created (aka "minted") 
in a globally decentralized, non-colliding fashion due to the properties of 
the URI "space" [4].

The following is stated in [2] about AssertionID..

> The URI is used as a name for the assertion and not as a locator. It
> is only necessary to  ensure that no two assertions share the same
> identifier. Provision of a service to resolve  an identifier into an
> assertion is not a requirement. 


Also, as far as I can tell, [2] postulates (in section 1.3) that a requester 
need supply only an assertionID in a SAMLQuery in order to obtain an 
assertion. It does not make clear any distinction between newly minting an 
assertion and retrieving an already-existing one.

Thus it seems that there is a tacit assumption in [2] that an assertion may be 
uniquely identified and minted/retrieved using only an assertionID, regardless 
of the quote above.

So it seems that an assertionID is being asked to both..

  A. identify, globally and uniquely, assertions;
  B. provide at least a hint about where to direct requests for minting
     or retrieving assertions. 

..but again, this is to a fair degree inferred from a rough, incomplete, draft 
spec ([2]).

Additionally, there are many subtleties to using URIs as identifiers rather 
than straight-ahead resource locators. See the minutes of the "Future of URIs" 
Birds of a Feather session held at the 50th IETF meeting [11].




Thoughts...

It is an arguably good design principle to separate functions between various 
data items such that their roles in life are unambiguous.

[2] already has an "Issuer" assertion element. If identifying assertions is 
predicated on using the tuple "assertionID, Issuer", and some method for 
guaranteeing non-colliding Issuer names is used (e.g. DNS domain names, and 
things built upon them), then the assertionID can be quite simple, e.g. an 
integer (as is done in PKIX [10]).
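
A minimal sketch of the tuple idea, in Python. The AssertionKey class, its 
field names, and the example issuer names are illustrative assumptions, not 
anything defined in [2]. The point is only that the Issuer carries the global 
uniqueness, so the assertionID can be a plain per-issuer integer:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssertionKey:
    """Hypothetical composite identifier: (assertionID, Issuer)."""
    serial: int   # unique only within one issuer, like a PKIX serial number
    issuer: str   # assumed to be globally non-colliding, e.g. DNS-based


# Two issuers may hand out the same serial without colliding globally.
a = AssertionKey(serial=42, issuer="issuer-a.example.org")
b = AssertionKey(serial=42, issuer="issuer-b.example.org")

# Frozen dataclasses are hashable, so the pair can key a lookup table.
store = {a: "assertion from A", b: "assertion from B"}
```

Equality and hashing are over both fields, so neither field alone is ever 
treated as the identity of the assertion.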

In using the "assertionID, Issuer" tuple to identify assertions, and also 
provide guidance about where to go to make requests about or for them, the 
role of the Issuer element may arguably be (too) overloaded. E.g. if the 
overall SAML design calls for assertions to (perhaps optionally) specify 
within their structure where a receiver of an assertion may go to make queries 
about the assertion, then the requirements for persistence and 
location-independence for that particular identifier may conflict with the 
requirements of simply globally and uniquely (and perhaps persistently) 
identifying the Issuer security domain.

So it may be the case that to..

  case 1) globally and uniquely identify an assertion, one needs the
          combination of "assertionID, Issuer",

  case 2) uniquely identify assertions in the context of a given security
          domain, one needs only "assertionID" (it doesn't need to be
          disambiguated from assertions from other security domains; in
          this case the assertionID starts to look a lot like a serial
          number),

  case 3) cover either of the prior cases, and also specify where to go
          (and "how" to "go") to make requests to the security domain in
          question, one needs a further element. I.e...

  <assertionID>123123123123</assertionID>
  <!-- Issuer is perhaps optional -->
  <Issuer>some-issuer-identifier</Issuer>
  <!-- Source is optional -->
  <Source>saml://example.org/send-yer-SAML-based-requests-here</Source>

There are good arguments, tho, for not making Issuer optional (as case 2 would 
allow), so the overall set of identifying information might be structured 
something like this..

  <assertionID>
    <serialNumber>123123123123</serialNumber>
    <Issuer>some-issuer-identifier</Issuer>
  </assertionID>
  <!-- Source is optional -->
  <Source>saml://example.org/send-yer-SAML-based-requests-here</Source>
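
To make the structure a little more concrete, here is a rough Python sketch of 
how a receiver might pull the identifying pieces out of it. The wrapping 
<Assertion> element, the read_identity function, and the element layout are 
all assumptions for illustration; none of this is from draft-sstc-core-07.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element and layout, for illustration only.
SAMPLE = """
<Assertion>
  <assertionID>
    <serialNumber>123123123123</serialNumber>
    <Issuer>some-issuer-identifier</Issuer>
  </assertionID>
  <Source>saml://example.org/send-yer-SAML-based-requests-here</Source>
</Assertion>
"""


def read_identity(xml_text):
    """Return ((serial, issuer), source); source is None when absent."""
    root = ET.fromstring(xml_text.strip())
    serial = int(root.findtext("assertionID/serialNumber"))
    issuer = root.findtext("assertionID/Issuer").strip()
    source = root.findtext("Source")  # optional element
    return (serial, issuer), (source.strip() if source else None)


identity, source = read_identity(SAMPLE)
```

Here the (serial, issuer) pair does the identifying, and the optional Source 
value is only a hint about where to send requests.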



Further thoughts...

There are tons of subtle-but-important details in all of this that need to be 
considered in nailing down a design. Some of them are..

D1. if one uses a URL or URL-like flavor of URI as an identifier, we need to 
specify how comparisons between said identifier and other blobs of data are 
made. [3] details some of these subtleties in sections 1.5 and 2.1. The 
lowest-common-denominator option of specifying that such comparisons are made 
by performing a byte-by-byte octet string comparison will only technically 
work if certain restrictions are specified for the URI-based values. The SAML 
specs may need to consider/specify/incorporate one or more or all of..

  * charset restrictions for all or some SAML elements,
  * charset specifications, and bounds on said specifications, for SAML
    elements whose value syntaxes are URI [3],
  * charset(s) specified/allowed by underlying protocols and interaction 
    thereof with the prior items in this list,
  * [perhaps others/more]
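
As a small illustration of why the octet-comparison option needs such 
restrictions, consider the Python sketch below. The canonicalize function is a 
hypothetical "restriction profile" of my own (lowercase the scheme and 
authority, which [3] treats as case-insensitive), not something any SAML draft 
specifies:

```python
from urllib.parse import urlsplit, urlunsplit


def octets_equal(a, b):
    """Lowest-common-denominator test: compare the raw octets."""
    return a.encode("utf-8") == b.encode("utf-8")


def canonicalize(uri):
    """Hypothetical restriction profile: lowercase scheme and authority.

    Simplification: this also lowercases any userinfo in the authority,
    and it leaves the path alone, since paths may be case-sensitive.
    """
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, parts.query, parts.fragment))


a = "http://example.org/assertions/123"
b = "HTTP://Example.ORG/assertions/123"
c = "http://example.org/Assertions/123"

same_raw = octets_equal(a, b)                                   # False
same_canonical = octets_equal(canonicalize(a), canonicalize(b))  # True
same_despite_path_case = octets_equal(canonicalize(a), canonicalize(c))  # False
```

If the spec instead required issuers to emit identifiers already in such a 
restricted form, receivers could get away with the plain octet comparison.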

Of note is "Character Model for the World Wide Web 1.0" [14] which defines an 
algorithm called "String Identity matching" (in section 6), which has 
implications for the above. (it also has implications for SAML in general, see 
D6).
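
A quick Python sketch of the kind of problem normalization addresses. This is 
only a normalize-then-compare illustration using Unicode NFC, not the 
algorithm from [14] itself, and the nfc_equal helper is a made-up name:

```python
import unicodedata

composed = "caf\u00e9"      # 'e with acute' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent


def nfc_equal(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


raw_equal = composed == decomposed             # False: code points differ
normalized_equal = nfc_equal(composed, decomposed)  # True after NFC
```

Both strings render identically, yet a byte-by-byte comparison would treat 
them as two different identifiers.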

D1.1. See also [16] [17] for further musing about internationalization for URI 
and other identifiers.

D1.2. See also "Considerations for URI and FQDN Protocol Parameters" [18] for 
further musings about using DNS domain names and/or URI as identifiers in 
protocol elements.

D1.3. If URI are used as identifiers in protocol elements, software modules 
that handle them (this includes people as a boundary condition ;) may wonder 
just what the heck their semantics are, because their semantics can be so 
varied. "URI Relationship Discovery via RESCAP" [19] touches upon and 
enumerates these questions, and also sketches a protocol-based approach that 
specifies a service providing such info. Additionally, the more recent I-D, 
"URI Resolution using the Dynamic Delegation Discovery System" [20], also 
provides some relevant background info.

D1.4. Registration issues -- URI (nee URL) schemes should be registered, same 
with URN namespaces. See [9] for pointers to relevant RFCs on how to 
accomplish such registrations.


D2. some-issuer-identifier -- should this simply be a DNS 
fully-qualified-domain-name? Should it be a URN [6]? Should it be something 
else?
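
Whichever way D2 goes, receivers would presumably want to check what kind of 
value they were handed. A loose Python sketch follows; the classify_issuer 
function and both patterns are my own approximations (the URN pattern follows 
the "urn:" NID ":" NSS shape of [6] but does not check the NSS character set, 
and the FQDN pattern is a common heuristic, not a normative definition):

```python
import re

# "urn:" <NID> ":" <NSS>; the NSS check here is deliberately loose.
URN_RE = re.compile(r"^urn:[A-Za-z0-9][A-Za-z0-9-]{0,31}:\S+$", re.IGNORECASE)

# Two or more dot-separated labels ending in an alphabetic top-level label.
FQDN_RE = re.compile(
    r"^(?=.{1,253}$)([A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,}$"
)


def classify_issuer(value):
    """Return 'urn', 'fqdn', or 'other' for a candidate issuer identifier."""
    if URN_RE.match(value):
        return "urn"
    if FQDN_RE.match(value):
        return "fqdn"
    return "other"


kinds = [classify_issuer(v) for v in
         ("urn:example:issuer-42", "idp.example.org", "some issuer")]
```

The "example" namespace identifier and the host name above are placeholders.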

D3. use of URNs -- URNs have semantics of persistence and 
location-independence. Their use may or may not be appropriate in the context 
of SAML assertions depending upon the semantics of the thing they're being 
called upon to identify [6] [7]. E.g. it is questionable to use a URN to 
identify a given non-persistent, indeed likely ephemeral, artifact such as an 
instantiation of a SAML assertion. However, it is

D4. if URNs are used, what namespace identifiers are appropriate? Any? Only a 
selected one(s)? Formal or informal? [7] [12]

D5. the DOI work [13] is likely not appropriate for SAML's purposes due to 
that effort's Intellectual Property emphasis and also because of the implied 
(required?) dependency upon the Handle System. The latter is a nascent, 
intended-to-be-scalable-to-the-Internet, naming and name resolution system 
[13] (I haven't yet read the internet-drafts in detail).

D6. The emergent "Character Model for the World Wide Web 1.0" MAY have various 
implications for SAML's specification, beyond that noted in D1.

D7. IMHO, "tag:" URIs [15] are not appropriate for our problem space, given 
their present specification, but reading about them and the discussion thereof 
on the uri@w3.org list is educational.

D9. If an artifact is not persistent, then its identifier may be reused under 
certain conditions. Something to keep in mind and think about.


Notes and References...

[1] URIsForAssertionIDs discussion, from Focus subgroup concall, 22-May-2001:

http://lists.oasis-open.org/archives/security-services/200105/msg00139.html

>- URIsForAssertionIDs: What are the pros and cons?  What other methods
>    are there?

DS-4-04: URIs for Assertion IDs: (still open after today)

Eve, with help from Dave, gave a short tutorial on the problems with URI 
identity in XML namespace names.

Thomas: The DOI people are working on this general 
problem.  (http://www.doi.org, http://www.handle.net/)

Eve: It would be acceptable to use URIs if we apply constraints.  E.g., 
they should be absolute (or even should be absolute URNs) and we should 
define what equality means.  Dave: Solving the "whole URI problem" is way 
bigger than SAML's scope.

Jeff: There was recently an IETF BOF on the future of URIs, and W3C was 
investigating these issues, but nothing has really happened.

Eve: See W3C's Character Model spec for recommendations on normalization 
and internationalized URIs.  (http://www.w3.org/TR/charmod/)

Dave: Cautioned that we have to be concerned with real-world websites and 
their behavior, which is not precisely the same as the standards.  For 
example, http://www.jamcracker.com and http://www.jamcracker.com/index.html 
point to the same resource, but how can people know that?  BobB: Aliases, 
symbolic links, etc. are a problem if you have policies on different 
aliases that conflict.

Hal: We can take a hard line on URIs for assertion IDs, but for resources, 
we may have to deal with the vagaries of real-world URIs.

Evan: URIs are opaque strings, and XML makes data's structure more transparent.

Hal: There will probably be more cases than just AssertionID where 
identifiers will have properties of uniqueness (RequestID?) and are just 
"internal to SAML."  We should pull out the description of these properties 
into a separate section and have it referred to from the various sections.

Hal: We should register a new URI scheme, e.g. "saml:"  Thomas: We could 
just use URNs and have the same effect.  Jeff: It's pretty easy to register 
a new scheme with IANA.  (http://www.ietf.org/rfc/rfc2717.txt)   Eve: It's 
surprisingly hard to register a new URN namespace 
(http://www.ietf.org/rfc/rfc2611.txt)

NEW ACTION: Jeff to send out email about possible URI constraints and 
identity definitions we should consider imposing in the case of SAML's 
unique identifiers.



[2] from draft-sstc-core-07: 
http://www.oasis-open.org/committees/security/docs/draft-sstc-core-07.pdf

> 1.4.2 Element <AssertionID> 
> 
> Each assertion MUST specify exactly one unique assertion identifier.
> All identifiers are  encoded as a Uniform Resource Identifier (URI)
> and are specified in full (use of relative  identifiers is not
> permitted). 
> 
> The URI is used as a name for the assertion and not as a locator. It
> is only necessary to  ensure that no two assertions share the same
> identifier. Provision of a service to resolve  an identifier into an
> assertion is not a requirement. 
>
> The following schema defines the <AssertionID> element: 
> 
> <element name="AssertionID" type="string"/> 
> 
> 
> 1.4.3 Element <Issuer> 
> 
> The Issuer element specifies the issuer of the assertion by means of a
> URI. It is defined  by the following XML schema: 
> 
> The following schema defines the <Issuer> element: 
> 
> <element name="Issuer" type="string"/> 



[3] Uniform Resource Identifiers (URI): Generic Syntax
http://www.ietf.org/rfc/rfc2396.txt



[4] URIs encompass both URLs and URNs. The former [5] often (but not always) 
depend upon the Domain Name System (DNS) namespace, which enables the 
capability to mint globally unique URLs in a decentralized fashion. The latter 
[6] define a hierarchical namespace that is DNS-independent but centrally 
mediated [7] in order to provide "location independent identification of a 
resource, as well as longevity of reference".

This picture is from [8]...
         _______________________________________________________
        |         ________________                              |
        |        |  ftp:          |                             |
        |        |  gopher:       |                             |
        |        |  http:       __|____________                 |
        |        |  etc        |  |  urn:      |                |
        |        |_____________|__|            |                |
        |                URLs  |               |                |
        |                      |_______________|                |
        |                             URNs                      |
        |_______________________________________________________|
                               URIs

URIs, URLs, and URNs are described by a plethora of documents. An attempt to 
tie them all together is given in [9].




[5] Uniform Resource Locators (URL)
http://www.ietf.org/rfc/rfc1738.txt


[6] URN Syntax
http://www.ietf.org/rfc/rfc2141.txt


[7] URN Namespace Definition Mechanisms
http://www.ietf.org/rfc/rfc2611.txt


[8] Naming and Addressing: URIs, URLs, ...
http://www.w3.org/Addressing/


[9] Uniform Resource Identifiers: Comprehensive Standard
http://www.ietf.org/internet-drafts/draft-daigle-uri-std-01.txt


[10] PKIX Certificate and CRL Profile
http://www.ietf.org/rfc/rfc2459.txt


[11] Future of Uniform Resource Identifiers BOF (furi) 
[50th IETF, Minneapolis MN, Mar-2001]
http://www.ietf.org/proceedings/01mar/ietf50-39.htm#TopOfPage


[12] URI.NET -- a clearing house for information on URIs in general and on 
specific URI schemes and software
http://www.uri.net/


[13] Digital Object Identifiers, The Handle System
http://www.doi.org, http://www.handle.net/


[14] Character Model for the World Wide Web 1.0
http://www.w3.org/TR/charmod/


[15] "Tag" URI Scheme
http://www.taguri.org/
see also the thread on uri list "Proposal: 'tag' URIs", from  Tim Kindberg 
<timothy@hpl.hp.com>...
  http://lists.w3.org/Archives/Public/uri/2001Apr/0013.html

http://www.taguri.org/2001-04-26/draft-kindberg-tag-uri-00.txt


[16] Internationalization: URIs and other identifiers
http://www.w3.org/International/O-URL-and-ident.html


[17] Internationalized Resource Identifiers (IRI)
http://www.ietf.org/internet-drafts/draft-masinter-url-i18n-07.txt             

                     
[18] Considerations for URI and FQDN Protocol Parameters
http://www.ietf.org/internet-drafts/draft-eastlake-uri-fqdn-param-00.txt


[19] URI Relationship Discovery via RESCAP
http://www.ietf.org/internet-drafts/draft-mealling-uri-rdf-00.txt 


[20] URI Resolution using the Dynamic Delegation Discovery System
http://www.ietf.org/internet-drafts/draft-ietf-urn-uri-res-ddds-03.txt



---
end