Subject: Re: [office-metadata] Atom and document/feed IRIs
Bruce,

Interesting. I remember in the newsgroup days that all messages had unique IDs (well, sort of, they were actually recycled after several years) which is how they did threading.

I am definitely leaning towards the IRI approach, mostly because it makes the subject stable.

Hope you are having a great day!

Patrick

Bruce D'Arcus wrote:
> For comparison, here's how Atom handles feed or document identification:
>
> First, an informal primer of sorts:
>
> <http://diveintomark.org/archives/2004/05/28/howto-atom-id>
>
> Second, the spec:
>
>> The "atom:id" element conveys a permanent, universally unique
>> identifier for an entry or feed.
>>
>> atomId = element atom:id {
>>    atomCommonAttributes,
>>    (atomUri)
>> }
>>
>> Its content MUST be an IRI, as defined by [RFC3987]. Note that the
>> definition of "IRI" excludes relative references. Though the IRI
>> might use a dereferencable scheme, Atom Processors MUST NOT assume it
>> can be dereferenced.
>>
>> Nottingham & Sayre          Standards Track          [Page 19]
>>
>> RFC 4287                    Atom Format              December 2005
>>
>> When an Atom Document is relocated, migrated, syndicated,
>> republished, exported, or imported, the content of its atom:id
>> element MUST NOT change. Put another way, an atom:id element
>> pertains to all instantiations of a particular Atom entry or feed;
>> revisions retain the same content in their atom:id elements. It is
>> suggested that the atom:id element be stored along with the
>> associated resource.
>>
>> The content of an atom:id element MUST be created in a way that
>> assures uniqueness.
>>
>> Because of the risk of confusion between IRIs that would be
>> equivalent if they were mapped to URIs and dereferenced, the
>> following normalization strategy SHOULD be applied when generating
>> atom:id elements:
>>
>> o Provide the scheme in lowercase characters.
>> o Provide the host, if any, in lowercase characters.
>> o Only perform percent-encoding where it is essential.
>> o Use uppercase A through F characters when percent-encoding.
>> o Prevent dot-segments from appearing in paths.
>> o For schemes that define a default authority, use an empty
>>   authority if the default is desired.
>> o For schemes that define an empty path to be equivalent to a path
>>   of "/", use "/".
>> o For schemes that define a port, use an empty port if the default
>>   is desired.
>> o Preserve empty fragment identifiers and queries.
>> o Ensure that all components of the IRI are appropriately character
>>   normalized, e.g., by using NFC or NFKC.
>>
>

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!
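
For readers following the thread, the normalization strategy quoted from the Atom spec in the message above could be sketched roughly as follows. This is a minimal illustration, not code from the spec or from anyone on this list: the function name `normalize_atom_id`, the helper names, and the choice to treat only `http` and `https` as schemes with a default port and a "/"-equivalent empty path are all assumptions made for the example. The default-authority rule is not covered.

```python
import re
import unicodedata

# Component-splitting pattern adapted from RFC 3986, Appendix B.
# Unmatched optional groups come back as None, which lets us tell
# "no query" apart from "empty query" (and likewise for fragments).
_IRI_RE = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$",
    re.DOTALL,
)

# Assumption for this sketch: only these schemes get default-port
# and empty-path handling.
_DEFAULT_PORTS = {"http": "80", "https": "443"}

_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _fix_percent_encoding(text):
    """Decode escapes of unreserved characters; uppercase the hex of the rest."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in _UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def _remove_dot_segments(path):
    """Resolve "." and ".." in an absolute path (simplified RFC 3986, 5.2.4)."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        last = i == len(segments) - 1
        if seg == ".":
            if last:
                out.append("")
        elif seg == "..":
            if len(out) > 1:  # never pop the leading empty segment
                out.pop()
            if last:
                out.append("")
        else:
            out.append(seg)
    return "/".join(out)


def normalize_atom_id(iri):
    """Apply the quoted normalization strategy to an absolute IRI."""
    iri = unicodedata.normalize("NFC", iri)
    scheme, authority, path, query, fragment = _IRI_RE.match(iri).groups()
    if scheme is None:
        raise ValueError("atom:id must be an absolute IRI, not a relative reference")
    scheme = scheme.lower()
    result = scheme + ":"

    if authority is not None:
        userinfo, at, hostport = authority.rpartition("@")
        host, sep, port = hostport.rpartition(":")
        if not sep or (port and not port.isdigit()):
            host, port = hostport, ""  # no port present (e.g. a bare IPv6 literal)
        host = _fix_percent_encoding(host.lower())
        if port and port != _DEFAULT_PORTS.get(scheme):
            host += ":" + port
        result += "//" + userinfo + at + host

    path = _fix_percent_encoding(path)
    if path.startswith("/"):
        path = _remove_dot_segments(path)
    if authority is not None and path == "" and scheme in _DEFAULT_PORTS:
        path = "/"
    result += path

    # Empty queries and fragments are preserved, not dropped.
    if query is not None:
        result += "?" + _fix_percent_encoding(query)
    if fragment is not None:
        result += "#" + _fix_percent_encoding(fragment)
    return result
```

With this sketch, `normalize_atom_id("HTTP://Example.ORG:80/a/./b/../c%7euser?")` gives `http://example.org/a/c~user?`, while an opaque identifier such as a `urn:uuid:` value passes through unchanged. Note that the spec expects this to happen once, when the id is generated; consumers are meant to compare atom:id values as they find them.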