legalcitem-technical message



Subject: Who's easy on us using a US use case?


Dear all, 

Last call's discussion left me uneasy about many issues that I thought could be straightened out more easily.

To help with the future discussions, I propose we work on a specific use case. Since Grant mentioned Obamacare, which has a really good Wikipedia page to peruse, I will use it as an example. 

I propose that, both in our discussions and in comparing approaches and facts, we use this document as our example.

Obamacare
---------
This is what I know about Obamacare. Most of the information comes from its page on Wikipedia (http://en.wikipedia.org/wiki/Patient_Protection_and_Affordable_Care_Act):

Facts
-----
Obamacare is a US act titled "An act entitled The Patient Protection and Affordable Care Act", short title: "The Patient Protection and Affordable Care Act", acronyms "PPACA" and "ACA", nicknames: "Affordable Care Act", "Health Insurance Reform", "Healthcare Reform", "Obamacare". It is in English. I looked for a version in Spanish or in another language, but could not find any, not even a non-authoritative one (although I found fact sheets and summaries in Spanish, Chinese, Vietnamese, etc.). It was signed into law by President Barack Obama on January 23rd, 2010 and became effective the following day.

It was numbered Public Law #148 of the 111th Congress, and was published as pages 119 through 1025 of volume 124 of the Statutes at Large. It has been codified in scattered form in title 26 (Internal Revenue Code) and title 42 of the US Code. It is the enactment of the House of Representatives' Bill #3590, introduced in the House by Charles Rangel (member of the Democratic Party for the state of New York) on September 17, 2009.

It has been amended several times, including by the Health Care and Education Reconciliation Act of 2010 (Public Law #152 of the 111th Congress, pages 1029 through 1084 of volume 124 of the Statutes at Large), signed into law on March 23, 2010 and effective the following day.

I have found several instances of it in various formats. Some of them are, for instance: 
[1] http://www.gpo.gov/fdsys/pkg/PLAW-111publ148/pdf/PLAW-111publ148.pdf in PDF, 
[2] http://housedocs.house.gov/energycommerce/ppacacon.pdf, the consolidated version after Public Law 111-152, as prepared by the Office of the Legislative Counsel,
[3] http://democrats.senate.gov/pdfs/reform/patient-protection-affordable-care-act-as-passed.pdf as PDF,
[4] http://beta.congress.gov/111/plaws/publ148/PLAW-111publ148.htm (plain text masquerading as HTML),
[5] http://beta.congress.gov/111/bills/hr3590/BILLS-111hr3590enr.pdf (as PDF),
[6] http://www.autismspeaks.org/images/advocacy/PPACA.pdf,
[7] http://en.wikisource.org/wiki/Patient_Protection_and_Affordable_Care_Act and other pages (as HTML),
[8] http://www.complianceweek.com/s/documents/PPACAText.pdf (as PDF), etc. 

Some of them seem to be copies of the same original file in different locations, e.g., [5], [6] and [8], but I did not check thoroughly.

Analysis of features
--------------------
Locations: There are several locations where I can find a copy of the document. 
Formats: I could identify at least four different formats: two types of PDF and two types of HTML.
Format authors: each of the two PDF formats and each of the two HTML formats was created in a different way, by a different author, at a different time.
Versions: there are at least two versions of the document: the original version and the consolidation of the amendments introduced by the Health Care and Education Reconciliation Act of 2010.
Language: only English
Volume of the Statute: 124
Starting page of the volume of the Statute: 119
Ending page of the volume of the Statute: 1025
Congress: 111th
Public Law number of the corresponding congress: 148
Full title: An act entitled The Patient Protection and Affordable Care Act
Short title: The Patient Protection and Affordable Care Act
Acronyms: PPACA and ACA
Popular names or nicknames: "Affordable Care Act", "Health Insurance Reform", "Healthcare Reform", "Obamacare"
Date of effectivity: January 24th, 2010 
Date of signature: January 23rd, 2010 
Date of effectivity of amended version: 24 March 2010
Signee: Barack Obama, President of the United States
Type of document: act
Country: United States of America (USA)
Enactment of: 
   type of document: bill
   house of first introduction: House of Representatives
   Introduction date: September 17th, 2009
   Internal number: 3590
   introduced by: 
      name: Charles Rangel
      party: Democratic Party
      representing: New York


Each of these pairs I call a "feature". If we split them according to the FRBR levels, we find that: 

Item: Locators
Manifestation: Format and Format author
Expression: Version date and Language
Work: all the others.
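
The split above can be sketched as a small Python structure. This is only an illustration: the key names are shorthand invented for this example, only a subset of the Work-level features is shown, and pairing the GPO location [1] with "GPO" as format author is an assumption.

```python
# Features of the example act, grouped by FRBR level as in the split above.
# Key names are illustrative shorthand; values are copied from the feature list.
FEATURES_BY_LEVEL = {
    "work": {
        "country": "us",
        "doctype": "act",
        "congress": 111,
        "public_law_number": 148,
        "statute_volume": 124,
        "start_page": 119,
        "end_page": 1025,
        "acronyms": ["PPACA", "ACA"],
    },
    "expression": {"language": "en", "version_date": "2010-01-24"},
    # Assumption: the PDF at location [1] is the one authored by the GPO.
    "manifestation": {"format": "pdf", "format_author": "GPO"},
    "item": {
        "location": "http://www.gpo.gov/fdsys/pkg/PLAW-111publ148/pdf/PLAW-111publ148.pdf"
    },
}


def level_of(feature):
    """Return the FRBR level that a feature name belongs to, or None."""
    for level, features in FEATURES_BY_LEVEL.items():
        if feature in features:
            return level
    return None
```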

Identifiers
-----------
All Locators are obviously identifiers, but they identify a specific file on a specific machine, rather than a document. For instance, I am pretty confident that [5], [6] and [8] are identical, but they have different locations.

If we want Work-level, Expression-level and Manifestation-level identifiers, we need to build them from the features we have. Several combinations of features give uniqueness, but some of them are more "natural" than others: for instance, country + congress# + plaw#, or country + volume + starting page, or country + acronym, or country + short title.

There is no reason to accept identifiers built from some features and reject identifiers built from others. I propose that it be possible to create multiple identifiers for a document, using a wide variety of features, provided that each identifier is univocal.
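
As a rough illustration of what "univocal" means here, the following Python sketch checks whether a combination of features distinguishes every document in a collection. The two records are the acts mentioned above (Public Laws 111-148 and 111-152); the field names and the `is_univocal` helper are hypothetical.

```python
# Two documents described in this message, reduced to a few Work-level features.
DOCS = [
    {"country": "us", "congress": 111, "plaw": 148, "volume": 124, "start_page": 119},
    {"country": "us", "congress": 111, "plaw": 152, "volume": 124, "start_page": 1029},
]


def is_univocal(docs, features):
    """True if no two documents share the same values for the given features."""
    keys = [tuple(doc[f] for f in features) for doc in docs]
    return len(set(keys)) == len(keys)


# country + congress# + plaw# tells the two acts apart; country + congress# does not.
print(is_univocal(DOCS, ("country", "congress", "plaw")))
print(is_univocal(DOCS, ("country", "congress")))
```

Any feature combination that passes such a check over the whole collection is a candidate for building an identifier.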

For instance, Akoma Ntoso accepts any work-level identifier organized as follows:
/[country]/[doctype]/[doc-subtype]/[date]/[numberOrString]/

Therefore, using the syntax of the Akoma Ntoso Naming Convention, each of the following is a valid Work-level identifier: 

a) /us/act/2010/111-148/                                   and      /us/act/2010-01-24/111-148/
b) /us/act/2010/124Stat119/                                and      /us/act/2010-01-24/124Stat119/
c) /us/act/2010/124Stat119-1025/                           and      /us/act/2010-01-24/124Stat119-1025/
d) /us/act/2010/ACA/                                       and      /us/act/2010-01-24/ACA/
e) /us/act/2010/PPACA/                                     and      /us/act/2010-01-24/PPACA/
f) /us/act/2010/ThePatientProtectionAndAffordableCareAct/  and      /us/act/2010-01-24/ThePatientProtectionAndAffordableCareAct/
g) /us/act/2010/ObamaCare/                                 and      /us/act/2010-01-24/ObamaCare/

etc. 
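
A minimal Python sketch of how such Work-level identifiers could be assembled from the pattern above. The function name `work_id` is hypothetical, and treating the doc-subtype segment as optional is an assumption based on the examples, which all omit it.

```python
def work_id(country, doctype, date, number_or_string, subtype=None):
    """Assemble /[country]/[doctype]/[doc-subtype]/[date]/[numberOrString]/.

    The subtype segment is left out when not given, as in examples a) to g).
    """
    parts = [country, doctype]
    if subtype:
        parts.append(subtype)
    parts += [date, number_or_string]
    return "/" + "/".join(parts) + "/"


# Several alternative names for the same Work, each yielding its own identifier.
names = ["111-148", "124Stat119", "ACA", "PPACA"]
ids = [work_id("us", "act", "2010", name) for name in names]
for identifier in ids:
    print(identifier)
```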

Akoma Ntoso adds language, version date (or a simple @ for the original version) and consolidation author to any work-level id. Each of the following is therefore a valid Expression-level identifier: 

h) [WORK-LEVEL-IDENTIFIER]/en@                  -- original version
i) [WORK-LEVEL-IDENTIFIER]/en@2010-01-24        -- original version
j) [WORK-LEVEL-IDENTIFIER]/en@2010-03-24        -- amended version
k) [WORK-LEVEL-IDENTIFIER]/en@2010-03-24/OLC    -- amended version as consolidated by the Office of the Legislative Counsel
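
The same construction, sketched in Python under the assumption that a Work-level identifier is a string ending in "/". The function name `expression_id` and its parameters are hypothetical.

```python
def expression_id(work, language, version_date="", author=None):
    """Append language@date and an optional consolidation author to a work id.

    An empty version_date yields the bare "@" that marks the original version.
    """
    expr = work.rstrip("/") + "/" + language + "@" + version_date
    if author:
        expr += "/" + author
    return expr


work = "/us/act/2010/111-148/"
print(expression_id(work, "en"))                       # form h)
print(expression_id(work, "en", "2010-03-24"))         # form j)
print(expression_id(work, "en", "2010-03-24", "OLC"))  # form k)
```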

Akoma Ntoso adds format and manifestation author to any expression-level id. Each of the following is therefore a valid Manifestation-level identifier: 

l) [EXPRESSION-LEVEL-IDENTIFIER].pdf            -- a PDF version
m) [EXPRESSION-LEVEL-IDENTIFIER].html           -- an HTML version
n) [EXPRESSION-LEVEL-IDENTIFIER]/GPO.pdf        -- the PDF version created by the Government Printing Office
o) [EXPRESSION-LEVEL-IDENTIFIER]/GPO.html       -- the HTML version created by the Government Printing Office
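
A short Python sketch of the Manifestation-level forms listed above, assuming an Expression-level identifier is given as a string. The function name `manifestation_id` is hypothetical.

```python
def manifestation_id(expression, fmt, author=None):
    """Append an optional manifestation author and a format extension."""
    base = expression
    if author:
        base += "/" + author
    return base + "." + fmt


expr = "/us/act/2010/111-148/en@"
print(manifestation_id(expr, "pdf"))           # format only
print(manifestation_id(expr, "html", "GPO"))   # format plus manifestation author
```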

In my mind, resolution as a whole is composed of two steps: completion (a higher-level identifier is completed with feature values to obtain a full manifestation-level identifier) and resolution proper (a manifestation-level identifier is mapped onto a physical item-level URL).
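
The two steps could be sketched in Python as follows. Everything here is an assumption made for illustration: the default feature values, the one-entry catalogue, the pairing of that identifier with location [1], and the rule that a last path segment without a "." has not yet reached Manifestation level.

```python
# Hypothetical defaults used to complete an under-specified identifier.
DEFAULT_EXPRESSION = "en@"         # original version, in English
DEFAULT_MANIFESTATION = "GPO.pdf"  # PDF created by the Government Printing Office

# Hypothetical catalogue mapping manifestation-level identifiers to item URLs.
CATALOGUE = {
    "/us/act/2010/111-148/en@/GPO.pdf":
        "http://www.gpo.gov/fdsys/pkg/PLAW-111publ148/pdf/PLAW-111publ148.pdf",
}


def complete(identifier):
    """Step 1: fill in missing feature values up to manifestation level."""
    if identifier.endswith("/"):
        # Work-level identifier: add language and version.
        identifier += DEFAULT_EXPRESSION
    if "." not in identifier.rsplit("/", 1)[-1]:
        # Expression-level identifier: add manifestation author and format.
        identifier += "/" + DEFAULT_MANIFESTATION
    return identifier


def resolve(identifier):
    """Step 2: map the completed identifier onto an item URL, or None."""
    return CATALOGUE.get(complete(identifier))


print(complete("/us/act/2010/111-148/"))
print(resolve("/us/act/2010/111-148/"))
```

Keeping completion separate from lookup means the defaults can change without touching the catalogue of locations.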

Finally, I still do not see the point of the distinction between data and metadata that came up in our discussion. Can someone, using these data, give me an example of which is data and which is metadata, and why?

Thanks

Fabio

--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/





