Subject: Re: [legalcitem-courts] Usecase--US Federal Courts draft
Frank responded to my Wednesday email with some really useful thoughts. I'm interested in hearing what others on the SC think about it all. For convenience, I'm replying to both emails from Frank in one response, so I moved the content around a bit.
On 12/11/2014 04:27 PM, Frank Bennett wrote:
> On Thu, Dec 11, 2014 at 7:31 AM, Frank Bennett <firstname.lastname@example.org> wrote:
> John,
> Looks like a good start! This raises some questions about scope, I think. (The questions themselves are at the end.)
> Printed citation forms identify the resource, but involve a step of interpretation for several of the elements. We would need to know how those should be cast in the electronic representation of the reference. Taking the first example ...
> (1) case name string
> Is there a formula or a set of constraints for deriving this from the header information in a judgment? If it must be uniform across all citations to the case, it should either be possible to derive it programmatically, or there should be a canonical version of the case name somewhere that can be acquired via a resolver, using other elements that uniquely identify the case (I guess that's the middle layer in FRBR).

In US practice, Bluebook rule 10.2 is what usually governs. The AALL Universal Citation Guide, 3d, in its rule 101, says case names should conform to it, or to ALWD Manual rule 12.2. The Univ. of Chicago's Maroon Book's rule 4.2 is similar, only MUCH simpler. In short, there is a formula. I like the idea of a resolver. Sort of like an authority record in cataloging.
> (2) volume number
> This would be an integer for this category of citation. Is it safe to specify it as an integer, or are there exceptions that would require more flexibility?

Not always an integer. Sometimes reporter volumes are issued in parts, e.g., 245A, 245B, 245C, etc. This usually happens in the NRS when volumes are still in prep, and temporary paperback volumes are released piecemeal.
> (3) reporter abbreviation
> As the LRR shows, there is a lot of variation in reporter abbreviations used in the wild (spacing, punctuation, abbreviations). If it is used as an element in an electronic representation of the reference, the abbreviation will need to be consistent across all references. How is it to be derived? The choice would seem to be between a canonical list of reporters and corresponding abbreviations, or the full name of the reporter. A secondary consideration would be whether the elements embedded in a reporter abbreviation (journal + series) should be broken out and represented separately.

I think a canonical list of full reporter names and abbreviations would be the way to go. I don't think it is necessary to break out the series... treat them as separate entries.
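To make the canonical-list idea concrete, here is a minimal sketch in Python. It assumes a small lookup table with each series as its own entry and a folding rule that ignores spacing, periods, and case. The table contents and the names `CANONICAL_REPORTERS` and `canonical_reporter` are illustrative assumptions only, not anything the TC has agreed on.

```python
import re

# Hypothetical canonical list. Each series is a separate entry rather than
# being broken out into journal + series.
CANONICAL_REPORTERS = {
    "U.S.": "United States Reports",
    "F.2d": "Federal Reporter, Second Series",
    "F.3d": "Federal Reporter, Third Series",
}


def _fold(abbrev: str) -> str:
    """Drop whitespace and periods and lowercase, so spelling variants collide."""
    return re.sub(r"[\s.]", "", abbrev).lower()


# Folded form -> canonical abbreviation.
_LOOKUP = {_fold(key): key for key in CANONICAL_REPORTERS}


def canonical_reporter(abbrev: str) -> str:
    """Map an abbreviation found in the wild to its canonical form."""
    try:
        return _LOOKUP[_fold(abbrev)]
    except KeyError:
        raise KeyError(f"unknown reporter abbreviation: {abbrev!r}") from None
```

With this table, variants such as "F. 2d" and "F2d" would both map to "F.2d", while an abbreviation that is not on the list raises an error instead of being guessed at.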
> (4) first page number
> This seems an integer. Same question about constraints as for volume number.

In US practice, I have never seen a non-integer page number for a case (roman numerals for the intro parts of a reporter, but not for cases). I think we could type it as an integer.
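A rough sketch of what that typing could look like, assuming a volume is a number plus an optional single part letter (as in 245A) and a first page is a plain integer. The pattern and the names `Volume`, `parse_volume`, and `parse_first_page` are hypothetical and only meant to illustrate the distinction.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape: digits, optionally followed by one uppercase part letter.
_VOLUME_RE = re.compile(r"([0-9]+)([A-Z]?)")


@dataclass(frozen=True)
class Volume:
    number: int
    part: Optional[str] = None  # e.g. "A" when a volume is issued in parts


def parse_volume(text: str) -> Volume:
    """Parse '245' or '245A' into a number and an optional part letter."""
    match = _VOLUME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized volume: {text!r}")
    number, part = match.groups()
    return Volume(int(number), part or None)


def parse_first_page(text: str) -> int:
    """First page of a case: an integer; anything else raises ValueError."""
    return int(text.strip())
```

Keeping the part letter separate from the number would let volumes sort numerically while still telling 245A from 245B.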
> (5) pinpoint page numbers
> Pinpoints can include references to page numbers, note numbers, and possibly other document elements. Should these elements be specified, or is a dumb string sufficient?

In neutral citations, the pinpoint is usually to a paragraph number, not a page.
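If we did specify the elements rather than use a dumb string, one possible shape is a pinpoint that records what kind of thing it points at. This is only a sketch: the three kinds listed and the names `PinpointKind` and `Pinpoint` are assumptions for illustration, and the list of kinds is certainly incomplete.

```python
from dataclasses import dataclass
from enum import Enum


class PinpointKind(Enum):
    PAGE = "page"            # print reporters
    PARAGRAPH = "paragraph"  # neutral citations
    NOTE = "note"            # footnotes


@dataclass(frozen=True)
class Pinpoint:
    kind: PinpointKind
    value: str  # kept as a string so ranges are stored as written


# Hypothetical examples: a page pinpoint and a paragraph pinpoint.
page_pin = Pinpoint(PinpointKind.PAGE, "495")
para_pin = Pinpoint(PinpointKind.PARAGRAPH, "12")
```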
> (6) circuit justice if applicable
> This raises a question of whether the spec is aimed at full description of the resource, or at pinning down the essential information needed to unambiguously identify the resource. If the latter, this would not be needed.

That is a really good question. I had been assuming full description, but your comment is making me rethink that.
> (7) year of decision
> In this citation form, are the year of decision and the year of publication always aligned?

The year of decision is the year of publication by the court by definition. The "publication" date is not the date the reporter was published. An interesting question is how to deal with changes made by the court after the decision is published, but before the print official reporter hits the streets. (SCOTUS is infamous for this: http://www.nytimes.com/2014/05/25/us/final-word-on-us-law-isnt-supreme-court-keeps-editing.html ) I suppose that type of info can go in the parenthetical string at the end of the citation.
> (8) parenthetical information such as judge, type of document, weight of authority
> This raises the same question as (6).

See my comments under 6 & 7 above.
> (9) the court
> SCOTUS citations are to a dedicated reporter, so the court is implicit. Should this be made explicit in an electronic citation? Alternatively, should reporters be made a separate domain in the specification, so that such information can be attached to each?

Again with the great question... I think it should be explicit. It seems to me that all court citations should be similar, to make parsing easier among other reasons.
> ***
> I guess a threshold question is the scope of the spec:
> (a) Does it aim to express the elements of all existing printed citations (this is also Brian's question, I think); or
> (b) Does it aim to specify only the elements of all printed citations needed to uniquely identify the resource; or
> (c) Does it aim to specify only the minimum elements (or combination of elements) needed to uniquely identify the resource?
> If the aim is the enrichment of document content with RDF-style links to meaningful text elements, that suggests (a). If the aim is to support parsers capable of linking specifically to cases, that suggests (b) -- this is the aim of the CourtListener database from which the LRR is derived. If the aim is to provide guidance for the construction of resolvers and data to feed to them, that suggests (c). My understanding is that this is what we're aiming for, but I could be wrong.
> Frank

This is the crux. I had been assuming (a) or maybe (b). So I went to the TechSC's latest draft and reread it:
" It is NOT the purpose of this TC to establish a proposed syntax for citations."
" The relevant task of every subcommittee is therefore to identify types and roles of FRBR entities in their document classes, and classify features according to different levels of a layered model of the document."
"Also, subcommittees should also identify how the references to documents of their classes are impacted by the layered view of documents.... It will be a rare case indeed the citation (and therefore the need for a reference) pointing to an FRBR Item (i.e., to a specific file on a specific computer at a specific IP address) or to an FRBR Manifestation (i.e., to a specific characterization in a specific file format of a document). Most frequently a citation points to a legal document existing on a different conceptual layer and in a different level of reality than the physical copies it is embodied by, or by the data formats in which each copy is expressed. More frequently, therefore the citation will identify a document at a more abstract level, e.g., an FRBR _expression_ when the citation is to a specific version or variant of the document, or an FRBR Work when the citation is to all these versions or variants, or to the one that is identified through a possibly complex contextualization process. In these cases, therefore, the citation MUST be converted to a reference to a Work or an _expression_, which is resolved into the physical Locator of the Item only when needed, therefore separating the legal aspects of the identification of the correct version and variant of a document from the technical aspects of the dereferencing of a resource on the World Wide Web."
So what info does our part of the spec need to convey? Our citations will be identifying cases/other court docs at the FRBR Work level and at the FRBR Expression level, methinks. A "print" citation would be on the Expression level, no?
Where does pinpoint (page or para numbers) fit into this? The citation is to the work as a whole, but also (usually, or mostly) to specific language in that work.

> To follow up on the scope issue, if the aim (or one aim) is to specify minimal data that can be derived from the text, for the purpose of generating a key for submission to a resolver, would this work for cases:
>   type (decision)
>   court (id)
>   docket number (string)
>   decision date (date)
> For the "court" element, an ID would be preferable to the court name, since the latter can change without any change to the institution proper. Resolution would return further details (cites for each reporting service carrying the case, with case name, etc.); the suggestion above is only for the "handle" that uniquely identifies the case. (Whether this makes any sense will depend on the scope of the endeavor, of course.)
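Frank's four-element handle could be sketched roughly as follows. The field names, the "/" separator, the whitespace stripping, and the example values are all assumptions made for illustration; nothing here reflects an agreed key format or a real court identifier scheme.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseHandle:
    doc_type: str        # e.g. "decision"
    court_id: str        # stable identifier, not the court's display name
    docket_number: str   # kept as a string, as in the proposal
    decision_date: date

    def key(self) -> str:
        """Join the four elements into one string for submission to a resolver."""
        docket = "".join(self.docket_number.split())  # drop internal whitespace
        return "/".join(
            [self.doc_type, self.court_id, docket, self.decision_date.isoformat()]
        )


# Hypothetical values, for illustration only.
handle = CaseHandle("decision", "example-court", "12-345", date(2014, 12, 11))
print(handle.key())  # decision/example-court/12-345/2014-12-11
```

The case name, reporter cites, and so on would then come back from the resolver rather than being part of the key.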
What do you all think?
-- John Quentin Heywood email@example.com