
legalcitem-technical message



Subject: Re: [legalcitem-technical] List of issues to discuss during today's call (longish)


On Wed, Apr 30, 2014 at 11:30 PM, Chet Ensign <chet.ensign@oasis-open.org> wrote:
Hi Frank, 

I wrestle with this too. I saw enough of state law oddities within the U.S. to believe that capturing all the applicable variations in labeling is possible. 

Is "possible" a typo?
 

As I am re-reading the wiki page and the notes, one idea that occurs to me - and that may help us cut through some of these sorts of quandaries more quickly - is if we think of the objective of the specification as being to develop a standard approach to producing a reference from a citation so as to enable the development of resolvers. 

That's how I see it too. But the guidelines suggest that the field content (say, the name of a Ministry) fed to the resolver should follow the strict requirements of a property list. I think systems will often end up breaking that rule when it is laid down. Perhaps that's not something that the spec needs to worry about, but it seems that a property list has value, and that a concept or process that allows for extending it over time would be beneficial.
 

Stating it that way implies that (a) we don't have anything to say about citations themselves.

Absolutely. Printed citations are the dumbest of dumb strings.
 
They are going to be constructed following whatever rules or practices a given author may be following. (b) Constructing the reference may require human intervention because the citation is vague, incomplete, or incorporates something novel and the standard we develop must take that into account.

Don't underestimate the capabilities of CSL. :-) But yes, the purpose of citation engines is to save time for authors, not to replace their copy editors.
 
(c) We don't have anything to say about the resolvers. People should be able to construct resolvers in whatever clever ways they can dream up based on the information set that they are able to derive from the reference.

/chet


On Wed, Apr 30, 2014 at 10:17 AM, Frank Bennett <biercenator@gmail.com> wrote:
It has been a long time since I was last in touch. Late to the party here. It's been a wild week.

Just one comment below. I have a weak sense of audience when it comes to technical matters. If I'm off the mark, please forgive.

On Wed, Apr 30, 2014 at 9:18 PM, Fabio Vitali <fabio@cs.unibo.it> wrote:
Dear all,

I summarized and listed the issues raised in the mailing list in the past week. Thanks to all the contributors. For each issue I provide a discussion and a proposed solution.

To allow you to check on the effect of the proposed solutions on the whole text, I pre-applied them to the first draft, creating a proposed second draft at

https://wiki.oasis-open.org/legalcitem/FundamentalRequirements2nd

This will be the core of the discussion of today's call. I sincerely hope we can reach an agreement today and vote for its delivery to the other SCs.

See you later

Fabio

--

1) Document fragments
From: Chet Ensign (at the last conference call)
Issue: The text discussed documents, citations to documents and identifiers of documents. But fragments of documents are cited just as much or even more than whole documents. Change the text to reflect this.
Discussion: none whatsoever: Chet is right.
Proposal:
- Modify
> citation: an explicit, plain-text, human-readable mention of a legal text as found in another text, providing sufficient detail for a reader with appropriate legal training to identify with precision the relevant text
into
"citation: an explicit, plain-text, human-readable mention of a legal document (or a fragment thereof) as found in another text, providing sufficient detail for a reader with appropriate legal training to identify with precision the relevant text"
- Modify
> identifier: a string univocally associated to a document to identify it
into
"identifier: a string univocally associated to a document (or a fragment thereof) to identify it"

2) Open set of features and feature values
From: Catherine Tabone on April 25th, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00009.html
Issue: "we've repeatedly found that however hard you try to list identifying features for documents at a later date we would always end-up unearthing some legacy document that for some obscure historical reason followed a different rule or a new class of document would be created [...] we will need some mechanism to update the standard as necessary or to build in some flexibility for unexpected requirements"
Discussion: for subcommittees to fully ascertain and describe document types and other features is a Sisyphean task, and we cannot expect them to carry it out fully. Not only does every country have its own bottomless pit of strange, legacy, historical document types with only local, historical or jurisdictional appeal that are hard to fully capture and describe, but there are many countries in the world, and I doubt we will be able to find at a whim experts in the legislation of Zimbabwe, Nepal, Tonga, etc., to come to us and fill in the details of their peculiarities. Therefore I suggest we identify facts about features, the most important of which is whether the values that a specific feature may assume are taken from a closed list (in which case we ought to produce such a list) or from an open vocabulary. I would assume therefore that the specific list of document types is a natural candidate for an open vocabulary. For such vocabularies we should probably produce rules as to how the corresponding string value is determined from the original name (e.g., by removing spaces, special characters and stop words, or by adopting a locally enforced abbreviation), so that, for instance, a "Decreto del Presidente del Consiglio dei Ministri" in Italy could be valued as "DecretoPresidenteConsiglioMinistri", or "dpcm", as determined by a local authority or a local resolver.
Proposal:
- Add to
> Another important output of the work of each SC is the determination of legitimate values for each property and the usual format in which they are expressed. For instance, a reference to Title 52nd of the US Code, or to section XIV a US Code title, may be understood as strange only by knowing that US Code only has 51 titles and they are numbered with arabic numbers.

the following sentence:
 "When it is impractical to fully list the legitimate values of a property (for instance, the names of the document types could be too numerous to collect and list) the SC will describe the values as belonging to an "Open Vocabulary", and describe, if appropriate, the rules to convert natural values into values acceptable by the approved syntax, e.g., by removing spaces, special characters and stop words (so that, for instance, 'Decreto del Presidente del Consiglio dei Ministri' in Italy could become 'DecretoPresidenteConsiglioMinistri'), or by adopting locally known abbreviations (so that, for instance, it could become 'dpcm')."
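As a rough illustration of the conversion rule proposed above, here is a minimal Python sketch. The stop-word list, the abbreviation table and the function names are assumptions made up for this example; nothing here is defined by the draft, and a real subcommittee would supply its own rules.

```python
import re
import unicodedata

# Hypothetical stop-word list for Italian document-type names (an assumption).
STOP_WORDS = {"del", "della", "dei", "di", "il", "la", "e"}

# Hypothetical table of locally agreed abbreviations (an assumption).
LOCAL_ABBREVIATIONS = {
    "DecretoPresidenteConsiglioMinistri": "dpcm",
}


def normalize_document_type(name: str) -> str:
    """Drop stop words, spaces and special characters, then join the words."""
    # Decompose accented letters and discard the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Keep only runs of ASCII letters and digits; everything else separates words.
    words = re.findall(r"[A-Za-z0-9]+", plain)
    kept = [w for w in words if w.lower() not in STOP_WORDS]
    # Capitalize the first letter of each remaining word and concatenate.
    return "".join(w[:1].upper() + w[1:] for w in kept)


def abbreviate(value: str) -> str:
    """Return the locally known abbreviation if one exists, else the value itself."""
    return LOCAL_ABBREVIATIONS.get(value, value)
```

Under these assumptions, normalize_document_type("Decreto del Presidente del Consiglio dei Ministri") yields "DecretoPresidenteConsiglioMinistri", and abbreviate() maps that value to "dpcm".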

I share Catherine's concern, and agree that naming things is a Sisyphean task.

Some further clarification of this division between "open vocabulary" and "property list" would be a useful comfort. I'm imagining that an admin might have a good big bundle of items that he or she wants to validate before unleashing it on the world. I can imagine that there might well be fields in the set that contain values that fall outside of a given property list. There might also be items that have misnamed fields, or which are missing required values. In those latter cases, the data set is clearly invalid, but I'm thinking that validation should be more forgiving of property list errors, at least for many fields.

A thought, anyway.
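To make the idea of forgiving validation a little more concrete, here is a minimal Python sketch. The field names, the closed list and the open vocabulary are all hypothetical; the only point is that structural problems (misnamed fields, missing required values) are reported as errors, while a value outside a known vocabulary is reported only as a warning.

```python
from dataclasses import dataclass, field

# Hypothetical schema, invented for this example.
REQUIRED_FIELDS = {"jurisdiction", "doctype", "number"}
KNOWN_FIELDS = REQUIRED_FIELDS | {"date", "issuer"}

# Fields validated strictly against a closed list (assumption).
CLOSED_LISTS = {"jurisdiction": {"it", "us", "jp"}}

# Fields with an open vocabulary: unknown values only produce a warning (assumption).
OPEN_VOCABULARIES = {"doctype": {"dpcm", "statute"}}


@dataclass
class Report:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def valid(self) -> bool:
        # Warnings never invalidate an item; only errors do.
        return not self.errors


def validate(item: dict) -> Report:
    report = Report()
    for name in sorted(set(item) - KNOWN_FIELDS):
        report.errors.append(f"unknown field: {name}")
    for name in sorted(REQUIRED_FIELDS - set(item)):
        report.errors.append(f"missing required field: {name}")
    for name, allowed in CLOSED_LISTS.items():
        if name in item and item[name] not in allowed:
            report.errors.append(f"{name}: {item[name]!r} is not in the closed list")
    for name, known in OPEN_VOCABULARIES.items():
        if name in item and item[name] not in known:
            report.warnings.append(f"{name}: {item[name]!r} is not a known value")
    return report
```

With this split, an item whose doctype is an unfamiliar string still validates (with a warning), whereas an item with a misspelled field name does not.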



3) Necessity of a resolution step
From: Monica Palmirani, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00011.html
Issue: "we need probably to add some paragraphs in the first deliverable concerning the resolver definition. I suppose we are thinking a LegalciteM IRI syntax able to foster a resolver, at least, for converting the "logic uri reference" into the "physical url(s)";"
Discussion: the current text does say:
> Locators MAY act as identifiers, which MAY be used as references for citations, but locators are in general POOR identifiers in the legal domain for a number of reasons explained in the following. Therefore, while tempting, it is naive and limiting to conclude that the purpose of this TC is to determine how to associate the locator of a document to a legal citation. It is rather more precise to say that the purpose of this TC is to determine a way to associate resolvable identifiers to a legal citation.
but in fact in the following these reasons are not explained as promised.
Proposal: At the end of the document, after
> Also, subcommittees should also identify how the references to documents of their classes are impacted by the layered view of documents. For instance, in legislation it is customary to differentiate static references to a specific version of a document (e.g. when modifying a statute) and dynamic references that are automatically updated to the new version of the linked document whenever it changes.
add a new paragraph stating:
"It will be a rare case indeed for the citation (and therefore the need for a reference) to point to an FRBR Item (i.e., to a specific file on a specific computer at a specific IP address) or to an FRBR Manifestation (i.e., to a specific characterization in a specific file format of a document). Most frequently a citation points to a legal document existing on a different conceptual layer and in a different level of reality than the physical copies it is embodied by, or the data formats in which each copy is expressed. More frequently, therefore, the citation will identify a document at a more abstract level, e.g., an FRBR Expression when the citation is to a specific version or variant of the document, or an FRBR Work when the citation is to all these versions or variants, or to the one that is identified through a possibly complex contextualization process. In these cases, therefore, the citation MUST be converted to a reference to a Work or an Expression, which is resolved into the physical Locator of the Item only when needed, therefore separating the legal aspects of the identification of the correct version and variant of a document from the technical aspects of the dereferencing of a resource on the World Wide Web."
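The separation between identifying a Work or Expression and locating an Item can be sketched roughly as follows. This is only an illustration in Python under invented assumptions: the class names, the identifier strings and the example.org locators are hypothetical, and the selection rule (latest version on or before a given date) is just one possible contextualization policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Expression:
    """One version or variant of a Work, with the locators of its copies."""
    work_id: str                # hypothetical Work-level identifier
    version_date: date          # date from which this version applies
    language: str
    locators: Tuple[str, ...]   # physical locations, resolved only when needed


class Resolver:
    def __init__(self, expressions: Iterable[Expression]):
        self._expressions = list(expressions)

    def resolve(self, work_id: str, as_of: Optional[date] = None,
                language: Optional[str] = None) -> Optional[Expression]:
        """Pick an Expression of the Work.

        as_of=None models a dynamic reference (latest version);
        a concrete as_of models a static reference to the version of that date.
        """
        candidates = [
            e for e in self._expressions
            if e.work_id == work_id
            and (language is None or e.language == language)
            and (as_of is None or e.version_date <= as_of)
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda e: e.version_date)
```

The caller never handles a locator until an Expression has been chosen, which keeps the legal question (which version?) apart from the technical one (where is the file?).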

4) Relationship between work, expression and manifestation levels
From: Monica Palmirani, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00011.html
Issue: "the resolver should also to manage the relationship between work, expression, manifestation levels in the different versions and languages;"
Discussion: I hope that the proposal to solve issue 3 also covers this issue.
Proposal: the same as issue 3.

5) Multiple identifiers
From: Monica Palmirani, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00011.html
Issue: "It is interesting to introduce also the case to have multiple "logic uri(s)" for the same legal source.
The name of the "logic uri" should be univocal, not unique. Inside of the Legislative subcommittee we have discussed the possibility to have "alias" for the same legal source. Example: "Public Law 112-29, Sept 16,2011" is also cited in the text as "Leahy-Smith America Invents Act" or "125 STAT 284" depending to the type of document, so it is interesting to have alias for managing different cases."
Discussion: I agree.
Proposal: Add to
> identifier: a string univocally associated to a document to identify it. References may use identifiers, or may not. In particular, we must remember that references are representations of the citation, and not of the identifier that the citation resolves into.
the following sentence:
"It is also important to notice that legal documents are often cited in multiple ways; for instance, "Public Law 112-29, Sept 16,2011" is also frequently cited as "Leahy-Smith America Invents Act" or as "125 STAT 284". Similarly, we must consider the existence of multiple different identifiers that identify the same document. An identifier is therefore *univocal* (i.e., an identifier identifies one specific document), but not necessarily *unique* (i.e., many different identifiers may identify the same document). "
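The distinction between univocal and unique can be illustrated with a small Python sketch. The table class and the document key "/example/us/pl/112/29" are hypothetical; only the three citation strings come from the example above.

```python
class IdentifierTable:
    """Maps identifiers to documents: many-to-one is fine, one-to-many is not."""

    def __init__(self):
        self._doc_by_id = {}

    def register(self, identifier: str, document: str) -> None:
        # Univocal: an identifier may be bound to one document only.
        existing = self._doc_by_id.setdefault(identifier, document)
        if existing != document:
            raise ValueError(
                f"{identifier!r} already identifies {existing!r}"
            )

    def document_for(self, identifier: str) -> str:
        return self._doc_by_id[identifier]

    def aliases_of(self, document: str) -> list:
        # Not unique: several identifiers may point to the same document.
        return sorted(i for i, d in self._doc_by_id.items() if d == document)


table = IdentifierTable()
DOC = "/example/us/pl/112/29"  # hypothetical internal key, not a real identifier
for alias in ("Public Law 112-29", "Leahy-Smith America Invents Act", "125 STAT 284"):
    table.register(alias, DOC)
```

All three aliases resolve to the same document, while an attempt to bind "125 STAT 284" to a second document is rejected.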

6) The thing being cited
From: Chet Ensign, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00014.html
Issue: "Should we define a term for “the thing being cited?” Perhaps “resource.” I suggest some definition because we’ll likely find ourselves at least occasionally talking about behaviors around what is being cited and having a generic term that we’ve defined may be helpful there.  The term would also encompass ‘part of a document’ in addition to ‘the document.’ And it may help us to think of citable things as more than just documents or sections of documents."
Discussion: I agree.
Proposal:
- Change
> an explicit, plain-text, human-readable mention of a legal text as found in another text
into
"an explicit, plain-text, human-readable mention in a text referring to a resource located somewhere else (whether the same document or a different document)"
- Add a new definition in the list of the "Vocabulary" section, as follows:
"Resource: the thing being cited. This is most often a legal document, a fragment of a legal document, or (a fragment of) a specific version or variant of a legal document. Nonetheless, there will be cases in which the thing being cited in a citation is not even a text document, as the case of a frame in a video offered as evidence in a criminal trial. The term resource, therefore, encompasses a variety of possible destinations of legal citations."

7) Deixis
From: Chet Ensign, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00014.html
Issue: "For citation, I suggest replacing the word ‘deixis’ (which I had to look up and then roll the definition around in my head for several minutes) to a word more common? Perhaps ‘contextual’ which seems close to the meaning at least to me. Also, how does ‘implicit’ differ from ‘deixis’? I see the difference in the example but I think in practice they might get mixed up"
Discussion: I agree with the need to explain better deixis. I am hesitant in giving it up, though. As for the difference between implicit and deictic, it IS important in our context: an implicit citation is a citation where the referred document IS NOT identified, and one must have domain-specific knowledge to interpret and access, while a deictic citation is a citation where the referred document IS identified in a linguistically indirect way, e.g. through the use of pronouns or deictic adjectives (such as 'this', 'that', 'previous', 'following', etc.). In particular, the disambiguation of a deictic citation is a merely linguistic operation (one must only understand the language the citation is expressed in), while the disambiguation of an implicit citation may require subtle legal reasoning to determine the cited document.
Proposal:
- Replace
> Citations might be full and explicit (e.g., "42 U.S.C. § 405(c)(2)"), deixis (e.g. "section 1423 of this Title", "clause c of the aforementioned section", etc.), or implicit ("all relevant legislation on this topic"). It is NOT the purpose of this TC to establish a proposed syntax for citations.
with
"Citations might be full and explicit (e.g., "42 U.S.C. § 405(c)(2)"), contextual - or, more appropriately, deictic (e.g. "section 1423 of this Title", "clause c of the aforementioned section", etc.), or implicit ("all relevant legislation on this topic"). It is NOT the purpose of this TC to establish a proposed syntax for citations."

8) Existence and uniqueness of identifiers
From: Chet Ensign, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00014.html
Issue: "For identifier, I think we should also note that (a) resources are not necessarily guaranteed to have identifiers and (b) that identifiers cannot be guaranteed to be unique? "
Discussion: I disagree with a) and agree with b).
a) identifiers: resources MUST have identifiers, at least at a conceptual level. It is true that resources may not have a LOCATOR, but an identifier, even one that confesses that there is no resolvable locator associated, must be determined. I think the purpose of this TC is in fact to provide a generic syntax for identifiers for documents that can be conceptualized, regardless of whether they exist, they existed, they will exist, they may exist or they may never exist. We should be able to refer to Title 52 of the US Code even though no such code exists.
b) I agree that identifiers are not unique. They must be univocal. See discussion on issue #5.
Proposal:
a) No action
b) see proposal for issue #5.

9) Definition of Locator
From: Chet Ensign, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00014.html
Issue: "For locator, two suggestions - that we generalize it from just urls and that we stick here to the term’s definition and not comment on its suitability for references. I’m thinking of something more like this: “locator: a string containing instructions (e.g. protocol, address, filename) that can be used to access a resource. Locators are strongly dependent on technical, architectural and organizational choices of the owner of the system in which the resource is to be found.”
I could see how a citation might, in the computer catalog of a large library, resolve not to an electronic address but to a set of instructions telling the reader where on the shelves to find the cited work. Also, I agree with the statement of the purpose of the last sentence but I think that it should move down to follow after the definitions. "
Discussion:
I kind of like the sentence "locators are in general POOR identifiers in the legal domain" . I would like it to stick in the minds of people. I have listened too many times to people convinced that one just needs to define a specific syntax for URLs, forgetting the issues connected to dynamic references, open references, multiple versions and multiple variants, that make sure that no Locator will ever be able to solve all issues, and that a resolution step will in most cases be necessary.
Proposal:
Replace
> locator (also URL): documents that are accessible on the World Wide Web are accessible via a Locator, or URL, that provides information about the protocol, the Internet address, and the local path to get such document. Thus, locators are strongly dependent on technical, architectural and organizational choices of the owner of the computer on which the document resides. Locators MAY act as identifiers, which MAY be used as references for citations, but locators are in general POOR identifiers in the legal domain for a number of reasons explained in the following. Therefore, while tempting, it is naive and limiting to conclude that the purpose of this TC is to determine how to associate the locator of a document to a legal citation. It is rather more precise to say that the purpose of this TC is to determine a way to associate resolvable identifiers to a legal citation.
with
"locator (also URL): a string containing instructions (e.g. protocol, address, filename) that can be used to access a resource. Locators are strongly dependent on technical, architectural and organizational choices of the owner of the system in which the resource is to be found. Locators MAY act as identifiers, which MAY be used as references for citations, but locators are in general POOR identifiers in the legal domain for a number of reasons explained in the following. Therefore, while tempting, it is naive and limiting to conclude that the purpose of this TC is to determine how to associate the locator of a document to a legal citation. It is rather more precise to say that the purpose of this TC is to determine a way to associate resolvable identifiers to a legal citation."

10) Dereference
From: Chet Ensign, on April 29, in message https://www.oasis-open.org/apps/org/workgroup/legalcitem-technical/email/archives/201404/msg00014.html
Issue: "For dereference, I agree with the definition but suggest we make it more general. “the action of delivering a copy of the resource specified by the locator, for example, by traversing a link to deliver a copy to the requesting application.”"
Discussion: I agree
Proposal:
- Replace
> dereference: the action of delivering via Internet a copy of the document specified by a Locator. The dereference is the final action of traversing a link (navigating), the one in which the requested document is actually delivered to the requesting application.

with
"dereference: the action of delivering a copy of the resource specified by the locator, for example, by traversing a link to deliver a copy to the requesting application."



--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/










--

/chet 
----------------
Chet Ensign
Director of Standards Development and TC Administration 
OASIS: Advancing open standards for the information society
http://www.oasis-open.org

Primary: +1 973-996-2298
Mobile: +1 201-341-1393 



