legalcitem-technical message

Subject: Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
From: Fabio Vitali <fvitali@gmail.com>
To: Frank Bennett <biercenator@gmail.com>
Date: Tue, 26 May 2015 11:34:08 +0200
Dear all, 

Here is a summary of what I understood from the discussion.

The relationship between country and jurisdiction is an open topic. Naively, I thought of jurisdictions as sub-country geographical or administrative entities, but Frank correctly pointed out that this is not the whole story.

So this is what I currently think: 

1) The correct way to refer to a jurisdiction is through a hierarchy of parts. No one will want to look up the municipality of Budrio (a small town, 5000 persons, near Bologna) directly, so we must specify the full hierarchy of parts to identify it, say:

country: Italy, region: Emilia-Romagna, municipality: Budrio. 

2) At the top of the hierarchy we have a Top Level Jurisdiction (TLJ), which for practical purposes shall be defined as a "country (with exceptions)".

3) We are not in the business of deciding which jurisdictions are countries and which aren't, nor what their sub-jurisdictions are and what they are called. We MUST delegate this to an external authority.

4) The best candidate that I know of for an authority on TLJs is ISO 3166. This provides criteria for the specification of countries and their codes. The ISO 3166 Maintenance Agency was constituted in the seventies and has given out codes for all countries that have existed since 1974. Codes are not reused, so countries that were dissolved after 1974 still have their own codes (which is useful in our case). ISO 3166 also gives out codes for first-level sub-jurisdictions in many cases (e.g., individual US states) but does not go further than this. Many international organizations (including the UN) have their own codes.

5) This approach provides a good solution in many cases, but there are a few that remain uncovered: 

5-a) unlisted sub-jurisdictions of existing jurisdictions (e.g.: Budrio). Proposal: full name. 
5-b) unrecognized countries (including ministates such as Liberland)
5-c) disputed lands
5-e) Unlisted international entities
5-f) historical jurisdictions (e.g., in Kenya some legislation refers to East African British Colonies, a jurisdiction that covered present-day Kenya, Tanzania and Uganda and stopped existing before 1974. Or, more simply, Roman law).
5-g) International treaties (for which I strongly object to explicitly listing all signatory countries).
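
To make points 1), 4) and 5-a) a little more concrete, here is a minimal sketch in Python of a jurisdiction expressed as a hierarchy of parts. Everything in it is an assumption for illustration only: the names Part, build_jurisdiction and to_path are hypothetical, the two lookup tables are tiny hand-picked samples standing in for the ISO 3166-1 and ISO 3166-2 code lists, and the "/" separator is just a placeholder, not a proposal for our syntax.

```python
from dataclasses import dataclass

# Hand-picked samples only; a real implementation would load the complete
# code lists from the external authority (ISO 3166-1 and ISO 3166-2).
ISO_3166_1_SAMPLE = {"IT": "Italy", "GB": "United Kingdom", "US": "United States"}
ISO_3166_2_SAMPLE = {"IT-45": "Emilia-Romagna", "US-CA": "California"}


@dataclass(frozen=True)
class Part:
    level: str   # e.g. "country", "region", "municipality"
    value: str   # a code when the authority lists one, else the full name
    coded: bool  # True when value comes from the external authority


def build_jurisdiction(country_code, region_code=None, unlisted=()):
    """Build a top-down list of parts, starting from the TLJ.

    `unlisted` is a sequence of (level, full_name) pairs for
    sub-jurisdictions the authority does not list (case 5-a).
    """
    if country_code not in ISO_3166_1_SAMPLE:
        raise ValueError(f"unknown top-level jurisdiction: {country_code}")
    parts = [Part("country", country_code, True)]
    if region_code is not None:
        belongs = region_code.startswith(country_code + "-")
        if region_code not in ISO_3166_2_SAMPLE or not belongs:
            raise ValueError(f"unknown sub-jurisdiction: {region_code}")
        parts.append(Part("region", region_code, True))
    for level, full_name in unlisted:
        # No code exists, so fall back to the full name.
        parts.append(Part(level, full_name, False))
    return parts


def to_path(parts):
    """Serialize the hierarchy with a placeholder '/' separator."""
    return "/".join(part.value for part in parts)


budrio = build_jurisdiction("IT", "IT-45", [("municipality", "Budrio")])
print(to_path(budrio))
```

The sketch deliberately rejects any top-level code it does not find in the table, which is exactly where cases 5-b) to 5-g) would need a decision from us.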

Let me know what you think. 

Ciao

Fabio

--

On 25/mag/2015, at 23:23, Frank Bennett <biercenator@gmail.com> wrote:

> On Tue, May 26, 2015 at 1:45 AM, monica.palmirani
> <monica.palmirani@unibo.it> wrote:
>> Hi Frank,
>> about the jurisdiction(s) it is better for me to have both elements:
>> - country
>> - jurisdiction(s)
>> 
>> Example: UK
>> Jurisdictions: England+Wales
>> http://www.legislation.gov.uk/ukpga/2015/23/introduction?view=extent
> 
> Separating the two exposes a general schema to political controversy
> over what constitutes a country:
> 
> United Nations
> Republic of Abkhazia
> London Court of International Arbitration
> World Trade Organization
> World Bank Group / ICSID
> Council of Europe
> 
> FB
> 
>> 
>> Yours,
>> mp
>> 
>> Il 25/05/2015 14:50, Frank Bennett ha scritto:
>>> 
>>> I have added a couple of comments below, on the recording of
>>> jurisdictions and authorities in feature-sets.
>>> 
>>> Frank
>>> 
>>> On Mon, May 25, 2015 at 3:53 PM, monica.palmirani
>>> <monica.palmirani@unibo.it> wrote:
>>>> 
>>>> Dear all,
>>>> 
>>>> my few comments below.
>>>> 
>>>> Yours,
>>>> Monica
>>>> 
>>>> 
>>>> Il 24/05/2015 22:26, Fabio Vitali ha scritto:
>>>>> 
>>>>> Dear all,
>>>>> 
>>>>> If there is no other contribution, I'll try to give a summary of what
>>>>> this
>>>>> discussion has unearthed:
>>>>> 
>>>>> 1) parsability of the reference is an issue. Rather, it's THE issue.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> It proved impossible to find a standard approach as values essential to
>>>>>> one country were completely irrelevant to another.
>>>>> 
>>>>> I fear that if this is the case, we will have to go through the same
>>>>> path
>>>>> that ELI did and see if we find ourselves in the same situation. We
>>>>> shall
>>>>> need to examine references that cannot be accommodated in a unique
>>>>> schema,
>>>>> and try to understand why and if there is a way out.
>>>>> 
>>>>> As a matter of principle, let me remind everyone that the purpose of this
>>>>> exercise
>>>>> is NOT to find the best URI scheme, but to find the best features of
>>>>> each
>>>>> and combine them together to generate (from scratch, if needed) a schema
>>>>> that is better than all the existing ones.
>>>>> 
>>>>> GCV says;
>>>>>> 
>>>>>> The benefit of an open syntax is that any unforeseen use case can be
>>>>>> handled
>>>>> 
>>>>> I believe that there is a gray area between the rigid "Only the things we
>>>>> now know about shall be needed for the identification" and the open "We
>>>>> will
>>>>> never know every use case so let's not stipulate anything in the ones we
>>>>> know". In particular, I believe we could and should stipulate rules for
>>>>> the
>>>>> features we already know about (e.g., country, language, dates, etc.)
>>>>> and
>>>>> leave room for the ones that may have differences in individual
>>>>> countries.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> Although ELI is flexible the expectation is that each official
>>>>>> legislation publisher decides on the fixed variation they will use and
>>>>>> then
>>>>>> lodges their URI scheme in the ELI register
>>>>> 
>>>>> I believe we have wider use cases: in particular, I do not want to
>>>>> restrict ourselves only to national legislation, for obvious reasons,
>>>>> nor
>>>>> only to official, authoritative resolvers managed by state-managed
>>>>> publishers. The issue of providing resolution is political and
>>>>> technical,
>>>>> while the issue of providing names is cultural and structural, and is
>>>>> the
>>>>> business we are in: resolvers and documents may come from anywhere with
>>>>> any
>>>>> type of authoritativeness.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> Most countries already have a website and are not in a position to be
>>>>>> able to create a new one from scratch. In order for them to implement
>>>>>> ELI it
>>>>>> needed to be flexible enough to fit in with existing architecture with
>>>>>> minimum alteration.
>>>>> 
>>>>> I see that this is an issue. But since we are considering going through
>>>>> a
>>>>> resolver, we do not have the requirement to modify the physical URL of
>>>>> documents, but just the mechanisms through which the mapping between
>>>>> abstract URIs and concrete URLs is organized. Any kind of legacy naming
>>>>> convention lies at the end of this process, and should not be able to
>>>>> impact
>>>>> on it.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> If ELI has been implemented it's the http ELI identifier that returns
>>>>>> the
>>>>>> document on the national legislation website so there's no need for
>>>>>> parsing
>>>>>> for any additional resolution.
>>>>> 
>>>>> So ELI is actually a suggestion for physical naming of files so that it
>>>>> does not require resolution. I think that we CAN find a residual
>>>>> justification for this: e.g., if all countries implemented ELI for their
>>>>> physical naming schemes, then resolution from LegalCitem references to
>>>>> ELI
>>>>> names would be very easy.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> ELI was designed as a common official URI scheme it didn't really
>>>>>> consider citations or links for use on other legal websites.
>>>>> 
>>>>> Good to know. On the other hand, we here deal with references, so this
>>>>> is
>>>>> an issue we must consider. For instance, view date is not an issue in
>>>>> identifiers, but it is a BIG issue in references. I can provide details
>>>>> on
>>>>> this.
>>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> However, saying that I do think that parseability would be useful. I
>>>>>> wonder if it would be possible to allow greater flexibility than a
>>>>>> universally fixed scheme by allowing some custom values and some kind
>>>>>> of
>>>>>> embedded statement or declaration about what's being used?
>>>>> 
>>>>> Yes, I think we should strive for that. Only, what you call custom
>>>>> values
>>>>> I prefer to call optional features. Would that work for you?
>>>>> 
>>>>> 2) Features
>>>>> It may be my impression, but when we examine the output of the
>>>>> various
>>>>> subgroups we will have to strive hard to identify info sufficiently
>>>>> organized into structured data that we will be able to place them as
>>>>> features in our schema.
>>>>> 
>>>>> So we should start looking into the "minimum required set of features
>>>>> that
>>>>> will be obviously needed".
>>>>> 
>>>>> A brief list off the top of my mind:
>>>>> 1) country
>>>>> 2) language
>>>>> 3) document type
>>>>> 4) creation date (for the subtle interpretations of what "creation"
>>>>> actually means, let's consider a separate discussion thread)
>>>>> 5) secondary dates (e.g. version date, view date, etc.)
>>>>> 
>>>> MP: for the WORK level we need also:
>>>> - jurisdiction is another important information different from the
>>>> "country"
>>>> information (e.g., UK case or judgments);
>>> 
>>> I would favor replacing country with jurisdiction entirely, and
>>> casting this feature as a path, with national jurisdictions rooted on
>>> the country. The sub-elements of non-nations (e.g. international
>>> bodies, or virtual jurisdictions representing mere categories, such as
>>> ad hoc arbitration panels) can be expressed in the same fashion. That
>>> is what urn:lex proposes (at section 2.4 of draft 09), and (in terms
>>> of structure, at least) it seems a good approach.
>>> 
>>>> - authority (e.g. Uruguay needs to add "chamber" and "senate" to
>>>> distinguish the same document in two different WORKs);
>>> 
>>> For what it's worth, I would favor using a path-like identifier
>>> to specify issuing bodies with rule-making authority as well, with
>>> committees and the like set as a separate feature.
>>> 
>>>> - number (e.g., Argentina uses only the unique number of bill for all the
>>>> legislative process without date because the number is enough to
>>>> unequivocally identify the document).
>>>> 
>>>> I totally agree that about "creation" date definition we need a separate
>>>> thread.
>>>> 
>>>> For now let me say that the "creation date" in legal domain could be:
>>>> "the date when a document assumes legal meaning in a given workflow step
>>>> according to a specific legal system regulation and following the rules
>>>> of
>>>> procedure defined by the authority emitting the document".
>>>> 
>>>> This means, for instance, but is not limited to:
>>>> - the date of promulgation by the president for the acts in a Republic
>>>> form
>>>> of government
>>>> - the date of assent of the Queen for the acts where we have the
>>>> - the date of admission of a draft-proposal-bill as a BILL in the
>>>> parliament
>>>> process
>>>> - the date of emanation of a judgment
>>>> - the date of publication of the order of a day
>>>> - the date of creation of a generic document.
>>>> 
>>>>> CT says:
>>>>>> 
>>>>>> If you want to create a URI for this item of legislation as it stood at
>>>>>> a
>>>>>> particular point in time, e.g. "decree of 1st May 2015" as it stood on
>>>>>> 2015-05-20, you end up with two date values in your URI the different
>>>>>> date
>>>>>> formats are signifying this difference.
>>>>> 
>>>>> 
>>>>> Changing the syntax to distinguish between types of dates is a working
>>>>> approach. I don't like it, but I can see it working. Nonetheless, if we
>>>>> find
>>>>> a better approach, I would prefer it. For instance, Akoma Ntoso uses
>>>>> the
>>>>> positional difference and special separators to obtain the same result.
>>>>> 
>>>>> GCV says:
>>>>>> 
>>>>>> I see {year} all the time. For example
>>>>>> /{jurisdiction}/{year}/{sessionNum}/... I believe that notion of {year}
>>>>>> is in
>>>>>> keeping with ELI's notion of separating year, month, and day. I don't
>>>>>> have a
>>>>>> use case that extends beyond {year} in a legislative context.
>>>>> 
>>>>> 
>>>>> In a parsable syntax, we would KNOW that a date is a date, and would be
>>>>> able to correctly identify 2015 as a year-only date, 2015-05 as a
>>>>> month+year
>>>>> date, and 2015-05-24 as a full date. There are THREE cases, not two
>>>>> thousand. Can we expect the software to handle three types of date
>>>>> specifications?
>>>>> 
>>>>> So if I see something like /{jurisdiction}/2015/{sessionNum}/ I can
>>>>> understand that 2015 is a date, while if I see something like
>>>>> /{jurisdiction}/2015-05-24/123/ I can understand that 2015-05-24 is a
>>>>> full
>>>>> date. It CAN be done.
>>>>> 
>>>>>> Perhaps there are two notions of time that we must separate. The first
>>>>>> is
>>>>>> a construct that is purely for organizing information in some logical
>>>>>> structure. The second is its use in defining a point-in-time context.
>>>>>> The
>>>>>> structure {year}[/{month}[/{day}]] is useful when organizing
>>>>>> information --
>>>>>> not so much for establishing the temporal context.
>>>>> 
>>>>> I don't see what 2015/05 can give you that 2015-05 can't, and see
>>>>> plenty
>>>>> of troubles that 2015/05 would bring (again, because of parsability) and
>>>>> 2015-05 would avoid.
>>>>> 
>>>>> GCV says:
>>>>>> 
>>>>>> Yeah, yeah, yeah. I think I live in that state too! ;-)
>>>>> 
>>>>> Ok, thanks. :-) So we agree that national language is a requirement even
>>>>> in situations where there is only ONE official language?
>>>>> 
>>>>> Let me confess that I sometimes fear that English-speaking countries
>>>>> tend
>>>>> to consider other languages, and countries with more than one language,
>>>>> as
>>>>> nuisances that would be better off renouncing their language and
>>>>> just
>>>>> adopting English.
>>>>> 
>>>>> GCV says:
>>>>> 
>>>>>> I brought this up as I have seen use cases, with bills, in which there
>>>>>> are multiple versions within a single day. These are corrections that
>>>>>> are
>>>>>> published after the initial version was published with an error. The
>>>>>> corrected reprint gets a new version.
>>>>> 
>>>>> Ok. This is a point on which Akoma Ntoso falls short. We SHALL need to
>>>>> have both the date AND the version identifier in expressions. Good to
>>>>> know.
>>>>> 
>>>>> Ciao
>>>>> 
>>>>> Fabio
>>>>> 
>>>>> --
>>>>> 
>>>>> 
>>>>> On 20/mag/2015, at 13:17, "Tabone, Catherine"
>>>>> <Catherine.Tabone@nationalarchives.gsi.gov.uk> wrote:
>>>>> 
>>>>>> Hi everyone,
>>>>>>   A few additional comments from me, marked CT:
>>>>>>   Regards,
>>>>>> 
>>>>>> Catherine
>>>>>>   From: legalcitem-technical@lists.oasis-open.org
>>>>>> [mailto:legalcitem-technical@lists.oasis-open.org] On Behalf Of Grant
>>>>>> Vergottini
>>>>>> Sent: 18 May 2015 20:13
>>>>>> To: Fabio Vitali
>>>>>> Cc: Daniel LUPESCU; legalcitem-technical@lists.oasis-open.org
>>>>>> Subject: Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso
>>>>>> (AKN),
>>>>>> and LLL (USLM) referencing schemes.
>>>>>>   My thoughts below at GCV:
>>>>>>     2015-05-18 9:45 GMT-07:00 Fabio Vitali <fvitali@gmail.com>:
>>>>>> Dear all,
>>>>>> 
>>>>>> sorry for my silence. Since some subcommittees are starting to deliver
>>>>>> documents on which we can begin our work, it is now time to rekindle
>>>>>> our
>>>>>> activities and start meeting again with some regularity.
>>>>>> 
>>>>>> I'll rehash Daniel's answers to Grant's document, from this January, for the
>>>>>> discussion to start again. My comments are inline. Meet you then in ten
>>>>>> days
>>>>>> from now.
>>>>>> 
>>>>>> Ciao
>>>>>> 
>>>>>> Fabio
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> On 19/gen/2015, at 15:07, Daniel LUPESCU <dlupescu@sedona.fr> wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> To prepare the next TC meeting, here are a few thoughts on Grant's
>>>>>>> comparison:
>>>>>>> 
>>>>>>> - FRBR WORK:
>>>>>>>      - ELI and USLM both have an open hierarchy, whereas AKN has a
>>>>>>> fixed
>>>>>>> hierarchy.
>>>>>> 
>>>>>> What do you mean by "open hierarchy"?
>>>>>> 
>>>>>> AKN has a specific order in features, so as to allow reliable parsing.
>>>>>> So
>>>>>> far we haven't found instances where this is a problem.
>>>>>> 
>>>>>>>          - Is it better to force a syntax or to leave it open? Both
>>>>>>> approaches have pros and cons.
>>>>>> 
>>>>>> Ok, let's try to list pros and cons, then.
>>>>>> 
>>>>>> On my part, rigorous parseable syntax is a necessary condition for true
>>>>>> internationalization, for it would allow people who've never seen a
>>>>>> reference from Gambia or Bolivia or Bhutan to understand at least a
>>>>>> little
>>>>>> what kind of document that is. I think that this is an important
>>>>>> requirement.
>>>>>> 
>>>>>> I don't have cons to list.
>>>>>>   GCV: I agree that a parseable structured syntax is better than the
>>>>>> open
>>>>>> one of ELI. The benefit of an open syntax is that any unforeseen use
>>>>>> case
>>>>>> can be handled - the downside is how you choose to handle it is
>>>>>> unspecified.
>>>>>> I'm guessing that the team behind ELI basically agreed to disagree on a
>>>>>> format and the result was a minimum of structure -- which also came
>>>>>> with a
>>>>>> minimum of usefulness.
>>>>>> 
>>>>>> 
>>>>>> CT: Personally I see pros and cons for each approach. A fixed pattern
>>>>>> is
>>>>>> easier to understand and parse without too much work as the pattern is
>>>>>> always the same. The problem is that there will inevitably be
>>>>>> legislation
>>>>>> that doesn't fit into the syntax pattern which means it either won't be
>>>>>> implemented at all or it will be implemented in a non-standard way
>>>>>> which
>>>>>> undermines the parsability. The more flexible structure is easier to
>>>>>> bend to
>>>>>> fit all the weird and wonderful examples of necessary legislation
>>>>>> citation.
>>>>>> Its disadvantage is that it is harder to reuse.
>>>>>> 
>>>>>> Additionally, a couple of things to put the ELI approach into context:
>>>>>> 
>>>>>> The features that uniquely identify an item of legislation are very
>>>>>> different in different EU countries, e.g. some use a legislation type,
>>>>>> year
>>>>>> and series number, some use a signature or enactment date and
>>>>>> legislation
>>>>>> type, some use type, series number and parliamentary session, others
>>>>>> switch
>>>>>> schemes at some historical point. It proved impossible to find a
>>>>>> standard
>>>>>> approach as values essential to one country were completely irrelevant
>>>>>> to
>>>>>> another. ELI is designed not as an abstract identifier but to be the
>>>>>> URI
>>>>>> used to access the legislation item on an official publisher's website
>>>>>> for
>>>>>> each EU country. Most countries already have a website and are not in a
>>>>>> position to be able to create a new one from scratch. In order for them
>>>>>> to
>>>>>> implement ELI it needed to be flexible enough to fit in with existing
>>>>>> architecture with minimum alteration.
>>>>>> 
>>>>>> Although ELI is flexible the expectation is that each official
>>>>>> legislation publisher decides on the fixed variation they will use and
>>>>>> then
>>>>>> lodges their URI scheme in the ELI register (countries are still in the
>>>>>> process of implementation and the ELI register website is still being
>>>>>> set-up
>>>>>> so unfortunately there's nothing to look at yet). This means that if
>>>>>> you
>>>>>> wanted to parse ELI URIs you could do this but you would need to deal
>>>>>> with
>>>>>> the URI scheme for each country separately. This appears as more work
>>>>>> but I
>>>>>> wonder if it's actually equivalent to what you need to do to get an AKN
>>>>>> resolver to work? You do the work at a different point but essentially
>>>>>> you
>>>>>> are still trying to get the reference to resolve to the actual web
>>>>>> page.
>>>>>> 
>>>>>> 
>>>>>>>      - Date: USLM and AKN both use the format YYYY-MM-DD (or only
>>>>>>> YYYY),
>>>>>>> whereas ELI uses /YYYY/MM/DD.
>>>>>> 
>>>>>> As far as I can tell, ELI uses both YYYY/MM/DD and YYYYMMDD without
>>>>>> separators. The inconsistency bothers me somewhat, I must confess.
>>>>>>   GCV: Bothers me too.
>>>>>>   CT: The reason for the different date formats is to allow for the
>>>>>> fact
>>>>>> that some countries may need two dates in the URI. In a significant
>>>>>> number
>>>>>> of European countries (although not the UK) the identifier for
>>>>>> legislation
>>>>>> is the signature date, i.e. it is the "decree of 1st May 2015" there is
>>>>>> no
>>>>>> series number. If you want to create a URI for this item of legislation
>>>>>> as
>>>>>> it stood at a particular point in time, e.g. "decree of 1st May 2015"
>>>>>> as it
>>>>>> stood on 2015-05-20, you end up with two date values in your URI the
>>>>>> different date formats are signifying this difference. It's also to
>>>>>> accommodate people who want to use the URI as a hackable search string
>>>>>> e.g.
>>>>>> decree/2015/05/01 is the specific document, decree/2015/05 is all the
>>>>>> documents in May, decree/2015 all documents in 2015.
>>>>>> 
>>>>>> 
>>>>>> As for AKN, the choice of YYYY-MM-DD comes from the date data format of
>>>>>> XML schema [1] and relax NG [2].
>>>>>> 
>>>>>>>          - ELI's format seems more powerful as it could potentially
>>>>>>> allow to retrieve all documents of a specific month using:
>>>>>>> /eli/{jurisdiction}/year/month
>>>>>> 
>>>>>> use case for this?
>>>>>>   GCV: I see {year} all the time. For example
>>>>>> /{jurisdiction}/{year}/{sessionNum}/... I believe that notion of {year}
>>>>>> is in
>>>>>> keeping with ELI's notion of separating year, month, and day. I don't
>>>>>> have a
>>>>>> use case that extends beyond {year} in a legislative context.
>>>>>>   GCV: Perhaps there are two notions of time that we must separate.
>>>>>> The
>>>>>> first is a construct that is purely for organizing information in some
>>>>>> logical structure. The second is its use in defining a point-in-time
>>>>>> context. The structure {year}[/{month}[/{day}]] is useful when
>>>>>> organizing
>>>>>> information -- not so much for establishing the temporal context.
>>>>>> 
>>>>>>> - FRBR EXPRESSION:
>>>>>>>      - Language:
>>>>>>>          - USLM does not have any provisions for language, but it
>>>>>>> seems
>>>>>>> obvious that the language is needed.
>>>>>> 
>>>>>> I agree this is an issue. Not my thing, but I can foresee a future not
>>>>>> so
>>>>>> far where at least some states adopt Spanish as another official language
>>>>>> with
>>>>>> English.
>>>>>>   GCV: Yeah, yeah, yeah. I think I live in that state too! ;-)
>>>>>> 
>>>>>>>          - Both AKN and ELI use a three-letter code according to ISO
>>>>>>> 639-3
>>>>>>>      - Point-in-time Date and version: USLM uses
>>>>>>> @{version|YYYY-MM-DD},
>>>>>>> AKN uses /@{version|YYYY-MM-DD}, ELI uses /YYYYMMDD/version
>>>>>>>          - ELI's format seems more powerful as both Point-in-time
>>>>>>> Date
>>>>>>> and Version can be present
>>>>>> 
>>>>>> Can you elaborate on this?
>>>>>> 
>>>>>> Furthermore, AKN also has the view date, which allows for the
>>>>>> specification of a version date for references whose exact version date
>>>>>> you
>>>>>> do not know.
>>>>>> 
>>>>>> /us/act/2010/124Stat119/en@2010-01-24 is a reference to the version
>>>>>> that
>>>>>> entered in force on 2010-01-24, while
>>>>>> 
>>>>>> /us/act/2010/124Stat119/en!2015-05-18 is a reference to the version
>>>>>> that
>>>>>> is in force today, regardless of when it was approved.
>>>>>>   GCV: I brought this up as I have seen use cases, with bills, in
>>>>>> which
>>>>>> there are multiple versions within a single day. These are corrections
>>>>>> that
>>>>>> are published after the initial version was published with an error.
>>>>>> The
>>>>>> corrected reprint gets a new version.
>>>>>> 
>>>>>>>          - Both AKN and USLM introduce the '@' character whereas ELI
>>>>>>> uses
>>>>>>> '/' as the only delimiter.
>>>>>>>              I find it simpler to have just 1 delimiter, especially
>>>>>>> as
>>>>>>> it is not common to find the '@' character in URLs, but again both
>>>>>>> approaches have pros and cons
>>>>>> 
>>>>>> The basic idea of choosing different separators is to allow parsing. By
>>>>>> using only ONE separator, you cannot allow for optional elements and
>>>>>> still
>>>>>> keep automatic parseability.
>>>>>> 
>>>>>> GCV: I agree that having different delimiters is better. Otherwise, you
>>>>>> have a set of delimited values and it's difficult, without applying
>>>>>> some
>>>>>> other probably unreliable heuristic, to know what is what.
>>>>>> 
>>>>>>>      - Portions: same remark about the '~' character as above
>>>>>> 
>>>>>> And same issue for parseability.
>>>>>> 
>>>>>>> - FRBR MANIFESTATION:
>>>>>>>      - ELI does handle manifestation level.
>>>>>>>      Existing examples :
>>>>>>> 
>>>>>>> 
>>>>>>> http://www.legifrance.gouv.fr/eli/loi/2011/12/29/ETSX1119227L/jo/texte/fr/pdf
>>>>>>> 
>>>>>>> 
>>>>>>> http://www.legifrance.gouv.fr/eli/loi/2011/12/29/ETSX1119227L/jo/texte/fr/rtf
>>>>>>> 
>>>>>>> http://www.legifrance.gouv.fr/eli/loi/2011/12/29/ETSX1119227L/jo/texte
>>>>>>> (no
>>>>>>> format means /html)
>>>>>> 
>>>>>> Ok. Thanks.
>>>>>> 
>>>>>> I believe there is ONE open issue that is fundamental:
>>>>>> 
>>>>>> 1) Is automatic parseability important? I strongly vote for yes
>>>>>> 
>>>>>> GCV: I vote yes too.
>>>>>>   This gives rise to a few dependent issues:
>>>>>> 
>>>>>> If parseability is NOT important, then
>>>>>> 
>>>>>> 2a) why is a common syntax useful? I mean, wouldn't a lookup table just
>>>>>> suffice for our purposes?
>>>>>> 
>>>>>> If parseability is important, then
>>>>>> 
>>>>>> 2b) are there other ways to parse references without enforcing a fixed
>>>>>> order and a variety of separators?
>>>>>>   GCV: I think that parseability is something that ELI sacrificed in
>>>>>> order to come to some agreement, with a substantial loss of usefulness.
>>>>>> I
>>>>>> believe that one of our core goals should be to establish a scheme that
>>>>>> allows for unambiguous parsing of key information, while still allowing
>>>>>> for
>>>>>> the widest range of possible use cases. Deciding what is the key
>>>>>> information
>>>>>> should be part of our focus.
>>>>>>   CT: I think there is a significant difference in context here. There
>>>>>> was no requirement to parse ELI URIs. If ELI has been implemented it's
>>>>>> the
>>>>>> http ELI identifier that returns the document on the national
>>>>>> legislation
>>>>>> website so there's no need for parsing for any additional resolution.
>>>>>> ELI
>>>>>> was designed as a common official URI scheme it didn't really consider
>>>>>> citations or links for use on other legal websites. However, saying
>>>>>> that I
>>>>>> do think that parseability would be useful. I wonder if it would be
>>>>>> possible
>>>>>> to allow greater flexibility than a universally fixed scheme by
>>>>>> allowing
>>>>>> some custom values and some kind of embedded statement or declaration
>>>>>> about
>>>>>> what's being used?
>>>>>> 
>>>>>> Thanks and ciao
>>>>>> 
>>>>>> Fabio
>>>>>> 
>>>>>> [1] http://www.w3.org/TR/xmlschema-2/#date
>>>>>> [2]
>>>>>> 
>>>>>> https://www.safaribooksonline.com/library/view/relax-ng/0596004214/re91.html
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> 
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> Daniel Lupescu
>>>>>>> 
>>>>>>> SEDONA
>>>>>>> 10 Place de la Madeleine 75008 Paris
>>>>>>> Tel: 01 83 64 51 61
>>>>>>> dlupescu@sedona.fr
>>>>>>> 
>>>>>>> From: "Grant Vergottini" <grant.vergottini@xcential.com>
>>>>>>> To: legalcitem-technical@lists.oasis-open.org, "Fabio Vitali"
>>>>>>> <fabio@cs.unibo.it>
>>>>>>> Sent: Mercredi 7 Janvier 2015 20:44:39
>>>>>>> Subject: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN),
>>>>>>> and LLL (USLM) referencing schemes.
>>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> Below is my comparison of the three URL based referencing schemes.
>>>>>>> They
>>>>>>> are
>>>>>>> truly very similar. I took a stab at the ELI scheme despite not having
>>>>>>> too much
>>>>>>> familiarity with it. Please correct as necessary.
>>>>>>> 
>>>>>>> Comparison of ELI, AKN, and LLL (USLM) legislative referencing schemes
>>>>>>> ======================================================================
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> BASIC SYNTAX
>>>>>>> 
>>>>>>> eli:
>>>>>>> 
>>>>>>> /eli[/{jurisdiction}][/{agent}][/{subAgent}][/{year}][/{month}][/{day}][/{type}][/{naturalIdentifier}][/{level…}][/{pointInTime}][/{version}][/{lang}]
>>>>>>> akn:
>>>>>>> 
>>>>>>> /akn/{jurisdiction}/{docType}/{docNum}[/{docDate}][[{lang}]@[{versionOrPointInTime}]][~{portionId}][/{source}][.{format}]
>>>>>>> uslm:
>>>>>>> 
>>>>>>> /uslm/{jurisdiction}/{docSet...}/{docName}[{level...}][@{versionOrPointInTime}][~{portionId}][.{format}]
>>>>>>> 
>>>>>>> Where:
>>>>>>> curly braces denote variable fields
>>>>>>> square brackets denote optional parts
>>>>>>> an ellipsis at the end of text in curly braces denotes 1 or more
>>>>>>> occurrences
>>>>>>> 
>>>>>>> Comments:
>>>>>>>      All elements in ELI are optional as is their order.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> STARTING MATTER
>>>>>>> 
>>>>>>> eli:  /eli/{jurisdiction}
>>>>>>> akn:  /akn/{jurisdiction}
>>>>>>> uslm: /uslm/{jurisdiction}
>>>>>>> 
>>>>>>> All three schemes essentially take the same approach for the initial
>>>>>>> part of the URL, differing only in the scheme name.
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> FRBR ORGANIZATION
>>>>>>> 
>>>>>>> For the sake of organizational discussion, and taking the lead from
>>>>>>> Akoma Ntoso, the basic syntax after the starting matter is being
>>>>>>> divided into three parts, borrowing the FRBR concepts of WORK,
>>>>>>> EXPRESSION, and MANIFESTATION.
>>>>>>> 
>>>>>>> While not a perfect match, all three schemes are being coerced to
>>>>>>> follow this model as follows
>>>>>>> 
>>>>>>> eli:
>>>>>>>    Work:
>>>>>>> 
>>>>>>> [/{agent}][/{subAgent}][/{year}][/{month}][/{day}][/{type}][/{naturalIdentifier}]
>>>>>>>    Expression: [/{level...}][/{pointInTime}][/{version}][/{lang}]
>>>>>>>    Manifestation:
>>>>>>> akn:
>>>>>>>    Work: /{docType}/{docNum}[/{docDate}]
>>>>>>>    Expression: [[{lang}]@[{versionOrPointInTime}]][~{portionId}]
>>>>>>>    Manifestation: [/{publisher}][.{format}]
>>>>>>> uslm:
>>>>>>>    Work: /{docSet...}/{docName}
>>>>>>>    Expression: [{level...}][@{versionOrPointInTime}][~{portionId}]
>>>>>>>    Manifestation: [.{format}][?source={publisher}]
>>>>>>> 
>>>>>>> Comments:
>>>>>>>     Does ELI handle the manifestation level?
>>>>>>>     The {level...} and the {portionId} specification (discussed
>>>>>>> below)
>>>>>>> are considered as part of the expression as they are time variant.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> THE WORK
>>>>>>> 
>>>>>>> eli:
>>>>>>> 
>>>>>>> [/{agent}][/{subAgent}][/{year}][/{month}][/{day}][/{type}][/{naturalIdentifier}]
>>>>>>> akn:  /{docType}/{docNum}[/{docDate}]
>>>>>>> uslm: /{docSet...}/{docName}
>>>>>>> 
>>>>>>> Path to document:
>>>>>>>     All three schemes can be seen to be defining a path to a document
>>>>>>> resource.
>>>>>>> 
>>>>>>>     ELI defines seven named parameters which can be used, presumably
>>>>>>> in
>>>>>>> any hierarchy.
>>>>>>> 
>>>>>>>     Akoma Ntoso defines a fixed hierarchy based on document type,
>>>>>>> number, and date. USLM defines an open hierarchy.
>>>>>>> 
>>>>>>> Document Name:
>>>>>>>     ELI describes a natural identifier as the name of the document.
>>>>>>> 
>>>>>>>     Akoma Ntoso partitions out the document name, number, and date.
>>>>>>> 
>>>>>>>     USLM defines a docName that is apparently analogous to ELI's
>>>>>>> natural identifier.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> THE EXPRESSION
>>>>>>> 
>>>>>>> eli:  [/{level...}][/{pointInTime}][/{version}][/{lang}]
>>>>>>> akn:  [[{lang}]@[{versionOrPointInTime}]][~{portionId}]
>>>>>>> uslm: [{level...}][@{versionOrPointInTime}][~{portionId}]
>>>>>>> 
>>>>>>> Language:
>>>>>>>     ELI permits a hierarchical level corresponding to the language.
>>>>>>> 
>>>>>>>     Akoma Ntoso encodes the lang to precede the '@' character.
>>>>>>> 
>>>>>>>     USLM does not have any provisions for language - English is the
>>>>>>> sole language for US legislation.
>>>>>>> 
>>>>>>> Point-in-time Date:
>>>>>>>     ELI permits a hierarchical level corresponding to the
>>>>>>> point-in-time date.
>>>>>>> 
>>>>>>>     Akoma Ntoso encodes the point-in-time date to follow the '@'
>>>>>>> character.
>>>>>>> 
>>>>>>>     USLM encodes the point-in-time date to follow the '@' character.
>>>>>>> 
>>>>>>>     Note: There is a subtle difference between Akoma Ntoso and USLM
>>>>>>> when there is no language specification. Akoma Ntoso would express a
>>>>>>> simple URI as {work}/@{point-in-time} while USLM would express the
>>>>>>> same as {work}@{point-in-time}. Note the missing '/' in the USLM
>>>>>>> notation. This is because USLM does not have any provision for the
>>>>>>> {lang} prior to the '@' character -- or in any other place, for that
>>>>>>> matter.
>>>>>>> 
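A minimal sketch of the difference described in the note above, written as two small Python helpers. The function names and the sample work paths are illustrative assumptions, not taken from either specification; only the placement of the '/' and '@' follows the templates quoted above.

```python
def akn_expression(work, point_in_time, lang=""):
    # Akoma Ntoso template above: {work}/[{lang}]@[{versionOrPointInTime}]
    # The '/' is kept even when no language is given.
    return f"{work}/{lang}@{point_in_time}"


def uslm_expression(work, point_in_time):
    # USLM template above: {work}@{versionOrPointInTime}
    # There is no language slot, so there is no '/' before the '@'.
    return f"{work}@{point_in_time}"


# Hypothetical work paths, for illustration only.
print(akn_expression("/act/12", "2015-05-26"))
print(akn_expression("/act/12", "2015-05-26", lang="eng"))
print(uslm_expression("/uslm/us/usc/t100", "2015-05-26"))
```

The same work and date thus yield `/act/12/@2015-05-26` in the Akoma Ntoso style and `...@2015-05-26` directly after the work in the USLM style.
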
>>>>>>> Version:
>>>>>>>     ELI permits a version number as a hierarchical level.
>>>>>>> 
>>>>>>>     Akoma Ntoso permits a version number in place of a version date
>>>>>>> following an '@'. (Fabio???)
>>>>>>> 
>>>>>>>     USLM permits a version number in place of a version date
>>>>>>> following an '@'.
>>>>>>> 
>>>>>>> Portions:
>>>>>>>     ELI permits a document portion to be expressed as levels within
>>>>>>> the natural hierarchy of the reference.
>>>>>>> 
>>>>>>>     Akoma Ntoso permits a portion to be specified in the form of an
>>>>>>> identifier query following a tilde '~'. In Akoma Ntoso, identifiers
>>>>>>> (specifically the @wId and @eId attributes) are expressed as a
>>>>>>> pseudo-hierarchy following a strictly defined nomenclature. Double
>>>>>>> underscores '__' denote the hierarchical levels. Level prefixes are
>>>>>>> prescribed.
>>>>>>> 
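As a rough illustration of the tilde and double-underscore convention described above, the following Python sketch separates the portion identifier from a reference and splits it into its levels. The helper name and the sample identifier `chp_2__sec_3` are assumptions chosen for the example; the sketch does not validate the prescribed level prefixes.

```python
def split_akn_portion(reference):
    """Return (base reference, list of portion levels).

    Assumes at most one '~' in the reference and '__' between levels.
    """
    base, tilde, portion_id = reference.partition("~")
    levels = portion_id.split("__") if tilde and portion_id else []
    return base, levels


# Hypothetical reference, for illustration only.
base, levels = split_akn_portion("/act/12/eng@2015-05-26~chp_2__sec_3")
print(base)
print(levels)
```
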
>>>>>>>     USLM takes a similar approach to ELI, permitting a document
>>>>>>> portion to be expressed as levels within the natural hierarchy of
>>>>>>> the reference. This is because different implementations of the US
>>>>>>> Code partition it into individual documents differently. The value
>>>>>>> of the USLM level hierarchy corresponds exactly to the value of the
>>>>>>> @identifier attribute within the XML -- similar in nature to how the
>>>>>>> Akoma Ntoso ~portionId corresponds to the @eId. Prescribed level
>>>>>>> prefixes are used for "big" levels (section and above) and omitted for
>>>>>>> "small" levels (below the section).
>>>>>>> 
>>>>>>>     USLM also supports the ~{portionId} method, but USLM @id values
>>>>>>> are expressed as GUIDs rather than as meaningful identifiers. Because
>>>>>>> USLM @id values are unreadable and are not stable over time, this
>>>>>>> method is only for short-term use in USLM, and such references should
>>>>>>> never be persisted.
>>>>>>> 
>>>>>>> Ranges:
>>>>>>>     USLM has provisions for ranges within the level hierarchy
>>>>>>> expressed to reference a portion of a document. The basic
>>>>>>> nomenclature is a series of three periods "...". This is used between
>>>>>>> two numbers at a level. For instance sec1...5 is the range of
>>>>>>> sections between and including sections 1 and 5. If the terminating
>>>>>>> number is omitted, that signifies an open end and corresponds to
>>>>>>> "et seq." in a citation. Complex non-contiguous ranges are expressed
>>>>>>> as a sequence of references. There is no provision for a sequence
>>>>>>> which starts at one level and ends at a different level -- this is
>>>>>>> proving problematic.
>>>>>>> 
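The range notation above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the level prefix is alphabetic, the designations are plain integers (real section numbers may carry letters), and the function name is hypothetical.

```python
import re

# prefix, start number, literal "...", optional end number
_RANGE = re.compile(r"^(?P<prefix>[A-Za-z]*)(?P<start>\d+)\.\.\.(?P<end>\d*)$")


def parse_uslm_range(level):
    """Return (prefix, start, end) for a range level, or None if not a range.

    end is None for an open-ended range (the "et seq." case).
    """
    match = _RANGE.match(level)
    if match is None:
        return None
    end = int(match.group("end")) if match.group("end") else None
    return match.group("prefix"), int(match.group("start")), end


print(parse_uslm_range("sec1...5"))   # closed range
print(parse_uslm_range("sec3..."))    # open-ended range
print(parse_uslm_range("sec7"))       # not a range
```

Because both ends share one prefix, a sketch like this cannot represent a range that starts at one level and ends at another, which mirrors the limitation noted above.
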
>>>>>>> Disambiguation:
>>>>>>>     USLM has provisions for disambiguating duplicate numbers. It is
>>>>>>> quite common, and unfortunate, that provisions are often misnumbered
>>>>>>> and duplicate numbering accidentally occurs. For instance, it is
>>>>>>> possible that two unrelated sections 1234 come into existence and are
>>>>>>> simultaneously both effective law. In citations, this is handled
>>>>>>> through "qualifying" or "disambiguating" language such as "as added
>>>>>>> by..." or "as amended by...". USLM allows this reference to the
>>>>>>> originating or amending provision to be encoded within square
>>>>>>> brackets, following the level in the hierarchy that contains the
>>>>>>> ambiguity. For instance:
>>>>>>> 
>>>>>>> 
>>>>>>> /uslm/us/usc/t100/div4/chap5[/uslm/us/usc/pl/2014/200/sec3]/art1
>>>>>>> 
>>>>>>>     indicates that there are duplicate chapter 5s and that the one
>>>>>>> created or modified by /uslm/us/usc/pl/2014/200/sec3 is the one being
>>>>>>> referred to.
>>>>>>> 
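To make the bracket notation concrete, here is a small Python sketch that pulls the qualifying reference out of the example given above. It assumes a single, non-nested bracketed qualifier per reference; the function name is hypothetical.

```python
def split_disambiguated(reference):
    """Split a reference at its first [...] qualifier.

    Returns (ambiguous path, qualifying reference or None, remainder).
    """
    open_at = reference.find("[")
    close_at = reference.find("]", open_at + 1)
    if open_at == -1 or close_at == -1:
        return reference, None, ""
    return (
        reference[:open_at],
        reference[open_at + 1:close_at],
        reference[close_at + 1:],
    )


ref = "/uslm/us/usc/t100/div4/chap5[/uslm/us/usc/pl/2014/200/sec3]/art1"
print(split_disambiguated(ref))
```
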
>>>>>>> 
>>>>>>> 
>>>>>>> -------------------------------------------------------------------------------
>>>>>>> THE MANIFESTATION
>>>>>>> 
>>>>>>> eli:  ???
>>>>>>> akn:  [/{publisher}][.{format}]
>>>>>>> uslm: [.{format}][?source={publisher}]
>>>>>>> 
>>>>>>>     ELI does not appear to deal with the manifestation level. ???
>>>>>>> 
>>>>>>>     Akoma Ntoso allows the publisher and the file format to be
>>>>>>> specified. The file format is always at the end of the reference and
>>>>>>> appears as a normal file extension -- permitting 3 or 4 characters.
>>>>>>> 
>>>>>>>     USLM allows the file format to be specified. Similar to Akoma
>>>>>>> Ntoso, the file format is always at the end of the reference and
>>>>>>> appears as a normal file extension. USLM has no provisions to specify
>>>>>>> the data source or publisher within the USLM schema. Instead the
>>>>>>> ?source query parameter has been used to provide that function.
>>>>>>> 
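A possible way to read the USLM manifestation part, sketched with Python's standard library. The sample reference, the `example` source value, and the function name are assumptions for illustration. Note that this naive extension split would misread a reference whose last level is a "..." range, so it is only meant for references without ranges.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def uslm_manifestation(reference):
    """Return (reference without format, format or None, source or None)."""
    parts = urlsplit(reference)
    stem, ext = posixpath.splitext(parts.path)
    source = parse_qs(parts.query).get("source", [None])[0]
    return stem, ext.lstrip(".") or None, source


# Hypothetical reference, for illustration only.
print(uslm_manifestation("/uslm/us/usc/t100/sec5.xml?source=example"))
```
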
>>>>>>> 
>>>>>>> -- Grant
>>>>>>> ____________________________________________________________________
>>>>>>> Grant Vergottini
>>>>>>> Xcential Group, LLC.
>>>>>>> email: grant.vergottini@xcential.com
>>>>>>> phone: 858.361.6738
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe from this mail list, you must leave the OASIS TC that
>>>>>>> generates this mail.  Follow this link to all your TCs in OASIS at:
>>>>>>> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>> 
>>>> --
>>>> ===================================
>>>> Associate professor of Legal Informatics
>>>> School of Law
>>>> Alma Mater Studiorum Università di Bologna
>>>> C.I.R.S.F.I.D. http://www.cirsfid.unibo.it/
>>>> Palazzo Dal Monte Gaudenzi - Via Galliera, 3
>>>> I - 40121 BOLOGNA (ITALY)
>>>> Tel +39 051 277217
>>>> Fax +39 051 260782
>>>> E-mail  monica.palmirani@unibo.it
>>>> ====================================
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
> 
> 
> 
> 


--

Fabio Vitali                                          The sage and the fool
Dept. of Informatics                                     go to their graves
Univ. of Bologna  ITALY                               alike in this respect:
phone:  +39 051 2094872                  both believe the sage to be a fool.
e-mail: fabio@cs.unibo.it                  Where, then, may wisdom be found?
http://vitali.web.cs.unibo.it/   Qi, "Neither Yes nor No", The codeless code
References:
- FW: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: "Tabone, Catherine" <Catherine.Tabone@nationalarchives.gsi.gov.uk>
- Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: Fabio Vitali <fvitali@gmail.com>
- Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: monica.palmirani <monica.palmirani@unibo.it>
- Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: Frank Bennett <biercenator@gmail.com>
- Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: monica.palmirani <monica.palmirani@unibo.it>
- Re: [legalcitem-technical] Comparison of ELI, Akoma Ntoso (AKN), and LLL (USLM) referencing schemes.
  - From: Frank Bennett <biercenator@gmail.com>