legaldocml-comment message



Subject: Re: [legaldocml-comment] Comment on Akoma Ntoso Naming Convention


Rather than embedding further comments, I'll copy here:

>[FV] We stand by our choice of using for IRI references those terms and productions that have been used for URI references and that can be used for IRI references unchanged.

I completely agree with your choice of IRIs throughout. I only wished to emphasize that your citations should therefore be primarily to RFC 3987, as well as this being an additional justification for not using the terminology of "absolute-path reference" in ANNC.

>[FV] We introduced and relied from the ground up on FRBR exactly for this reason. FRBR is VERY IMPORTANT for the Akoma Ntoso Naming Convention.

I don't see how the use of FRBR affects the argument about resolution of IRI references. An internet resource, using "resource" in the sense of RDF, may denote an abstraction or something physical. In either case, there is no difference in how IRI references to it are handled. And in the case of global IRI references in an Akoma Ntoso document, every one is resolved in the same way - by prepending a particular scheme (http://) and authority (which varies from one provider to another, but is the same for all global IRI references in that document).

> [FV] Akoma Ntoso IS NOT INTERESTED in having direct resolution of IRI references to physical IRIs

I don't know what you mean by "physical IRIs".

> [FV] /akn/country/act/X/language@/og.xml

Let me see if I have this straight. Since you say that only the absolute IRI is the identifier, then it is possible for different resolvers to refer to quite different things for the same global IRI reference, correct? So there could be two different IRIs with the same path component
http://example1.org/akn/country/act/X/language@/og.xml
http://example2.org/akn/country/act/X/language@/og.xml

with quite different content. There seems to be an assumption that by some mechanism, there will be designated one or more domains which are guaranteed to have authoritative content for a particular path, while for other domains, it is "buyer beware". And that mechanism to designate which domains may be trusted for which paths is outside the scope of ANNC. Is this correct?
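
To make this concrete, here is a minimal Python sketch using the standard library's urllib.parse. Only the reference string and the two host names come from the example above; the base IRIs (including their paths) are made up for illustration. It resolves the same global IRI reference against two bases and shows that the results share a path component but differ in authority.

```python
from urllib.parse import urljoin, urlsplit

# Global IRI reference as it might appear inside an Akoma Ntoso document.
reference = "/akn/country/act/X/language@/og.xml"

# Hypothetical base IRIs: the paths after the host are invented for this sketch.
bases = [
    "http://example1.org/some/doc.xml",
    "http://example2.org/other/doc.xml",
]

# Standard reference resolution: scheme and authority are taken from the base.
resolved = [urljoin(base, reference) for base in bases]
for iri in resolved:
    print(iri)

# Same path component, different authority components.
paths = {urlsplit(iri).path for iri in resolved}
authorities = {urlsplit(iri).netloc for iri in resolved}
```

Whether the two resulting IRIs denote the same content is not something the path alone can tell you, which is the question being asked here.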

Assuming I have the correct idea now, what is missing for me in the document is guidance (which could be informative rather than normative) on how to refer to Akoma Ntoso documents, or parts thereof, externally - from non-Akoma Ntoso documents, that are not served from the same domain as the AKN document. As you say, an absolute IRI is required, so in this case the domain must be presented somehow and expressed in the reference. In the example you provided of the resolver at http://akresolver.cs.unibo.it/, one of the demos there results in the absolute IRI

http://akresolver.cs.unibo.it/akn/uk/act/pga/2014/2/eng@2014-01-30

Neither the rendered page nor the source of that page shows me the Akoma Ntoso XML, but my understanding is that if this were completed, then the resulting absolute IRI would denote some piece of Akoma Ntoso XML (as a manifestation), whereas the original IRI denotes the expression.

If I have all this straight, then I would suggest that something like the following be added to the document for clarification purposes, most logically in the Scope section.

"The scope of this specification is the syntax and semantics of the path components of IRIs that refer to Akoma Ntoso concepts and resources on the internet. In keeping with best practices of the Web, the identifier of an Akoma Ntoso concept or resource is always an absolute IRI. Although the authority portion of such an IRI conveys significant information concerning the provenance of the concept or resource, the details of this are outside of the scope of the ANNC specification."

However, the following paragraph was/is still particularly confusing to me.
"Since it is a requirement of Akoma Ntoso that all existing FRBR items of a Manifestation be byte-per-byte identical to each other, it is a natural consequence that it is not abstractly relevant which resolution engine dereferences the actual Item whose IRI is resolved out of a Work-level, an Expression-level, or a Manifestation-level IRI reference. This, in practice, means that protocol and authority are, in resolution, not contributing information, and are thus interchangeable."

First, I didn't see anywhere (else) the requirement stated "all existing FRBR items of a Manifestation be byte-per-byte identical to each other".

When you talk about all existing FRBR items of a Manifestation, I think of "a Manifestation" being identified by an absolute IRI, so a different absolute IRI having the same path component could identify a different Manifestation (of the same Expression), which would naturally have a different Item as well. So it appears to me that the authority component of the IRI is extremely important. The only way it could be considered irrelevant is if there really is a guarantee of a unique Manifestation for a given path.

In the Manifestation section, I do see

"Therefore, different manifestations of the same expression generated using different data formats correspond to different manifestations and will have different IRIs."
I will assume that "different IRIs" here means "different global IRI references".

This doesn't address the case when different manifestations use the same data format, but, say, are prepared by different authors and so have different content. There seems to be an implication that these authors will coordinate among themselves when they name their manifestations so that there are no name clashes. I don't see how it can be guaranteed there will never be such name clashes. This is exactly the significance of the authority portion of the IRI. Since the dispensing of ownership of domains is handled by a central authority, this serves to distinguish between IRIs that have identical path components.

The last sentence of that paragraph
"Any party interested in absolute IRIs for Akoma Ntoso are required to produce their own resolution engine and use its protocol and authority for the purpose."
seems to be phrased quite poorly. "Any party interested in absolute IRIs for Akoma Ntoso" - so if I simply want to use an absolute IRI to refer to an Akoma Ntoso thing, then I have to build a resolution engine? This can't be what you mean to say. It really comes across as very dismissive of anyone who would care to use (or is even interested in!) the absolute IRIs in an external application. Yet you have agreed that only the absolute IRI is the true identifier.


Tara

On 6/5/15 6:40 AM, Fabio Vitali wrote:
Dear Tara, 

thank you for your comments. Please find here a short discussion of the points you raise. I will definitely welcome your opinion on my answers. 

On 04/giu/2015, at 18:52, Tara Athan <taraathan@gmail.com> wrote:

The document states that International Resource Identifier as per RFC 3987 (http://tools.ietf.org/html/rfc398) is a normative reference. Therefore the document should use the terminology introduced in RFC 3987 according to the definitions provided in RFC 3987.

In Section 4.2, the following is stated:
"According to the authoritative source RFC 3986, all http:// IRI references are divided into absolute IRI and relative IRI references. "

One must be careful here. RFC 3986 defines URIs, not IRIs.

The precise quote from RFC 3987 is

"IRI reference: Denotes the common usage of an Internationalized Resource Identifier. An IRI reference may be absolute or relative. However, the "IRI" that results from such a reference only includes absolute IRIs; any relative IRI references are resolved to their absolute form. "

Based on this definition, there are two cases of IRI references:
1. an IRI reference that is absolute (also called an absolute IRI, or simply IRI)
2. an IRI reference that is relative (also called a relative IRI reference)
RFC 3987 only uses the phrase "absolute IRI" once, for emphasis. Elsewhere, it simply uses IRI.

This is supported by the ABNF rule from 3987

IRI-reference  = IRI / irelative-ref

In contrast, Akoma Ntoso Naming Convention states

"In the following we will call all IRI
references as simply IRI (they are all references, after all), and distinguish between absolute IRIs, global IRIs and local IRIs."

RFC 3986 uses the terminology  "absolute-path reference"/ "relative-path reference" for a relative URI reference that starts/doesn't start with a slash.

RFC 3987 doesn't use the term "absolute-path reference" directly, but does reuse the ABNF rule:
ipath-absolute = "/" [ isegment-nz *( "/" isegment ) ]
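
As a rough illustration of these syntactic categories, the following Python sketch sorts a string into the RFC 3986 reference kinds mentioned here. It is a simplification under stated assumptions: the function name classify_reference is made up for this example, it relies on urllib.parse.urlsplit rather than the full ABNF, and it does not validate the string.

```python
from urllib.parse import urlsplit


def classify_reference(ref: str) -> str:
    """Roughly classify an IRI reference by its leading syntax (not a validator)."""
    if urlsplit(ref).scheme:
        # A scheme is present, so this is an IRI (an "absolute" IRI reference).
        return "IRI"
    if ref.startswith("//"):
        # Authority but no scheme.
        return "network-path reference"
    if ref.startswith("/"):
        # What the Naming Convention calls a "global" reference.
        return "absolute-path reference"
    # What the Naming Convention calls a "local" reference.
    return "relative-path reference"


examples = [
    "http://akresolver.cs.unibo.it/akn/uk/act/pga/2014/2/eng@2014-01-30",
    "//akresolver.cs.unibo.it/akn/uk/act/pga/2014/2/eng@2014-01-30",
    "/akn/uk/act/pga/2014/2/eng@2014-01-30",
    "eng@2014-01-30",
]
for example in examples:
    print(classify_reference(example), example)
```

Only the first category is an IRI in the sense of RFC 3987; the other three are relative IRI references that need a base before they identify anything.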


I appreciate the motivation for the introduction of the modifiers "global" and "local" to distinguish this partition of IRI references, as the reuse of "absolute" and "relative" to refer to paths may lead to confusion relative to absolute IRIs and relative IRI references. Further these terms are not used in RFC 3987, only in RFC 3986, so technically an absolute-path reference is a URI reference, not an IRI reference. 

The main problem with the terminology of "global IRI" and "local IRI" is that these are not IRIs, they are IRI references, syntactically speaking.
While the terms "IRI" and "IRI reference" are often (incorrectly) used interchangeably in informal usage, in an OASIS standard these terms should be used correctly, formally, and not interchangeably.

Suggestions:
1. replace all occurrences of "global IRI" or "global IRI ref" with "global IRI reference"
2. replace all occurrences of "local IRI" or "local IRI ref" with "local IRI reference"
I appreciate and concur with this request. Originally we decided, for readability purposes, that "we will call all IRI references as simply IRI", but since your reading proves that this is actually a source of confusion rather than a simplification, we should definitely get rid of the ambiguity. 

Let me point out that, except for explanatory and introductory textual material, there is no room in the ANNC for IRIs and URIs. We only use IRI references. 

As for the subtle distinction between URI references and IRI references, as you point out RFC 3987 is somewhat lacking in terminology, since it bases itself upon RFC 3986. We never intended to use URI references, only IRI references, since our natural audience does not necessarily use Latin scripts. We therefore strove to avoid using URI or URI references anywhere, and used IRI and IRI references instead. We stand by our choice of using for IRI references those terms and productions that have been used for URI references and that can be used for IRI references unchanged. 

There is an additional aspect to this whole IRI/IRI reference issue, that is somewhat obscured by the misuse of terminology described above. Using "IRI" to refer to what are actually IRI references hides the fact that there is still a missing piece of information that must be obtained from somewhere in order to resolve this "global IRI" to an actual IRI - the domain name (and port, if appropriate). The Akoma Ntoso XML schema guarantees that this additional information cannot be specified explicitly in the Akoma Ntoso document - there is no xml:base attribute, or any other component that serves this purpose, and all IRI references are required to be relative.
Correct. 

I am especially concerned about this statement
"Any party interested in absolute IRIs for Akoma Ntoso are required to produce their own resolution engine and use its protocol and authority for the purpose."
I suppose this may be a practical necessity, given that there can be no real guarantee that a particular domain name is always owned by the legal system. However, it does depart from the ideal of the Web - that (absolute) IRIs are the identifiers. I would like to see the motivations and ramifications (e.g. security concerns) of this decision discussed in detail in the Standard, as it is setting a very significant precedent. 
I believe this is where our opinions start to differ. We do not see how this can be a departure from the ideal of the web. Absolute IRIs still are the identifiers. What we are discussing here is not identifiers. It's references. 

What we are saying is that legal and legislative documents never refer to physical documents somewhere on the web, but always to abstract ideas of documents, whose physical representations (i.e., files) abound in time, location, content AND ownership. This means that resolution is far from an occasional occurrence that can be avoided by specifying more metadata, but rather it is the fundamental mechanism through which the legal referencing mechanism works. We introduced and relied from the ground up on FRBR exactly for this reason. FRBR is VERY IMPORTANT for the Akoma Ntoso Naming Convention. 

In particular, you (almost) NEVER specify Item-level absolute IRIs within Akoma Ntoso documents (there are exceptions, but they are in fact exceptions, there is a special markup for them and they do not affect the ANNC). You only specify Work-level, Expression-level, and, in a few and very specific contexts, Manifestation-level IRI references. 

Resolution of a legal reference, given our FRBR setting, is therefore composed of two separate processes: 
1) given an (incomplete) Work-level or Expression-level IRI reference, the identification of the (complete) Manifestation-level IRI reference that best matches it. I call this the *completion* phase of the resolution.  
2) the identification of the absolute IRI of (one of) the physical files that embody the identified Manifestation. I call this the *mapping* part of the resolution. 

The Akoma Ntoso Naming Convention DOES NOT DISCUSS the mapping. An AN reference is completed into a Manifestation-level IRI reference. That's it. We do NOT deal with identifiers of physical files. 

For example, suppose I decide to publish something at http://athant.com/akn/ke/debaterecord/2011-06-10/main that is similar, but not identical, to the example at http://docs.oasis-open.org/legaldocml/akn-core/v1.0/csprd01/part2-specs/examples/ke_Debate_Bungeni_2011-06-10.xml . Since this is my own domain, I am free to publish what I like there. But the path suggests that I am publishing a copy of the legal code with the identifier "/akn/ke/debaterecord/2011-06-10/main". This seems like a ripe opportunity for spoofing.
I believe this is the core of your objections, and it is where I disagree most strongly. 

Akoma Ntoso IS NOT INTERESTED in having direct resolution of IRI references to physical IRIs, so it does not (by design) guarantee correctness in the second process, which is a further resolution step which we cannot and do not want to control. This step is not part of the ANNC context, and this is the reason we write "Any party interested in absolute IRIs for Akoma Ntoso are required to produce their own resolution engine and use its protocol and authority for the purpose."

Coming to your example, since "/akn/ke/debaterecord/2011-06-10/main" is NOT a Manifestation-level IRI reference (in fact, it looks like a Work-level reference) an Akoma Ntoso resolver will be able to provide you with a working Manifestation-level IRI reference, say, "/akn/ke/debaterecord/2011-06-10/main/eng@/klr.xml". This is where the scope of the Akoma Ntoso Naming Convention ends. In this example, klr may refer to Kenya Law Report, an entity entrusted by the Kenyan government to be the authoritative source for a specific manifestation of the work. 

Physical documents are identified by exactly ONE Item-level absolute IRI, web-like to the fullest. They are referred to by an open set of IRI references, here again very much web-like. Akoma Ntoso only specifies what these IRI references look like. It does NOT claim they are IRIs (except as a shorthand, and I already agreed we should get rid of this habit), and it is not interested in the final step of the resolution (Manifestation IRI reference -> Item-level absolute IRI). 

So what should we say of protocol, domain and port of the IRI references? That they will always be the same as those of the base, whichever they might be. We are just saying that the reference you find in an Akoma Ntoso document (which, being an IRI reference, MAY omit the protocol) MUST omit the protocol and also MUST omit domain and port, and rely on the corresponding ones of the base. 

I don't see how these specifications are not web-like. 

The above scenario contradicts the statement "This, in practice, means that protocol and authority are, in resolution, not contributing information, and are thus interchangeable.", which makes the assumption that the resource identified by the IRI having the given absolute path is the same as the official document. This is in fact the primary role of the "authority" component of an IRI - to establish a certain level of trust.
"Official document" is not a term we use. I do not have any idea of what an official document is. Surely this is not what legal references point to. 

This is among the finest and most delicate concepts that Akoma Ntoso deals with. Legal references are NOT to physical documents, neither official nor authoritative nor public nor private nor spoofing ones. They are references to abstract entities, that exist in a conceptual domain. 

In fact, quite often resolving to "official documents", whatever that might be, is the wrong choice. Suppose for instance that you have an act X, dated 2010. This act has been modified by act Y and act Z in 2012 and 2014, respectively. What has been modified? The physical file on the server of the Parliament of the country? Most probably not (hopefully not). The abstract idea of act X has been modified. Often this is purely virtual, i.e. no-one has taken the task of authoritatively updating the text of act X to reflect the changes introduced by act Y and act Z. Sometimes private actors do it, sometimes no-one does it. There is NO "official document" of the latest version of a modified act. 

So if you have a reference of, say, 

/akn/country/act/X     (Work-level IRI reference),

what is the best destination? Probably not the "official document", which is possibly only the earliest version, i.e., an outdated version that is most likely not valid for your case... Possibly you must look at other sources of updated versions of the acts (we call them consolidated versions). They are often NON AUTHORITATIVE. Use them at your own risk. There might exist NO authoritative consolidated version of the document you are looking for. Very easily. 

And in our case there are three different versions (the original one, the one after the modifications of Y, and the one after the modifications of Y and Z), with three different contents, each of which matches your reference (there would be more if we were in a multi-language jurisdiction such as the European Union). The only thing we can do is to find, in some fashion, the best Expression for you (say, today's version in some language), obtaining 

/akn/country/act/X/language@     (Expression-level IRI reference), 

which again says very little, because we could have multiple data formats and MOST IMPORTANTLY we could have multiple sources, some of which are authoritative, others quite reliable, others irremediably bad. These are reflected in the next step of the completion, where surely the data format, and possibly the source, are identified, so you may obtain: 

/akn/country/act/X/language@/og.xml     (manifestation-level IRI reference) 

where og may stand for "official gazette" or it could be the identifier of another, less authoritative source. 
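
The completion steps walked through above can be sketched in Python as follows. This is only an illustration under loud assumptions: the CATALOG table, the complete function, and the "take the first candidate" policy are all hypothetical stand-ins, since how the "best" Expression and Manifestation are chosen is left open here. The mapping from a Manifestation-level reference to an Item-level absolute IRI is deliberately not shown.

```python
# Hypothetical catalog: Work -> Expressions -> Manifestations.
# The reference strings are the ones used in the example above.
CATALOG = {
    "/akn/country/act/X": {
        "/akn/country/act/X/language@": [
            "/akn/country/act/X/language@/og.xml",
        ],
    },
}


def complete(reference: str) -> str:
    """Complete a Work- or Expression-level reference to a Manifestation-level one."""
    for work, expressions in CATALOG.items():
        for expression, manifestations in expressions.items():
            if reference in manifestations:
                # Already Manifestation-level: nothing to complete.
                return reference
            if reference in (work, expression):
                # Placeholder policy (an assumption): take the first candidate.
                return manifestations[0]
    raise LookupError(f"no known Manifestation for {reference!r}")


print(complete("/akn/country/act/X"))
print(complete("/akn/country/act/X/language@"))
```

A real resolver would replace the placeholder policy with whatever criteria (date, language, source) it uses to pick the best match.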

This is where the scope of the Akoma Ntoso Naming Convention ends and where the control is left to the usual web mechanisms for resolution: you obtain a Manifestation-level IRI reference for your Work-level IRI reference. Since "it is a requirement of Akoma Ntoso that all existing FRBR items of a Manifestation be  byte-per-byte identical to each other", we are not further interested in finding the "right" item, because they are out of scope for the ANNC. 

What we need now is a resolver that maps your Manifestation-level IRI reference into an Item-level absolute IRI, and then we're good. AND IT IS THIS LAST STEP where spoofing may occur, but this step is well outside the scope of the Naming Convention. 

My feeling is that these things that are called "global IRI" and "local IRI" should not even be considered IRI references, since there is no set method for resolving them, so they don't refer to any particular IRI.
THERE IS a method for resolving them. It is a STANDARD method for resolving IRI references: you look at the base and use the items from the base to fill in for the missing details. 

In fact, resolvers exist TODAY (e.g., http://akresolver.cs.unibo.it/ ) that do exactly this using nothing weirder than a plain old browser.

These strings are in fact the identifiers in and of themselves.
They are not identifiers. They are references. The way you call them (references) and the way they call themselves (identifiers) are rather different and there is a resolution process in between. So they are NOT the same and cannot be. We do NOT allow document identifiers (i.e., IRIs) within Akoma Ntoso documents (with a few clearly marked exceptions), and only allow references (i.e., IRI references). 

They have a certain resemblance to IRIs/IRI references, which is a convenience for processing, but they do not satisfy the definition of an IRI syntactically nor do they match the concept of an IRI reference from a functional perspective, and so shouldn't be considered as either. Why not call them something else, e.g. "global aknID", "local aknID", to make this clear?

Suggestions (preferred):
1'. replace all occurrences of "global IRI" or "global IRI ref" or "global IRI reference" with "global aknID", or something similar.
2'. replace all occurrences of "local IRI" or "local IRI ref" or "local IRI reference" with "local aknID", or something similar.
Given what I said above, I object strongly to this suggestion and would rather keep the text as it is. I believe that our design and our systems are very much in line with the current web as it is now, and we call our references IRI references because that is what they are. 

A couple of minor clarifications in 4.2
3. clarify that only IRIs in the http scheme start with "http://"
"An absolute IRI starts with the string “http://”"
should be 
"An absolute IRI in the http scheme starts with the string “http://”"
I agree with this correction. 

4. There are in fact network-path references (RFC 3986) that start with "//" and include the authority but not the scheme.
" A relative reference, on the other hand, has no indication of the scheme, no indication of the domain name, and may ..."
should be
" A relative reference, on the other hand, has no indication of the scheme, and may"
I agree with this correction. 

5. Clarify that IRI references are independent of the resolution mechanism, but that the resolved IRIs are (highly) dependent on this mechanism
"This makes all IRI references independent of the actual resolution mechanism, and allows for very flexible storage, access, and reference mechanisms. On the other hand, the resolved IRI is dependent on the resolution mechanism, which may supply the missing information from an arbitrary base IRI."
Let me repeat myself again here. The resolved absolute IRIs are NOT part of the Akoma Ntoso Naming Convention, which is only interested in the determination of the most appropriate Manifestation-level IRI reference obtainable given a different-level IRI reference within the ANNC. 

The final resolution (if you prefer, the ACTUAL resolution) takes place using standard web mechanisms for IRI references, including relying on the base for the provision of protocol, domain and port of the resolved IRI. 

It is web-like, it works, it does NOT break any ideal, mechanism, or tool of the Web. 

6. Typos
"In fact, this is a simplification of RFC 3986, that calls global IRI refs as “absolute path references” and local IRI refs as “relative path references”. "
should be
"In fact, this is a simplification of RFC 3986, that calls global IRI references as “absolute-path references” and local IRI refs as “relative-path references”."
I agree with this correction. 

7. Don't use "ref" as a word.
I agree with this remark. 

8. Proper attribution of IRI terminology to RFC 3987, and avoid usage of "absolute IRI reference", which does not appear anywhere in RFC 3987. Further, the scheme is irrelevant.

"According to the authoritative source RFC 3986, all http:// IRI references are divided into absolute IRI and relative IRI references."
should be
"According to the authoritative source RFC 3987, all IRI references are divided into absolute IRIs and relative IRI references."
I agree with this correction. 

Thank you for your constructive remarks, and if in the things I wrote here you find a way to further improve the text of the naming convention, please let us know. 

Best regards

Fabio Vitali

--

Fabio Vitali                                          The sage and the fool
Dept. of Informatics                                     go to their graves
Univ. of Bologna  ITALY                               alike in this respect:
phone:  +39 051 2094872                  both believe the sage to be a fool.
e-mail: fabio@cs.unibo.it                  Where, then, may wisdom be found?
http://vitali.web.cs.unibo.it/   Qi, "Neither Yes nor No", The codeless code




