Fwd: Re: [legaldocml-comment] Various questions regarding representing a

To the LegalDocML technical committee,

I have briefly been in contact previously. I represent a Norwegian privately owned publishing agency (called Universitetsforlaget) who are developing a new online service (called juridika.no) for accessing their legal literature. In relation to this, we need to represent various acts and regulations of Norwegian law, and have chosen LegalDocML as the format with which to do so.

Apologies in advance for the avalanche of questions; they have accumulated over some time.

Incidentally, if there is some more informal way to discuss the use of LegalDocML, or to raise potential issues in the documentation, such as Slack or the like, that might reduce the threshold for contributing/asking questions, so they could come as a trickle instead of a tsunami.

1. Use case specific questions

1.1 Representing an incomplete act history

Q 1.1.1 What are the provisions for representing incomplete information regarding the history of an act?

Q 1.1.2 What is the best/correct way to represent the history of act fragments, without storing the textual content of each historical version?

To put the questions in context, I am working primarily with acts, and the use case I am trying to accommodate at the moment is as follows: keeping a history, per act fragment (mainly sections), of all (or a subset of) the times when it was textually amended. More specifically, being able to keep track of all changes to all sections, in terms of when (which expression), and ideally by which external amendment, they changed, as well as the nature of that change (textual modification, repeal, etc.). However, it must work with incomplete information. We do not (at least for the time being) have the resources to effectively recreate every single expression of an act from the original expression all the way up until the current expression, for all the acts in which we are interested. In other words, it should work even when providing metadata from only a strict subset of all the expressions (the latest N expressions of the act). Moreover, it should not require us to provide the actual textual content of the historical versions of a section, although the option to be able to add these later without a significant change in the XML structure would be beneficial.

I see two different ways to meet most of these requirements whilst complying with the Akoma Ntoso schema.

Option 1. Each section has its own unique temporalGroup, referenced by the section's @period attribute, which contains a timeInterval per textual version of the section (i.e. one for the original wording, and an additional one per amendment that affected that section). Relies on the references, lifecycle and temporalData metadata elements, and the @period of hcontainers.

Pros: simple lookup

Cons: seems like a misuse of the @period attribute on hcontainers, which appears to have been intended for contains="multipleVersions" documents; no canonical way to specify some key metadata, such as the type of amendment, whether the fragment was repealed, etc.
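
As a rough illustration of Option 1, the Python sketch below parses a heavily trimmed fragment (not schema-valid on its own) and resolves a section's @period into a list of dated intervals. All eId values, dates and the #publisher, #inforce, #ro1 and #rp1 references are invented for the example; only the element and attribute names are taken from the schema. The second timeInterval carries only @start, which is the open-ended case raised in 2.3.1.

```python
import xml.etree.ElementTree as ET

AKN = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"
NS = {"akn": AKN}

# Hypothetical, heavily trimmed fragment (not schema-valid on its own).
# Every eId, date and #-reference below is invented for illustration.
OPTION_1 = f"""
<akomaNtoso xmlns="{AKN}">
  <act>
    <meta>
      <lifecycle source="#publisher">
        <eventRef eId="e1" date="2001-01-01" type="generation" source="#ro1"/>
        <eventRef eId="e2" date="2010-07-01" type="amendment" source="#rp1"/>
      </lifecycle>
      <temporalData source="#publisher">
        <temporalGroup eId="tg_sec_2">
          <timeInterval refersTo="#inforce" start="#e1" end="#e2"/>
          <timeInterval refersTo="#inforce" start="#e2"/>
        </temporalGroup>
      </temporalData>
    </meta>
    <body>
      <section eId="sec_2" period="#tg_sec_2"/>
    </body>
  </act>
</akomaNtoso>
"""


def resolve(events, ref):
    """Map a '#eId' reference to the event date, or None if the ref is absent."""
    return events.get(ref[1:]) if ref else None


def section_history(root, section_eid):
    """Return (start_date, end_date) pairs for one section's temporalGroup."""
    events = {e.get("eId"): e.get("date")
              for e in root.iterfind(".//akn:eventRef", NS)}
    section = root.find(f".//akn:section[@eId='{section_eid}']", NS)
    group_id = section.get("period")[1:]  # strip the leading '#'
    group = root.find(f".//akn:temporalGroup[@eId='{group_id}']", NS)
    return [(resolve(events, t.get("start")), resolve(events, t.get("end")))
            for t in group.iterfind("akn:timeInterval", NS)]


history = section_history(ET.fromstring(OPTION_1), "sec_2")
print(history)
```

The lookup is a single hop from the section to its temporalGroup, which is the "simple lookup" advantage; nothing in the result says what kind of change each interval boundary represents.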

Option 2. Each expression has its own unique temporalGroup (i.e. one for the original expression, and an additional one per amendment), with a single timeInterval (from that expression to the next), and each such temporalGroup has zero or more textualMod elements referencing it through the @period attribute of their force child. Each textualMod has a single source element referencing an amendment (a #-reference to the eId of an eventRef referencing a passiveRef referencing the AKN expression-level identifier of the amendment, but likely with no reference to a fragment within the amendment), but may have multiple destination children referencing each act fragment modified by the amendment. If the same amendment performs multiple changes of different natures (e.g. both textual substitution and repeal in the same amendment), one textualMod is created per relevant @type value for the amendment. This metadata is not referenced directly by the sections in the body, but instead references the sections. Relies on the references, lifecycle, analysis and temporalData metadata elements. Could conceivably even forgo the temporalData, as all the data about which expressions resulted in which fragments changing is available even without the textualMod's force child.

Pros: Seems more in line with the intended way to express such metadata; allows for more detailed metadata to be specified.

Cons: More complex (= more moving parts = higher chance of inconsistencies/bugs)
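
A comparable sketch of Option 2, again with a trimmed fragment that is not schema-valid on its own and with invented eId values, dates and #-references. It inverts the textualMod metadata into a per-section list of (date, modification type) pairs, following the source/destination/force arrangement described above.

```python
import xml.etree.ElementTree as ET

AKN = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"
NS = {"akn": AKN}

# Hypothetical, heavily trimmed fragment (not schema-valid on its own).
# Every eId, date and #-reference below is invented for illustration.
OPTION_2 = f"""
<akomaNtoso xmlns="{AKN}">
  <act>
    <meta>
      <lifecycle source="#publisher">
        <eventRef eId="e1" date="2001-01-01" type="generation" source="#ro1"/>
        <eventRef eId="e2" date="2010-07-01" type="amendment" source="#rp1"/>
      </lifecycle>
      <analysis source="#publisher">
        <passiveModifications>
          <textualMod eId="pmod_1" type="substitution">
            <source href="#e2"/>
            <destination href="#sec_2"/>
            <destination href="#sec_5"/>
            <force period="#tg_e2"/>
          </textualMod>
          <textualMod eId="pmod_2" type="repeal">
            <source href="#e2"/>
            <destination href="#sec_7"/>
            <force period="#tg_e2"/>
          </textualMod>
        </passiveModifications>
      </analysis>
      <temporalData source="#publisher">
        <temporalGroup eId="tg_e1">
          <timeInterval refersTo="#inforce" start="#e1" end="#e2"/>
        </temporalGroup>
        <temporalGroup eId="tg_e2">
          <timeInterval refersTo="#inforce" start="#e2"/>
        </temporalGroup>
      </temporalData>
    </meta>
  </act>
</akomaNtoso>
"""


def changes_by_section(root):
    """Invert the textualMod metadata into {section eId: [(date, type), ...]}."""
    events = {e.get("eId"): e.get("date")
              for e in root.iterfind(".//akn:eventRef", NS)}
    changes = {}
    for mod in root.iterfind(".//akn:textualMod", NS):
        # The source points at the eventRef of the amending document.
        event_id = mod.find("akn:source", NS).get("href")[1:]
        for dest in mod.iterfind("akn:destination", NS):
            section_id = dest.get("href")[1:]
            changes.setdefault(section_id, []).append(
                (events[event_id], mod.get("type")))
    return changes


changes = changes_by_section(ET.fromstring(OPTION_2))
print(changes)
```

Note that the function never reads the force child or the temporalGroup elements, which matches the observation that the temporalData could conceivably be dropped in this option.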

Which approach is more correct/in the spirit of Akoma Ntoso? I am guessing the second option would be the most standard-compliant approach, as that is more explicit about the changes to the fragments, whereas the first option does not directly specify the nature (in-force) of the temporalGroup, and seems to misuse the section's @period attribute which seems to only have been intended for acts marked as @contains="multipleVersions" (whereas we use "singleVersion").

1.2 Representing act expressions not in force

Q 1.2.1 Are there any provisions for representing expressions of an act before the expression where the act first entered into force? For example, the Norwegian penal code entered into force a decade after it was promulgated, and a number of amendments affected its draft in the meantime. How could the different expressions of the draft be represented? Would this simply be to use a <bill> document instead of an <act> document, or is there some way to represent the draft expressions without varying the document type before and after the enactment?

Q 1.2.2 Are there any provisions for representing an act which has only partially entered into force, including in the same document both the fragments which have entered into force and those which haven't?

2. Specific schema questions

2.1 eventRef

Q 2.1.1 How to reference the document that generated the event?

eventRef has both attributes @source and @refersTo, as well as @href. All examples I can find seem to use the @source attribute to indicate which document the event corresponds to.

e.g. from Acts/Act_Kenya_1997-08-22_3@2003-12-19.xml in namespace http://www.akomantoso.org/2.0

</lifecycle>

</references>

Same with http://docs.oasis-open.org/legaldocml/akn-core/v1.0/os/part2-specs/examples/us_Act_2011-11-29.xml in namespace http://docs.oasis-open.org/legaldocml/ns/akn/3.0

However, this seems to go against the definition of the source attribute group used by eventRef: "The attribute source links to the agent (person, organization) providing the specific annotation or markup". Doesn't this indicate that this use of @source is incorrect? Given the documentation of @source, I would assume that @refersTo would be a more natural attribute to use for indicating the document generating an event.

Assuming that @refersTo, as opposed to @source, is the correct attribute to use to reference the document the event corresponds to:

what is the point of the @source attribute on an eventRef? The lifecycle element (the parent of eventRef) already has a @source attribute, and there can be several lifecycle elements, so I don't see the point in specifying the source of each eventRef.

Given the description "For each event, [...] a document that generated the event must be referenced.", why isn't @refersTo required, whereas @source is?

Assuming instead that @source is indeed the attribute to use, should not eventRef at least use a different attribute group than the current 'source' attribute group, so the documentation on this attribute is not misleading and wrong?

Moreover, where does the @href attribute come into all this?

2.2 wId

2.2.1 Naming convention

Q 2.2.1.1 Is there a naming convention for how to generate wId values for two different fragments with identical type, hierarchical placement and number, but unrelated content? E.g. if expression A had a section with eId sec_2, and expression B had a different section which also had eId sec_2?

The closest I found was an example eId="art_2v1" from the NC spec, indicating that maybe a v1, v2, v3... versioning scheme should be used. However, v is a roman numeral, which could make this scheme misleading in cases where roman numerals are actually involved in the article numbering. (In my local legislation, we have examples of roman numerals mixed with other counting methods, such as "Chapter III A"). I'm thinking maybe just art_2_1 would be an acceptable alternative?

Q 2.2.1.2 By the way, in this example, the article with eId="art_2" surely needs an explicit wId as well, as that article necessarily cannot have the same (implicit) wId as eId, because wId="art_2" is already in use by a different article.

2.2.2 Referring to wId instead of eId

Q 2.2.2.1 Sometimes, there is a need to refer to the same fragment over several expressions. This is what the wId is for. However, from my understanding, a #-reference in an href attribute is always interpreted to refer to an eId value, not a wId value. If this is the case, how can one refer to the wId value of a fragment in an expression? If this is not the case, how can one disambiguate references to eIds from those to wIds? (Especially important in the case where the element with eId X and the element with wId X are not the same element.)

See the following relevant quote: "eId: this is the string that identifies within the document the entity being described. All internal references will thus use this eId." (emphasis mine)

This question is of extra interest to us, as we attach additional information to act fragments outside of the AKN file, but it needs to be attached to the semantically same fragment across expressions, not just the fragment with coincidentally the same number and eId.
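
To make the ambiguity concrete, here is a small Python sketch with two made-up expressions of the same act. The namespace is omitted for brevity, the section texts are invented, and the wId value sec_2_1 simply follows the tentative art_2_1 idea from 2.2.1; none of this is a naming scheme taken from the specification.

```python
import xml.etree.ElementTree as ET

# Two hypothetical expressions; namespace omitted and content invented.
# In B, the old sec_2 was renumbered to sec_3 and an unrelated section
# took over the number 2 (and therefore the eId sec_2).
EXPRESSION_A = (
    '<body>'
    '<section eId="sec_2" wId="sec_2"><p>Old rule</p></section>'
    '</body>'
)
EXPRESSION_B = (
    '<body>'
    '<section eId="sec_2" wId="sec_2_1"><p>Unrelated new rule</p></section>'
    '<section eId="sec_3" wId="sec_2"><p>Old rule, renumbered</p></section>'
    '</body>'
)


def index_by(xml_text, attribute):
    """Index every element carrying `attribute` by that attribute's value."""
    root = ET.fromstring(xml_text)
    return {el.get(attribute): el for el in root.iter() if el.get(attribute)}


def text_of(element):
    """Return the text of the section's first paragraph."""
    return element.find("p").text


# The same string "sec_2" resolves differently depending on the id space.
by_eid_a = text_of(index_by(EXPRESSION_A, "eId")["sec_2"])
by_eid_b = text_of(index_by(EXPRESSION_B, "eId")["sec_2"])
by_wid_b = text_of(index_by(EXPRESSION_B, "wId")["sec_2"])
print(by_eid_a, "|", by_eid_b, "|", by_wid_b)
```

In expression B the reference "#sec_2" names two different elements depending on whether it is read as an eId or as a wId, which is the case where externally attached information would end up on the wrong fragment.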

2.2.3 When is a fragment no longer the same?

In terms of textual modifications of a fragment, has any thought been given to when a fragment is still considered the same, and when it is not?

Presumably, a fragment which has merely been renumbered is still considered the same, whereas a fragment which has been repealed and replaced by another with the same number and location (and thus, same eId) is not considered the same. For fragments split into several others, or formed from joining several others, it might be more ambiguous whether one of the several fragments is still considered the same as the single one. But how about for pure textual amendments? How much does the text need to change for the fragment to be considered no longer the same, and should be given a new wId?

Q 2.2.3.1 Are there any guidelines for this, or is this wholly up to the judgement of the manifestation author?

2.3 timeInterval

2.3.1 Exactly two of @start, @end and @duration / incomplete time intervals

The documentation here seems internally inconsistent:

timeInterval spec documentation: "The element timeInterval [...] is built either with two dates or with a date and a duration (exactly two of the 'start', 'end' and 'duration' attributes can be specified)."

From vocabulary 5.7 Modifications and versioning: "A @start attribute with no @end attribute marks a fragment that has appeared in an amendment and still exists in the latest recorded version of the document."

duration spec documentation: "The duration attribute is used to specify a duration when either the start or the end event of a time interval is not known."

But there is (to my knowledge) no way to specify an unknown/infinite/ongoing duration.

Q 2.3.1.1 The documentation should be updated to be consistent. What is actually allowed here?

I would like to make the case that it should not be required to provide exactly two of the three attributes @start, @end and @duration.

Specifying two attributes works well for time intervals where the knowledge of the interval is complete: when the start and end date are both known.

However, what about time intervals where the knowledge is incomplete? For example, a time interval which lasts until further notice (such as the most recent amendment of an act, such that the current expression lasts until whenever the next amendment comes into force), i.e. if the end date is as yet undetermined. Or, a time interval in a very old document, where the changes over time are hard to track, where an end date for how a section was worded might be known, but it is unclear when that wording initially came into being, i.e. the start date is unknown.

Time intervals with incomplete information ought to be valid according to the AKN specification. They do come up, certainly where the end date is unknown, whenever a time interval is declared for the final expression/amendment of an act.
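
The difference between the two readings can be sketched in a few lines of Python. Intervals are modelled as plain dictionaries of attribute values; valid_strict encodes the "exactly two" wording quoted above, while valid_relaxed is only one possible reading of the proposal (one or two attributes), not something the schema or specification defines. The event references are invented.

```python
BOUND_ATTRIBUTES = ("start", "end", "duration")


def specified(interval):
    """List which of @start, @end and @duration are present on an interval."""
    return [name for name in BOUND_ATTRIBUTES if interval.get(name)]


def valid_strict(interval):
    """The 'exactly two of start, end and duration' reading of the spec text."""
    return len(specified(interval)) == 2


def valid_relaxed(interval):
    """Hypothetical relaxed rule: one or two attributes, allowing open ends."""
    return 1 <= len(specified(interval)) <= 2


# Invented examples: a fully known interval, the current expression with
# no known end, and an old wording whose start is unknown.
closed = {"start": "#e1", "end": "#e2"}
open_ended = {"start": "#e2"}
unknown_start = {"end": "#e1"}

for interval in (closed, open_ended, unknown_start):
    print(interval, valid_strict(interval), valid_relaxed(interval))
```

Under the strict reading, the interval for the current expression of any act is invalid, even though the vocabulary text quoted from 5.7 describes exactly that case.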

2.3.2 @refersTo / temporal concepts

From timeInterval spec documentation: "The refers attribute is a reference to a temporal concept belonging to the Akoma Ntoso ontology and specified in the references section"

The temporal concepts I have found in examples have been

/ontology/concept/inforce

/ontology/concept/efficacy

Q 2.3.2.1 However, I have failed to find an exhaustive (or even an incomplete) list of concepts (temporal or otherwise) belonging to the Akoma Ntoso ontology. Does any such list exist? Could one be procured?

2.3.3 @start, @end / inclusive/exclusive

Q 2.3.3.1 Is there any way to know whether the dates of the events referred to by the @start and @end attributes are inclusive or exclusive with respect to the time interval? Has any thought been given to this? Or is it just assumed that the @start is always inclusive and the @end is always exclusive?
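
Why this matters in practice: when two consecutive intervals share the same event as the end of one and the start of the next, the convention decides which wording applies on the boundary date. The Python sketch below compares a half-open reading (start inclusive, end exclusive) with a closed reading; both conventions and the dates are assumptions made for illustration, not statements about what the specification prescribes.

```python
from datetime import date


def in_half_open(day, start, end):
    """Start inclusive, end exclusive; end=None means still ongoing."""
    return start <= day and (end is None or day < end)


def in_closed(day, start, end):
    """Start inclusive, end inclusive; end=None means still ongoing."""
    return start <= day and (end is None or day <= end)


# Invented dates: the amendment enters into force on the boundary date.
boundary = date(2010, 7, 1)
old_wording = (date(2001, 1, 1), boundary)
new_wording = (boundary, None)

half_open_hits = [in_half_open(boundary, *w) for w in (old_wording, new_wording)]
closed_hits = [in_closed(boundary, *w) for w in (old_wording, new_wording)]
print(half_open_hits, closed_hits)
```

With the half-open reading exactly one wording applies on the boundary date; with the closed reading both the old and the new wording claim it.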

2.4 modificationType

2.4.1 Periods

Q 2.4.1.1 What is the intent of the application and duration elements, which is not covered by the force and efficacy elements?

Q 2.4.1.2 Does it make sense to include the force, efficacy, application and duration periods for all modification types?

Q 2.4.1.3 What is the difference between the @period attribute on the modificationType itself (via the enactment attribute group) versus the @period attribute on its children: force, efficacy, application, duration? When are you meant to use which @period attribute?

Q 2.4.1.4 Does the @period refer to the period up until the modification, or the period after the modification? And in either case, what event constitutes the other point in time making up the time interval? Or is the @period somehow meant to be used only for the exact point in time of the modification?

2.4.2 TextualMods

See TextualMods spec documentation.

Q 2.4.2.1 What is the difference between substitution and replacement? When is one used over the other? Is there some ambiguity here?

2.4.3 Enumerations

I miss a description of each enumeration of a metadata attribute (e.g. the different @type ModTypes of passiveModifications). Bear in mind that the audience of this specification is varied, and not everyone will be intimate with all the intricacies of legal jargon. Personally, I am a software developer with no background in law, and the lawyers I can consult with are not necessarily familiar with the English legal jargon, as English is neither their first language nor primary professional language.

This was done very well and thoroughly with statusType, and similar explanation for all the other enumerations would be very helpful in correctly interpreting this specification.

Q 2.4.3.1 Could detailed explanations be added to all metadata enumerations?

2.5 @startEfficacy, @endEfficacy

Attributes @startEfficacy and @endEfficacy are referenced by the vocabulary, but these attributes are not present in the specification, in hcontainers such as <section>. Judging by the specification, they appear outdated, replaced by timeInterval with a refers attribute for the relevant temporal concept. The vocabulary should be updated to reflect this.

2.6 @refersTo vs @href

Q 2.6.1 A single @refersTo attribute can contain several references, space separated, as I understand it. However, this seems not to be the case with @href, which can only contain a single reference/URI, despite both attributes being used to refer to other entities within the same XML document. Is this divergence a conscious design decision, and if so, what was the rationale?

3. General questions

3.1 How to contribute to the spec?

Q 3.1.1 In reading the vocabulary and specification of LegalDocML / Akoma Ntoso, I have come across a number of typos, outdated content and minor inconsistencies, which I would address immediately if I could. What is the best way to contribute to improve the quality of the specification and vocabulary? Is it possible to contribute amendments (e.g. fixing typos) and/or raise issues directly to the spec source? It seems the GitHub repository is out of date, not including the OS version of the specification. Also, the part1 vocabulary is out of date in a number of areas.

Best regards,

Anders Thorbeck

Knowit

software developer on juridika.no, by Universitetsforlaget
