

Subject: Re: [legaldocml] Retrofitting, with a foray into levels and labels


Dear Tom, 

I thank Flavio for his (to me) convincing recapitulation of the history and philosophy of Akoma Ntoso regarding flexibility in support of local features of the language. 

I also do not want to keep on bragging about who has the wackiest legal system in the world, so I'll stop saying "in Italy we have seen worse things than you could possibly dream of in your worst nightmare", even though I am strongly convinced that this is true. I'll also stop repeating that Akoma Ntoso CAN support what you fear it cannot, even though it actually can.

> I should probably be a little more explicit about the perspective that leads me to raise some of the questions that I've been asking. I'm anxious for AkomaNtoso to be a well-developed and widely-adopted standard, and I believe that in order for that to happen some bulletproofing is needed.  Some of that need is technical, and some of it is essentially political.  I believe that AkomaNtoso was conceived originally as a standard to be implemented in jurisdictions with little or no prior commitment to any XML or SGML standard, and largely in jurisdictions where it would be inserted into the process at inception -- that is, with legislative and regulatory drafters in a setting where it would then form a lifecycle system for legislative documents from cradle to grave.

This is in part true. But I would distinguish two situations. On the one hand, we were aware that there would be situations where significant experience in legislative XML was already present before us. On the other hand, we were also perfectly aware that each country has its own significant past experiences and traditions in legislative processes, and that their documents could be similar on the surface yet importantly different in depth.

As regards compatibility with other XML vocabularies for legislative documents, the short answer is CEN Metalex. CEN Metalex was created precisely to allow legislative offices in different countries that had already adopted some XML technologies to start talking to each other. To me, alignment of Akoma Ntoso with CEN Metalex is a strong requirement, as it is the way to guarantee that we have some form of interoperability with competing standards. But Akoma Ntoso is not a meta-markup language for legislative documents; it is an actual markup language, which is necessarily different from one's own national XML standard.

As regards different legislative processes and legislative documents, on the other hand, Akoma Ntoso was explicitly designed to allow the widest variety of situations to be described. This is what went under the umbrella topic of descriptiveness vs. prescriptiveness in the early discussions about the design of Akoma Ntoso. AN tries to be as descriptive as possible, and imposes as few requirements as possible on the content and structure of actual documents. AN has a long list of requirements on the metadata and the organization of the XML markup, but almost none whatsoever on the organization of the content, about which it tries to be as agnostic as possible.
 
> Are there any differences between what you'd do in a cradle-to-grave system and what you'd do in a post-hoc publishing system?  There's reason to be suspicious, just as there always is when people start asserting exceptionalism in a standards-building process.  Our job here is, in some sense, to discover and sanctify similarities, and that's a difficult process that's hard for people used to doing things their own way to wrap their heads around.  But there are exceptions that are legitimate.  And unfortunately, post-hoc publishers have to accept them, along with a host of other suboptimal ways of doing things.
> 
> All that to say that I think we need to pay careful attention to problems of modeling and encoding things as we find them in the wild, where they are often a mess.  We look for commonalities and simplicity, but we often have to introduce some complexity in order to cover semi-exceptional cases.  And potential adopters are always very quick to say "that doesn't look like *my* data".
> 
> A second question of design philosophy has to do with the ultimate product.  I got to thinking: is our goal here *only* the simple encoding of what the legislature said, or are we also trying to make the XML document a point of departure for a wider range of use cases?

Absolutely. The idea of the XML being a point of departure for new uses is the reason why I insist on providing a rich set of descriptive elements and attributes that allow MUCH MORE than on-screen or on-paper display of the marked-up documents. Most of the elements we will be discussing have little or no effect on presentation, yet they are important for enabling new uses of the documents.
 
>  That becomes a confusing question because (whether we are aware of it or not) the use cases that we have at the back of our minds when we're working on these things are mostly about search, and the question then becomes whether we're encoding textual features we can use to make facets for searching.  There are other questions we might ask, having to do with applications in which a snippet of legislative text is embedded in something else (say, a web page that says "here is the most current version of Section X").   Or post-hoc validation of documents that have been transformed into AkomaNtoso and not born Akomish.  And so on.

I am not sure what you are saying here, and how a document transformed into Akoma Ntoso could be different from a natural-born one...

> Which brings me to my Qu(estion|ibble) of the Week:  the business of using arbitrary, locally determined strings as names for elements that encode structure.  It seems to me that this creates a situation in which documents can't be validated, but I may be confused, so:
> 
> a) If a particular AN user/publisher decides to enforce a hierarchy of elements in which , eg., "subtitle" is only legitimate within "title", or "section" is only legitimate within "part", how is that done?

The short answer is that yes, it can be done, and it can be done by way of a restriction (i.e. a subschema providing more constraints than the main one). More info at http://www.akomantoso.org/docs/localisation-of-akoma-ntoso/ . A long answer is at the end of this message, at note [1], as I fear I have become way too technical in my explanation. 

To explore a little the conceptual space within which your question is positioned, though: Akoma Ntoso does not dictate or enforce any hierarchical structure. In fact, it provides only a list of common names for hierarchical elements that you can use at your leisure. BTW, a word of caution: enforcing some structures is appropriate in prescriptive schemas, and not in descriptive ones, and it makes sense only if you can impose your restrictions on your authors, i.e., if you need a restricted schema to verify that they are actually working according to your demands. This is rarely the situation for private publishers (who rarely have the authority to go to the local Parliament and demand that legislation be written so as to follow their guidelines), and more often the situation for legal drafting offices, which may want to verify that legislation is actually drafted according to their self-imposed guidelines.

> b) The million-dollar question for the US Code:  if the hierarchy of elements is different within one partition of a corpus from what it is within others, can AN accomodate that and still support validation for the corpus as a whole?

Yes. In fact, this is the default approach of Akoma Ntoso: allowing for exceptions can be done without effort, while checking for conformance to specific, local regularities in the structure requires a subschema, as shown before. 

>  Here's why I ask.  In most Titles of the US Code, the Part element would be contained within the Chapter element.  That isn't true in Title 38; it's the other way around.   Could AN accomodate this, and still validate the Code as a whole? (that's not the only such variation, by the way, and the CFR introduces a whole other can of worms, including "anonymous" levels and many, many exceptions to rules about what can be legitimate children of what).

Yes. The only caveat here concerns the "anonymous" levels, which I do not believe are really anonymous: the cognoscenti surely call them something. The risk is to give such a level a name that differs from its well-known one, or to give it no name at all when a well-known name exists.
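
To make the idea of "one permissive base schema, plus local rules per partition" a little more concrete, here is a minimal sketch in Python. It is not part of Akoma Ntoso: the rule tables, the function name and the un-namespaced sample fragments are all assumptions made for illustration, and the key "X" merely stands for any title following the usual Chapter > Part nesting. Only the Chapter/Part inversion in Title 38 comes from the discussion above.

```python
import xml.etree.ElementTree as ET

# Per-title containment rules: container -> children it may hold.
# Hypothetical tables; only the chapter/part relationship is taken
# from the thread.
DEFAULT_RULES = {"chapter": {"part"}}          # usual case: part inside chapter
TITLE_RULES = {"38": {"part": {"chapter"}}}    # Title 38: chapter inside part
CONSTRAINED = {"chapter", "part"}              # other elements are not checked


def violations(title_number, xml_text):
    """Return (parent, child) pairs that break the rules of this title."""
    rules = TITLE_RULES.get(title_number, DEFAULT_RULES)
    found = []
    for parent in ET.fromstring(xml_text).iter():
        for child in parent:
            if parent.tag in CONSTRAINED and child.tag in CONSTRAINED:
                if child.tag not in rules.get(parent.tag, set()):
                    found.append((parent.tag, child.tag))
    return found


# Hypothetical fragments, far simpler than real markup.
USUAL = "<title><chapter><part/></chapter></title>"
TITLE_38 = "<title><part><chapter/></part></title>"

print(violations("X", USUAL))      # no violations under the default rules
print(violations("38", TITLE_38))  # no violations under the Title 38 rules
print(violations("X", TITLE_38))   # the inverted nesting is reported
```

Both fragments would be acceptable to a schema that does not constrain nesting; the local tables only matter if you decide to check a partition against its own regularities.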


> If not, what would the objection be to a system like this: <level1 label="title"> and <level2 label="subtitle"> or just to be really wild and crazy <level n=1 label="title"> <level n=2 label="subtitle">?  Less intuitive, sure, but a lot more flexible in what it maps to what, and how it can structure things and still validate.

Generic elements are possible in Akoma Ntoso, but they are not elegant. <level1 label="title"> is less elegant than <title> and, as you mention, less intuitive, but I don't see exactly how it would end up being more flexible. For instance, I would still be able to create a Schematron (although a more complex one) for this example, but there would definitely be no subschema possible to validate it.
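
As a rough illustration of why the check gets more complex with generic elements, here is a sketch in Python of the kind of rule one would have to write. The <level> element and its n and label attributes follow the proposal quoted above (with the attribute values quoted, as XML requires), not any actual Akoma Ntoso element; the function name and the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# label -> label required on the enclosing <level> (the rules of question a)
REQUIRED_PARENT_LABEL = {"subtitle": "title", "section": "part"}


def check_levels(xml_text):
    """Check level numbering and label nesting; return a list of messages."""
    errors = []

    def walk(elem, parent_level):
        if elem.tag == "level":
            n, label = int(elem.get("n")), elem.get("label")
            # A level must be exactly one deeper than its enclosing level.
            expected = 1 if parent_level is None else int(parent_level.get("n")) + 1
            if n != expected:
                errors.append("%s: n=%d, expected %d" % (label, n, expected))
            # The meaning lives in an attribute, so it must be compared by hand.
            required = REQUIRED_PARENT_LABEL.get(label)
            parent_label = None if parent_level is None else parent_level.get("label")
            if required is not None and parent_label != required:
                errors.append("%s must be within %s" % (label, required))
            parent_level = elem
        for child in elem:
            walk(child, parent_level)

    walk(ET.fromstring(xml_text), None)
    return errors


SAMPLE = """
<body>
  <level n="1" label="title"><level n="2" label="subtitle"/></level>
  <level n="1" label="part"><level n="3" label="section"/></level>
  <level n="1" label="subtitle"/>
</body>
"""

print(check_levels(SAMPLE))
```

With named elements the second rule is a one-line test on the parent element; here both the depth and the label have to be compared explicitly.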

> I think I have more questions about how IDs work, but a first one would be ... are they essentially just opaque strings so far as AN is concerned?  Some look like they embed structural semantics, at least for convenience/brain-compatibility.

Errr, not exactly. The syntax of IDs is expected to follow some (currently unenforced) guidelines that, as you say, embed structural semantics. My idea is that we will start to provide enforceable rules, rather than guidelines, as soon as we are psychologically ready. It is technically easy, but at the moment it would have quite a heavy impact on the schema.
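
Just to show what "enforceable" could mean in practice, here is a small sketch in Python. The convention it checks is invented purely for illustration (an id is the id of the nearest identified ancestor, a hyphen, then a short name and a number); the actual Akoma Ntoso guidelines are not reproduced here and may well differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical convention, for illustration only: local part = letters + digits,
# prefixed by the id of the nearest ancestor that has one (e.g. "tit1-sub2").
LOCAL_PART = re.compile(r"[a-z]+[0-9]+")


def check_ids(xml_text):
    """Return the ids that do not follow the hypothetical convention."""
    errors = []

    def walk(elem, ancestor_id):
        elem_id = elem.get("id")
        if elem_id is not None:
            prefix = "" if ancestor_id is None else ancestor_id + "-"
            local = elem_id[len(prefix):] if elem_id.startswith(prefix) else None
            if local is None or not LOCAL_PART.fullmatch(local):
                errors.append(elem_id)
            ancestor_id = elem_id
        for child in elem:
            walk(child, ancestor_id)

    walk(ET.fromstring(xml_text), None)
    return errors


SAMPLE = '<body><title id="tit1"><subtitle id="tit1-sub1"/><subtitle id="sub2"/></title></body>'

print(check_ids(SAMPLE))
```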

Ciao

Fabio

--

[1] This is the more technical answer to question: 

> a) If a particular AN user/publisher decides to enforce a hierarchy of elements in which , eg., "subtitle" is only legitimate within "title", or "section" is only legitimate within "part", how is that done?

Enforcing these rules means reducing the set of valid documents to a subset of those valid for the whole schema. This is possible, in XML Schema, by way of a derived schema, i.e., a new schema that refers to the main one and introduces new schema structures. Since in this case the derived schema reduces the set of allowable content models for some elements, it is called a restriction. There are basically two ways to write such a derivation: Redefine and Schematron. In both cases the result is a minimal XML schema that includes the main Akoma Ntoso schema and provides additional constraints.

Schematron is a different language from XML Schema, and can be embedded within an XML schema as annotations. It is exactly appropriate for the kind of constraint you want to specify, i.e., a very limited rule that needs to be checked at specific points. The complete XML Schema that validates what you expressed is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns="http://www.akomantoso.org/2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://www.akomantoso.org/2.0" elementFormDefault="qualified">

  <xsd:include schemaLocation="./akomantoso20.xsd" />

  <xsd:annotation xmlns:sch="http://purl.oclc.org/dsdl/schematron">
    <xsd:appinfo>
      <sch:ns uri="http://www.akomantoso.org/2.0" prefix="an"/>
      <sch:pattern id="section">
        <sch:rule context="an:section">
          <sch:assert test="parent::an:part">Section can appear only within a part element</sch:assert>
        </sch:rule>
      </sch:pattern>
      <sch:pattern id="subtitle">
        <sch:rule context="an:subtitle">
          <sch:assert test="parent::an:title">Subtitles can appear only within a title element</sch:assert>
        </sch:rule>
      </sch:pattern>
    </xsd:appinfo>
  </xsd:annotation>
</xsd:schema>

This includes the main schema ( <xsd:include schemaLocation="./akomantoso20.xsd"/> ) and then introduces two rules: one that checks that sections appear only within part elements ( <sch:assert test="parent::an:part"> ) and one that checks that subtitles appear only within title elements ( <sch:assert test="parent::an:title"> ).
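
For readers who prefer to see the logic spelled out procedurally, the following Python sketch checks the same two parent rules with the standard library alone. It does not run Schematron; it only mimics the two assertions, and the function names and the sample fragment (which is not a complete Akoma Ntoso document) are assumptions of mine.

```python
import xml.etree.ElementTree as ET

AN_NS = "http://www.akomantoso.org/2.0"  # namespace used in the schema above

# child element -> the only parent it may appear in (the two rules above)
REQUIRED_PARENT = {"section": "part", "subtitle": "title"}


def qname(local):
    """Build the '{namespace}local' form that ElementTree uses for tags."""
    return "{%s}%s" % (AN_NS, local)


def check_parents(xml_text):
    """Return one message per element that violates a parent rule."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    errors = []
    for child_name, parent_name in REQUIRED_PARENT.items():
        for elem in root.iter(qname(child_name)):
            parent = parent_of.get(elem)
            if parent is None or parent.tag != qname(parent_name):
                errors.append("%s can appear only within a %s element"
                              % (child_name, parent_name))
    return errors


# Hypothetical fragment: the second <section> sits inside a <chapter>.
SAMPLE = """
<body xmlns="http://www.akomantoso.org/2.0">
  <title><subtitle><part><section/></part></subtitle></title>
  <chapter><section/></chapter>
</body>
"""

print(check_parents(SAMPLE))
```

Only the section inside the chapter is reported; the nested title > subtitle > part > section chain satisfies both rules.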

The Redefine approach uses only the XML Schema language, and allows you to redefine the content models of the elements that the main schema defines. Your constraints are not easily expressible in XML Schema, as they are not complete requirements (what other elements besides subtitle can you have in title? etc.). So I offer here an even narrower solution: a rigorous hierarchy of title -> subtitle -> part -> section -> paragraph, which is consistent with what you ask but much stricter. If you want to provide more information about the allowed content in each element, that can easily be done. (By the way, if you have come this far in reading this explanation, I would appreciate it if you dropped me a line in acknowledgement, so that I know that there is someone who is willing to read through pages and pages of technical copy in order to follow some reasoning.) It's a fairly long schema because, rather than having a single type for title, subtitle, part, section and paragraph, I had to create an individual type for each, every one a subtype of hierarchy with a different group of allowed elements inside.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
  xmlns:an="http://www.akomantoso.org/2.0" xmlns="http://www.akomantoso.org/2.0"
  targetNamespace="http://www.akomantoso.org/2.0">
  <xsd:redefine schemaLocation="./akomantoso20.xsd">

    <xsd:complexType name="bodyType">
      <xsd:complexContent>
        <xsd:restriction base="an:bodyType">
          <xsd:choice minOccurs="1" maxOccurs="unbounded">
            <xsd:element ref="componentRef" />
            <xsd:group ref="ANhier1" />
          </xsd:choice>
          <xsd:attributeGroup ref="coreopt" />
        </xsd:restriction>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:redefine>

  <xsd:group name="ANhier1">
    <xsd:choice>
      <xsd:element name="title" type="h1" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="h1">
    <xsd:complexContent>
      <xsd:restriction base="hierarchy">
        <xsd:sequence>
          <xsd:sequence>
            <xsd:element ref="num" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="heading" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="subheading" minOccurs="0" maxOccurs="1" />
          </xsd:sequence>
          <xsd:choice>
            <xsd:sequence minOccurs="0" maxOccurs="1">
              <xsd:element ref="intro" minOccurs="0" maxOccurs="1" />
              <xsd:choice minOccurs="1" maxOccurs="unbounded">
                <xsd:element ref="componentRef" />
                <xsd:group ref="ANhier2" />
              </xsd:choice>
              <xsd:element ref="wrap" minOccurs="0" maxOccurs="1" />
            </xsd:sequence>
            <xsd:element ref="content" minOccurs="0" maxOccurs="1"/>
          </xsd:choice>
        </xsd:sequence>
        <xsd:attributeGroup ref="corereq" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:group name="ANhier2">
    <xsd:choice>
      <xsd:element name="subtitle" type="h2" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="h2">
    <xsd:complexContent>
      <xsd:restriction base="hierarchy">
        <xsd:sequence>
          <xsd:sequence>
            <xsd:element ref="num" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="heading" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="subheading" minOccurs="0" maxOccurs="1" />
          </xsd:sequence>
          <xsd:choice>
            <xsd:sequence>
              <xsd:element ref="intro" minOccurs="0" maxOccurs="0" />
              <xsd:choice minOccurs="1" maxOccurs="unbounded">
                <xsd:element ref="componentRef" />
                <xsd:group ref="ANhier3" />
              </xsd:choice>
              <xsd:element ref="wrap" minOccurs="0" maxOccurs="1" />
            </xsd:sequence>
            <xsd:element ref="content" />
          </xsd:choice>
        </xsd:sequence>
        <xsd:attributeGroup ref="corereq" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:group name="ANhier3">
    <xsd:choice>
      <xsd:element name="part" type="h3" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="h3">
    <xsd:complexContent>
      <xsd:restriction base="hierarchy">
        <xsd:sequence>
          <xsd:sequence>
            <xsd:element ref="num" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="heading" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="subheading" minOccurs="0" maxOccurs="1" />
          </xsd:sequence>
          <xsd:choice>
            <xsd:sequence>
              <xsd:element ref="intro" minOccurs="0" maxOccurs="0" />
              <xsd:choice minOccurs="1" maxOccurs="unbounded">
                <xsd:element ref="componentRef" />
                <xsd:group ref="ANhier4" />
              </xsd:choice>
              <xsd:element ref="wrap" minOccurs="0" maxOccurs="1" />
            </xsd:sequence>
            <xsd:element ref="content" />
          </xsd:choice>
        </xsd:sequence>
        <xsd:attributeGroup ref="corereq" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:group name="ANhier4">
    <xsd:choice>
      <xsd:element name="section" type="h4" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="h4">
    <xsd:complexContent>
      <xsd:restriction base="hierarchy">
        <xsd:sequence>
          <xsd:sequence>
            <xsd:element ref="num" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="heading" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="subheading" minOccurs="0" maxOccurs="1" />
          </xsd:sequence>
          <xsd:choice>
            <xsd:sequence>
              <xsd:element ref="intro" minOccurs="0" maxOccurs="0" />
              <xsd:choice minOccurs="1" maxOccurs="unbounded">
                <xsd:element ref="componentRef" />
                <xsd:group ref="ANhier5" />
              </xsd:choice>
              <xsd:element ref="wrap" minOccurs="0" maxOccurs="1" />
            </xsd:sequence>
            <xsd:element ref="content" />
          </xsd:choice>
        </xsd:sequence>
        <xsd:attributeGroup ref="corereq" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:group name="ANhier5">
    <xsd:choice>
      <xsd:element name="paragraph" type="h5" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="h5">
    <xsd:complexContent>
      <xsd:restriction base="hierarchy">
        <xsd:sequence>
          <xsd:sequence>
            <xsd:element ref="num" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="heading" minOccurs="0" maxOccurs="1" />
            <xsd:element ref="subheading" minOccurs="0" maxOccurs="1" />
          </xsd:sequence>
          <xsd:choice>
            <xsd:sequence minOccurs="0" maxOccurs="0">
              <xsd:element ref="intro" minOccurs="0" maxOccurs="0" />
              <xsd:choice minOccurs="1" maxOccurs="unbounded">
                <xsd:element ref="componentRef" />
                <xsd:group ref="ANhier6" />
              </xsd:choice>
              <xsd:element ref="wrap" minOccurs="0" maxOccurs="0" />
            </xsd:sequence>
            <xsd:element ref="content" />
          </xsd:choice>
        </xsd:sequence>
        <xsd:attributeGroup ref="corereq" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:group name="ANhier6">
    <xsd:choice>
    </xsd:choice>
  </xsd:group>

</xsd:schema>


--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/





