OASIS Mailing List Archives

legaldocml message



Subject: Retrofitting, with a foray into levels and labels


Folks:

I should probably be a little more explicit about the perspective that leads me to raise some of the questions that I've been asking. I'm anxious for AkomaNtoso to be a well-developed and widely-adopted standard, and I believe that in order for that to happen some bulletproofing is needed. Some of that need is technical, and some of it is essentially political. I believe that AkomaNtoso was conceived originally as a standard to be implemented in jurisdictions with little or no prior commitment to any XML or SGML standard, and largely in jurisdictions where it would be inserted into the process at inception -- that is, with legislative and regulatory drafters in a setting where it would then form a lifecycle system for legislative documents from cradle to grave.

Are there any differences between what you'd do in a cradle-to-grave system and what you'd do in a post-hoc publishing system? There's reason to be suspicious, just as there always is when people start asserting exceptionalism in a standards-building process. Our job here is, in some sense, to discover and sanctify similarities, and that's a difficult process that's hard for people used to doing things their own way to wrap their heads around. But there are exceptions that are legitimate. And unfortunately, post-hoc publishers have to accept them, along with a host of other suboptimal ways of doing things.

All that to say that I think we need to pay careful attention to problems of modeling and encoding things as we find them in the wild, where they are often a mess. We look for commonalities and simplicity, but we often have to introduce some complexity in order to cover semi-exceptional cases. And potential adopters are always very quick to say "that doesn't look like *my* data".

A second question of design philosophy has to do with the ultimate product. I got to thinking: is our goal here *only* the simple encoding of what the legislature said, or are we also trying to make the XML document a point of departure for a wider range of use cases? That becomes a confusing question because (whether we are aware of it or not) the use cases that we have at the back of our minds when we're working on these things are mostly about search, and the question then becomes whether we're encoding textual features we can use to make facets for searching. There are other questions we might ask, having to do with applications in which a snippet of legislative text is embedded in something else (say, a web page that says "here is the most current version of Section X"). Or post-hoc validation of documents that have been transformed into AkomaNtoso and not born Akomish. And so on.

Which brings me to my Qu(estion|ibble) of the Week: the business of using arbitrary, locally determined strings as names for elements that encode structure. It seems to me that this creates a situation in which documents can't be validated, but I may be confused, so:

a) If a particular AN user/publisher decides to enforce a hierarchy of elements in which, e.g., "subtitle" is only legitimate within "title", or "section" is only legitimate within "part", how is that done?

b) The million-dollar question for the US Code: if the hierarchy of elements is different within one partition of a corpus from what it is within others, can AN accommodate that and still support validation for the corpus as a whole? Here's why I ask. In most Titles of the US Code, the Part element would be contained within the Chapter element. That isn't true in Title 38; it's the other way around. Could AN accommodate this, and still validate the Code as a whole? (That's not the only such variation, by the way, and the CFR introduces a whole other can of worms, including "anonymous" levels and many, many exceptions to rules about what can be legitimate children of what.)

If not, what would the objection be to a system like this: <level1 label="title"> and <level2 label="subtitle"> or, just to be really wild and crazy, <level n="1" label="title"> <level n="2" label="subtitle">? Less intuitive, sure, but a lot more flexible in what it maps to what, and how it can structure things and still validate.
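
To make that proposal a bit more concrete, here is a minimal sketch (Python, standard library only) of how generic <level> elements might be checked against locally defined containment rules, with one rule table per partition of the corpus. Everything named here is hypothetical: the rule tables, the check_levels function, and the sample document are illustrative assumptions, not part of AkomaNtoso, and the check is done in application code rather than by a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical containment rules: parent label -> labels allowed as children.
# Illustrative only; not taken from AkomaNtoso or from the actual US Code.
DEFAULT_RULES = {
    "title": {"subtitle", "chapter"},
    "subtitle": {"chapter"},
    "chapter": {"part", "section"},
    "part": {"section"},
}

# Assumed table for a partition where Part contains Chapter (as in Title 38).
TITLE_38_RULES = {
    "title": {"part"},
    "part": {"chapter"},
    "chapter": {"section"},
}


def check_levels(xml_text, rules):
    """Return a list of error strings; an empty list means the hierarchy passed."""
    errors = []

    def visit(elem, depth):
        label = elem.get("label")
        # The n attribute is expected to match the nesting depth.
        if elem.get("n") != str(depth):
            errors.append(f"{label!r} has n={elem.get('n')!r}, expected {depth}")
        allowed = rules.get(label, set())
        for child in elem.findall("level"):
            if child.get("label") not in allowed:
                errors.append(f"{child.get('label')!r} not allowed inside {label!r}")
            visit(child, depth + 1)

    visit(ET.fromstring(xml_text), 1)
    return errors


# A made-up fragment in which Part contains Chapter.
SAMPLE = (
    '<level n="1" label="title">'
    '<level n="2" label="part">'
    '<level n="3" label="chapter"/>'
    "</level>"
    "</level>"
)

print(check_levels(SAMPLE, TITLE_38_RULES))  # passes under the Title 38 table
print(check_levels(SAMPLE, DEFAULT_RULES))   # fails under the default table
```

The point of the sketch is only that the same generic markup can be checked against different rule tables for different partitions; whether a schema language could express the same per-partition constraints is a separate question.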

I think I have more questions about how IDs work, but a first one would be ... are they essentially just opaque strings so far as AN is concerned? Some look like they embed structural semantics, at least for convenience/brain-compatibility.

All the best,
T.

--
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Thomas R. Bruce @trbruce
Director, Legal Information Institute
Cornell Law School
http://www.law.cornell.edu/ @liicornell
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+



