legaldocml message

Subject: Chile's debate document
From: Fabio Vitali <fabio@cs.unibo.it>
To: legaldocml@lists.oasis-open.org
Date: Thu, 18 Oct 2012 13:22:09 +0200
Dear all, 

as promised in the conference call of October 10th, I finally managed to get a good look at the Chile debate example, and I would like to share some comments that I have about it. 

First of all, I would like to commend the thoroughness of the work. Every single opportunity for annotation has been exploited, and the result is an incredibly rich collection of facts and assertions over a 220-page document. Of course, this being a draft, not all entities have been identified, but they have been spelt out, and this is commendable on all levels. Congratulations. 

The second thing I found interesting and useful is the systematic recourse to custom attributes such as bcn:classProbability, bcn:uriProbability and bcn:found, which I deduce are the output of an automatic algorithm that tries to interpret the nature and identity of entities (such as individuals, roles, organizations, etc.) mentioned in the text, providing an assessment of confidence in the results found. The purpose of this assessment, I gather, is to help a human drafter, in a later stage, to verify the uncertain attributions and correct the wrong ones. I also assume that, after this step is performed, such attributes are removed and never appear in the published document. 
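
A minimal Python sketch of that assumed clean-up step (removing the review attributes before publication) might look like the following. The namespace URI is a placeholder, since the real bcn namespace is not given in this message, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the real bcn namespace is not stated in this message.
BCN_NS = "http://example.org/bcn"
# The three recognizer attributes named above.
REVIEW_ATTRS = ("classProbability", "uriProbability", "found")

SAMPLE = f"""
<from xmlns:bcn="{BCN_NS}">El señor <person id="p847" refersTo="#Ascencio"
  bcn:classProbability="0.9" bcn:uriProbability="0.98" bcn:found="true">ASCENCIO</person></from>
""".strip()


def strip_review_attributes(root):
    """Delete the confidence attributes from every element; return how many were removed."""
    removed = 0
    for element in root.iter():
        for name in REVIEW_ATTRS:
            key = f"{{{BCN_NS}}}{name}"  # ElementTree stores names as {namespace}local
            if key in element.attrib:
                del element.attrib[key]
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = strip_review_attributes(root)
person = root.find("person")
print(removed_count, sorted(person.attrib))
```

Only the three attributes listed in REVIEW_ATTRS are touched; any other bcn attribute is left in place.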

Now to the list of the problems I found. Please do not assume that, since this list is longer than the list of good things, I did not like the example. I actually liked it A LOT, but explaining issues takes longer than offering congratulations for what is good and innovative. 

The first thing one notices when examining the document is the zeal in capturing and expressing every detail of the markup. Sometimes such zeal is excessive. I will give you a few examples: 

--

  <speech by="#Ascencio">
    <from>El señor <person id="p8139" refersTo="#Ascencio">ASCENCIO</person> 
       (<role id="a132" refersTo="#cargo_322">Presidente</role>).- </from>
    <p>El señor Prosecretario va a dar lectura a la Cuenta. </p>
  </speech>

The speech element already has two attributes, "by" and "as", that are meant to identify the author of the speech itself and his/her role when speaking. For this reason, it is not necessary to repeat the same information in the <from> element. The fragment can simply be rendered as: 

  <speech by="#Ascencio" as="#cargo_322">
    <from>El señor ASCENCIO (Presidente).- </from>
    <p>El señor Prosecretario va a dar lectura a la Cuenta. </p>
  </speech>

--

  <debateSection id="ds9" name="Homenajes">
	<heading>VI. HOMENAJE</heading>
	<debateSection id="ds9-h1" name="Homenaje">
	  <heading>HOMENAJE EN MEMORIA DEL EX DIPUTADO DON <person id="p332" refersTo="#persona" bcn:classProbability="0.1" bcn:uriProbability="0.1" bcn:found="false">FÉLIX ERNESTO IGLESIAS</person> CORTÉS.</heading>
	  <p/>
	  <debateSection id="ds9-db1-part1" name="Participacion" refersTo="#Forni" bcn:uriRol="#cargo_1" bcn:uriTipoParticipacion="#Homenaje">
		<speech by="#Ascencio">
		  <from>  El señor <person id="p847" refersTo="#Ascencio" bcn:classProbability="0.9" bcn:uriProbability="0.98" bcn:found="true">ASCENCIO</person> (<role id="a127" refersTo="#cargo_322" bcn:classProbability="91" bcn:uriProbability="0.89" bcn:found="true">Presidente</role>).- </from>
		  <p>Corresponde rendir homenaje al ex diputado don Félix Ernesto Iglesias Cortés, recientemente fallecido. </p>
		  <p>  Se encuentran en la tribuna de honor la señora Silvia Castellanos viuda de Iglesias y los hijos, familiares y amigos de nuestro homenajeado, a quienes agradecemos su presencia.</p>
		  <p/>
		  <p>  (Aplausos).</p>
		  <p/>
		  <p>  Tiene la palabra el diputado señor Marcelo Forni. </p>
		  <p/>
		</speech>

I do not see the need to nest so many debateSections within each other. I think that a much simpler structure can be obtained with much less nesting: 

   <debateSection id="ds9" name="Homenajes">
    <num>VI. </num>
    <heading>HOMENAJE</heading>
    <subheading>HOMENAJE EN MEMORIA DEL EX DIPUTADO DON FÉLIX ERNESTO IGLESIAS CORTÉS.</subheading>
      <speech by="#Ascencio" as="#cargo_322">
        <from>El señor ASCENCIO (Presidente).-</from>
          <p>Corresponde rendir homenaje al ex diputado don <person id="p332" refersTo="#Iglesias">Félix 
             Ernesto Iglesias Cortés</person>, recientemente fallecido. </p>
          <p>Se encuentran en la tribuna de honor la señora Silvia Castellanos viuda de Iglesias y los hijos, 
             familiares y amigos de nuestro homenajeado, a quienes agradecemos su presencia.</p>
          <p><remark type="sceneDescription">(Aplausos)</remark>.</p>
          <p>Tiene la palabra el diputado señor Marcelo Forni. </p>
       </speech>

I agree that deciding whether a fragment is its own subsection or part of the current one is a matter of sensibility, but I think the nesting here is a bit excessive. 

--

The elements outcome and vote are used improperly. I do not blame you: examples are scarce, and I think that your experience can help us create a much better grammar for debates. 

  <debateSection id="od01-pl01" name="ProyectoDeLey" bcn:uriProyectoLey="#p3556-15"
bcn:uriResultadoDebate="#SeAprueba"
bcn:uriTramiteConstitucional="#ITercer-Tramite-Camara-de-Origen"
bcn:uriTramiteReglamentario="#DiscusionParticular">
    <!--                         -->
    <!-- speeches                -->
    <!--                         -->
    <speech by="#Ascencio">
      <from>  El señor <person id="p358" refersTo="#Ascencio" bcn:classProbability="0.9"
bcn:uriProbability="0.98" bcn:found="true">ASCENCIO</person> (<role id="a194"
refersTo="#cargo_322" bcn:classProbability="91" bcn:uriProbability="0.89"
bcn:found="true">Presidente</role>).- Ofrezco la palabra.</from>
      <p>- Aprobadas.</p>
      <p/>
    </speech>
    <summary>
      <outcome refersTo="#aFavor">  
        -Votaron por la afirmativa los siguientes señores diputados:
        <vote by="#persona">
          <person id="p3583" refersTo="#Ascencio" bcn:classProbability="0.9"
bcn:uriProbability="0.98" bcn:found="true">Accorsi Opazo Enrique;</person>
        </vote>
        <!--                         -->
        <!-- 98 more vote statements -->
        <!--                         -->
      </outcome>
    </summary>
    <p/>
    <summary>
      <outcome refersTo="#seAbstiene">
        -Se abstuvieron los diputados señores:<eol/>
        <vote by="#persona">
          <person id="p3517" refersTo="#Ascencio" bcn:classProbability="0.9"
bcn:uriProbability="0.98" bcn:found="true">Galilea Vidaurre José Antonio;</person>
        </vote>
        <vote by="#persona">
          <person id="p3528" refersTo="#Ascencio" bcn:classProbability="0.9"
bcn:uriProbability="0.98" bcn:found="true">Kuschel Silva Carlos Ignacio.</person>
        </vote>
      </outcome>
    </summary>
  </debateSection>

The outcome should be used as an inline around the words that specifically express the outcome of the vote. Similarly, the vote element already contains attributes to specify the voter and the vote, as follows: 

  <debateSection id="od01-pl01" name="ProyectoDeLey">
    <heading>EXIGENCIA DE LICENCIA CLASE F PARA CONDUCIR VEHÍCULOS DE EMERGENCIA DE BOMBEROS. 
      <docStage refersTo="#Tercer-Tramite-Camara-de-Origen">Tercer trámite constitucional</docStage>.
    </heading>
    <!--                         -->
    <!-- speeches                -->
    <!--                         -->
    <speech by="#Ascencio" as="#cargo_322">
	  <from>El señor ASCENCIO (Presidente).- </from>
	  <p><outcome refersTo="#vote1">Aprobadas</outcome>.</p>
    </speech>
    <summary>
      -Votaron por la afirmativa los siguientes señores diputados:
      <vote by="#Accorsi" id="vt0001" choice="#afirmativo">Accorsi Opazo Enrique</vote>;
      <vote by="#Aguiló" id="vt0002" choice="#afirmativo">Aguiló Melo Sergio</vote>;
      <!--                         -->
      <!-- 98 more vote statements -->
      <!--                         -->
    </summary>
    <summary>
      -Se abstuvieron los diputados señores:
      <vote by="#Galilea" id="vt0087" choice="#abstención">Galilea Vidaurre José Antonio</vote>;
      <vote by="#Kuschel" id="vt0088" choice="#abstención">Kuschel Silva Carlos Ignacio</vote>.
    </summary>
  </debateSection>
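
With the voter and the choice carried by the vote element itself, a consumer can tally a roll call without parsing the surrounding text. The following is a minimal Python sketch under that assumption; it uses a cut-down copy of the fragment above with no namespace declared, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Cut-down copy of the proposed markup; the Akoma Ntoso namespace is omitted for brevity.
SAMPLE = """
<debateSection id="od01-pl01" name="ProyectoDeLey">
  <summary>
    <vote by="#Accorsi" id="vt0001" choice="#afirmativo">Accorsi Opazo Enrique</vote>;
    <vote by="#Aguiló" id="vt0002" choice="#afirmativo">Aguiló Melo Sergio</vote>;
  </summary>
  <summary>
    <vote by="#Galilea" id="vt0087" choice="#abstención">Galilea Vidaurre José Antonio</vote>;
    <vote by="#Kuschel" id="vt0088" choice="#abstención">Kuschel Silva Carlos Ignacio</vote>.
  </summary>
</debateSection>
""".strip()


def tally_votes(section):
    """Count vote elements per value of their choice attribute."""
    return Counter(vote.get("choice") for vote in section.iter("vote"))


section = ET.fromstring(SAMPLE)
tally = tally_votes(section)
voters = [vote.get("by") for vote in section.iter("vote")]
print(dict(tally))
```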

--

The most evident and most delicate issue is the systematic use of new attributes belonging to a different namespace. Although this is done correctly and according to the guidelines for custom extensions to Akoma Ntoso, I can't help feeling that in many cases there was no real need for it, and that there is just ONE small modification to Akoma Ntoso through which one can obtain basically everything that is needed here. I give a few examples: 

<analysis source="#bcn">
  <parliamentary>
    <voting id="v1" outcome="#aprobacionUnanime" href="#ct1-db1-vot1">
      <count id="v1-c1" value="0" refersTo="#SinConteo"/>
    </voting>
    ...

  <debateSection id="ct1-db1" name="Debate" bcn:uriResultadoDebate="#SeAprueba">
    <!-- speeches -->
    <summary>
      <outcome refersTo="#aprobacionUnanime">Acordado.</outcome>
    </summary>
  </debateSection>

The result of the debate would be better handled through the <parliamentary> element, which you are actually using already, although for some reason you then re-express the same information elsewhere. A small modification in the use of the <outcome> element provides all the information you need. Instead of having both the outcome element of the debate and the outcome attribute of the voting point to the TLCConcept "aprobacionUnanime", let the refersTo attribute of the outcome element point to the voting element inside the parliamentary element, which provides all the information needed to tell whether it was approved, and by how much. 

<analysis source="#bcn">
  <parliamentary>
    <voting id="v1" outcome="#aprobacionUnanime" href="#ct1-db1">
      <count id="v1-c1" value="0" refersTo="#SinConteo"/>
    </voting>
    ...

  <debateSection id="ct1-db1" name="Debate">
    <!-- speeches -->
    <summary>
      <outcome refersTo="#v1">Acordado.</outcome>
    </summary>
  </debateSection>
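
To show how a reader of the document could follow that pointer, here is a minimal Python sketch that resolves each outcome to the voting it refers to. The wrapping <debate> root is only there to make the fragment parseable, no namespace is declared, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two fragments above, wrapped in an illustrative root so they parse as one document.
SAMPLE = """
<debate>
  <analysis source="#bcn">
    <parliamentary>
      <voting id="v1" outcome="#aprobacionUnanime" href="#ct1-db1">
        <count id="v1-c1" value="0" refersTo="#SinConteo"/>
      </voting>
    </parliamentary>
  </analysis>
  <debateSection id="ct1-db1" name="Debate">
    <summary>
      <outcome refersTo="#v1">Acordado.</outcome>
    </summary>
  </debateSection>
</debate>
""".strip()


def resolve_outcomes(root):
    """Pair each outcome's text with the outcome and href of the voting it points to."""
    votings = {voting.get("id"): voting for voting in root.iter("voting")}
    resolved = []
    for outcome in root.iter("outcome"):
        target_id = outcome.get("refersTo", "").lstrip("#")
        voting = votings.get(target_id)
        if voting is not None:  # skip outcomes that do not point at a voting
            resolved.append((outcome.text, voting.get("outcome"), voting.get("href")))
    return resolved


resolved = resolve_outcomes(ET.fromstring(SAMPLE))
print(resolved)
```

The concept "aprobacionUnanime" is then stated once, on the voting element, and the text of the debate merely points to it.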

--

<debateSection id="ds8-db1-part1" name="Participacion" refersTo="#Urrutia" bcn:uriRol="#cargo_1" bcn:uriTipoParticipacion="#intervencion">
...
<debateSection id="od01-pl01" name="ProyectoDeLey" bcn:uriProyectoLey="#p3556-15" bcn:uriResultadoDebate="#SeAprueba" bcn:uriTramiteConstitucional="#ITercer-Tramite-Camara-de-Origen" bcn:uriTramiteReglamentario="#DiscusionParticular">


I fail to understand what the role of many of these attributes is (in particular uriRol and uriTipoParticipacion), but if the purpose is to add further interpretation data to the section and its content, the right place is within the <analysis> section, probably in the <otherAnalysis> section, with metadata elements pointing to the debateSection via its id, as follows: 

  <otherAnalysis source="#bcnMetadata">
      <bcn:debateAnalysis>
        <bcn:section href="#ds8-db1-part1">
          <bcn:rol value="#cargo_1"/>
          <bcn:tipoParticipacion value="#intervencion"/>
        </bcn:section>
        <bcn:section href="#od01-pl01">
          <bcn:uriProyectoLey value="#p3556-15"/>
          <bcn:uriResultadoDebate value="#SeAprueba"/>
          <bcn:uriTramiteConstitucional value="#ITercer-Tramite-Camara-de-Origen"/>
          <bcn:uriTramiteReglamentario value="#DiscusionParticular"/>
        </bcn:section>
      </bcn:debateAnalysis>
  </otherAnalysis>
  ...
  <debateSection id="ds8-db1-part1" name="Participacion">
  ...
  <debateSection id="od01-pl01" name="ProyectoDeLey">
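
Such a relocation could be automated. The Python sketch below moves every bcn attribute of a debateSection into a bcn:section entry keyed by the section id. It is an illustration only: the namespace URI is a placeholder, the function name is hypothetical, and each attribute's local name is reused unchanged as the element name, which differs slightly from the hand-written naming above.

```python
import xml.etree.ElementTree as ET

BCN_NS = "http://example.org/bcn"  # placeholder: the real bcn namespace is not given here
ET.register_namespace("bcn", BCN_NS)
PREFIX = f"{{{BCN_NS}}}"

SAMPLE = f"""
<debate xmlns:bcn="{BCN_NS}">
  <debateSection id="ds8-db1-part1" name="Participacion" refersTo="#Urrutia"
      bcn:uriRol="#cargo_1" bcn:uriTipoParticipacion="#intervencion"/>
</debate>
""".strip()


def extract_section_metadata(root):
    """Move bcn attributes off each debateSection into a bcn:debateAnalysis element."""
    analysis = ET.Element(f"{PREFIX}debateAnalysis")
    for section in root.iter("debateSection"):
        custom = [key for key in section.attrib if key.startswith(PREFIX)]
        if not custom:
            continue
        entry = ET.SubElement(analysis, f"{PREFIX}section", {"href": "#" + section.get("id")})
        for key in custom:
            # pop() removes the attribute from the section and hands its value to the new element
            ET.SubElement(entry, key, {"value": section.attrib.pop(key)})
    return analysis


root = ET.fromstring(SAMPLE)
analysis = extract_section_metadata(root)
print(ET.tostring(analysis, encoding="unicode"))
```

Only attributes in the bcn namespace are moved; standard attributes such as id, name and refersTo stay on the debateSection.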

--

Funnily enough, in the otherAnalysis section you DO use custom elements for information for which a specific structure already exists: 

    <bcn:MetadataBCN>
      <bcn:Materia id="met1" refParteDocumento="#ds1-pap1-com1" rdfLabelMateria="Renuncia" uriMateria="/recurso/materias/renuncia"/>
      <bcn:Materia refParteDocumento="#ds1-pap2-ws5" rdfLabelMateria="Cultura" uriMateria="/recurso/materias/cultura"/>
      <bcn:Materia refParteDocumento="#ds1-pap2-ws5" rdfLabelMateria="Patrimonio" uriMateria="/recurso/materias/patrimonio"/>
      <bcn:TerminosLibres refParteDocumento="#ds1-pap1-com1" valor="renuncia al cargo de Primer Vicepresidente"/>
      <bcn:TerminosLibres refParteDocumento="#ds1-pap2-ws5" valor="petroglifos norte"/>
      <bcn:TerminosLibres refParteDocumento="ds9-h1" valor="homenaje para EX DIPUTADO DON FÉLIX ERNESTO IGLESIAS "/>
      <bcn:AtributosDiarioSesiones bcn:uriResultadoSesion="#Exitosa"/>
    </bcn:MetadataBCN>

Except for the last item, Materia and TerminosLibres are clearly keywords, for which a specific section in the document exists. The limitation of keywords is that so far they can only refer to the document as a whole, while you make them refer to a specific fragment only. 
Therefore, I suggest that we add an optional href attribute to the keyword element, so that it becomes possible to add metadata pointing to individual fragments of the document, as follows: 

  <classification source="#bcn">
	<keyword href="#ds1-pap1-com1" showAs="Renuncia" value="/recurso/materias/renuncia" dictionary="#materias"/>
	<keyword href="#ds1-pap2-ws5" showAs="Cultura" value="/recurso/materias/cultura" dictionary="#materias"/>
	<keyword href="#ds1-pap2-ws5" showAs="Patrimonio" value="/recurso/materias/patrimonio" dictionary="#materias"/>
	<keyword href="#ds1-pap1-com1" value="renuncia al cargo de Primer Vicepresidente" showAs="" dictionary="#terminoslibres"/>
	<keyword href="#ds1-pap2-ws5" value="petroglifos norte" showAs="" dictionary="#terminoslibres"/>
	<keyword href="#ds9-h1" value="homenaje para EX DIPUTADO DON FÉLIX ERNESTO IGLESIAS " showAs="" dictionary="#terminoslibres"/>
  </classification>
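
To illustrate what the proposed href would enable, the following Python sketch groups keywords by the fragment they point to, treating a keyword without href as applying to the whole document. The href attribute is the proposal made here, not existing Akoma Ntoso; the last keyword in the sample is a hypothetical entry added only to show the whole-document case, and the function name is hypothetical as well.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<classification source="#bcn">
  <keyword href="#ds1-pap1-com1" showAs="Renuncia" value="/recurso/materias/renuncia" dictionary="#materias"/>
  <keyword href="#ds1-pap2-ws5" showAs="Cultura" value="/recurso/materias/cultura" dictionary="#materias"/>
  <keyword href="#ds1-pap2-ws5" showAs="Patrimonio" value="/recurso/materias/patrimonio" dictionary="#materias"/>
  <!-- hypothetical keyword without href: applies to the whole document -->
  <keyword showAs="Debate" value="/recurso/materias/debate" dictionary="#materias"/>
</classification>
""".strip()


def keywords_by_fragment(classification):
    """Map each href (None for whole-document keywords) to the list of keyword values."""
    grouped = {}
    for keyword in classification.iter("keyword"):
        grouped.setdefault(keyword.get("href"), []).append(keyword.get("value"))
    return grouped


grouped = keywords_by_fragment(ET.fromstring(SAMPLE))
print(grouped)
```

Because the attribute is optional, existing documents whose keywords carry no href keep their current meaning.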

--

As a final note, let me point out that the automatic recognition algorithm mysteriously failed to recognize the TOC block "XI. Otros documentos de la Cuenta" as a table-of-contents fragment, so the document starts with a strange debateSection full of "other" sections that serve no purpose. And, really finally, I changed the roll call into a table and added references to the political party, region and district of each member of parliament mentioned in it. Attached is a simplified version of the document, covering up to page 28 of the whole 220. 

I hope you appreciate my critiques, and that we can end up agreeing. 

Ciao

Fabio

--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/
Attachment: Sesion56_2_esquema3-simplified.xml
Description: application/xml