Subject: RE: [legalxml-econtracts] Official XML Records
A lengthy reply follows, discussing (1) what is
"semantic markup" and what is not; (2) conformance requirements
and validation; (3) a law firm's own, private markup of a legal
document; and (4) a look ahead to an xml:names attribute, to XForms, and to the
impact on other OASIS dialects and on LegalXML work
products.
=========================================
Rolly,
I didn't define my term "semantic markup" -- it's all
the elements (the vocabulary) being defined by LegalXML TCs, including articles,
sections, clauses, document titles, captions, party names and addresses,
signature blocks, and so on. Non-semantic markup is that required for
presentation -- in the case of SVG, that includes drawing surfaces, fonts, text
strings, and scalable-vector-graphic and bit-mapped images. In the case of
XSL-FO, non-semantic markup includes page sequences, repeating and non-repeating page areas,
tables, lists, and text/image blocks. In the case of XHTML, non-semantic
markup includes the usual suspects - paragraphs, headings, block and inline
objects, tables, and lists.
Your point is well-taken, though, that there is a set
of semantic markup that should not ever be considered required for a conforming
LegalXML document. Personally, I feel that NO semantic markup should be
"required" in order for an XML datastream to be considered a conforming LegalXML
document -- all that would be required is merely that it be encoded
using any of the XML dialects that we designate as being a "presentation
dialect", e.g., XHTML, SVG, or XSL-FO -- and that its encoding abide by the
constraints that we would impose on the use of that dialect for legal
documents.
For instance, I would want to insist
that legal documents encoded in XHTML not be allowed to contain (1)
<object>; (2) <script>; or (3) certain types of <link>
elements. Similar constraints, which foreclose modification of the document's
content, would be designated for SVG and XSL-FO streams. And, of course,
a conforming XML document, regardless of its presentation dialect, would not
be allowed to contain an <?xml-stylesheet?> processing instruction whose
type is anything BUT 'text/css'. Finally, I would be sure to ban all elements
and attributes which are from a namespace other than LegalXML's, or are in any
way not defined by a W3C specification -- precluding, for instance, Microsoft's
data-binding attributes on <input> elements.
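To make those constraints concrete, here is a rough sketch of such a check in Python (standard library only). It is an illustration under assumptions, not a proposed implementation: the LegalXML namespace URI and the function name find_violations are made up for the example, the "certain types of <link>" rule is left out because it has not been specified, and un-namespaced vendor attributes (such as the data-binding ones) are left to ordinary DTD or Schema validation.

```python
import re
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"
LGL_NS = "urn:example:legalxml"  # hypothetical URI, assumed for this sketch
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

BANNED_XHTML_ELEMENTS = {"object", "script"}
# None covers ordinary un-namespaced attributes of the host dialect.
ALLOWED_ATTR_NAMESPACES = {None, LGL_NS, XML_NS, XMLNS_NS}


def find_violations(xml_text):
    """Return a list of messages, one per taboo construct found."""
    doc = minidom.parseString(xml_text)
    violations = []

    # xml-stylesheet processing instructions live in the document prolog.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            match = re.search(r"""\btype\s*=\s*["']([^"']*)["']""", node.data)
            if match is None or match.group(1) != "text/css":
                violations.append("xml-stylesheet type is not text/css")

    for el in doc.getElementsByTagName("*"):
        if el.namespaceURI != XHTML_NS:
            violations.append("foreign element <%s>" % el.tagName)
        elif el.localName in BANNED_XHTML_ELEMENTS:
            violations.append("banned element <%s>" % el.tagName)
        for attr in el.attributes.values():
            if attr.namespaceURI not in ALLOWED_ATTR_NAMESPACES:
                violations.append("foreign attribute %s" % attr.name)

    return violations
```

The function only reports; whether a violation means outright rejection of the document is a policy question for the specification, not for the scanner.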
LegalXML would therefore define, in its own namespace, a single
global attribute, using XML Schema -- a "names" attribute -- and publish a
short specification of Do's and Don'ts for what constitutes a conforming
presentation document. Sure, an XSL-T stylesheet can be instantly written and
distributed that scans for taboo elements, attributes, and processing
instructions. The only job after that is to define what the
"names" attribute may contain -- still a lot of work! As far as I know, LegalXML
would not define any XML elements, just the single global attribute, and a
vocabulary, that is, an ontology that can be extended by one's own law firm
-- thereby preserving computability of the vocabulary terms across software
products (via RDF inheritance).
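As a loose illustration of what "extended by one's own law firm" could mean in practice, the Python sketch below models inheritance as a plain child-to-parent table rather than real RDF. Every term except Contract.ReferenceDate.date is invented for the example, and the function name is hypothetical. The point is only that software which has never seen a firm's term can still fall back to the published term it derives from.

```python
# Terms published by the standard (illustrative; only the second appears in this post).
STANDARD_TERMS = {"Contract.Party.Name", "Contract.ReferenceDate.date"}

# A firm's private extensions, each pointing at the term it specializes (assumed data).
FIRM_PARENTS = {
    "Contract.Party.Name.TradeName": "Contract.Party.Name",
    "Contract.Party.Name.TradeName.Former": "Contract.Party.Name.TradeName",
}


def nearest_standard_term(term, parents, standard):
    """Walk up the parent chain until a published term is reached.

    Returns None when the chain ends (or loops) without reaching one.
    """
    seen = set()
    while term not in standard:
        if term in seen or term not in parents:
            return None
        seen.add(term)
        term = parents[term]
    return term
```

For example, nearest_standard_term("Contract.Party.Name.TradeName.Former", FIRM_PARENTS, STANDARD_TERMS) walks two steps up and lands on "Contract.Party.Name".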
The validation of this document would therefore proceed
normally -- against the DTD or XML Schema defined for the XHTML, SVG, or XSL-FO
dialects. The semantic validation can occur separately, as it should,
from the main business at hand: exchange of the presentation document that IS
the official record. A document that is semantically invalid is not legally
invalid, if it follows the other rules. Consider this approach to validation as
the "next step" in the evolution of XML document validation: from the
distinction between a "well-formed" XML document and a "valid" XML
document, the process now addresses whether a document is
"physically valid" and whether it is "semantically valid". Semantic validation
occurs using the ontology we'd have published.
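A minimal Python sketch of that second, semantic pass follows. It assumes a hypothetical namespace URI for LegalXML, assumes that a "names" value is a whitespace-separated list, and treats the published ontology as a simple set of terms; none of that has been decided. Note that it returns a report instead of rejecting the document, in keeping with the point that a semantically invalid document is not thereby legally invalid.

```python
import xml.etree.ElementTree as ET

LGL_NS = "urn:example:legalxml"  # hypothetical URI, assumed for this sketch
NAMES_ATTR = "{%s}names" % LGL_NS


def unknown_names(xml_text, vocabulary):
    """Return the assigned names that are not in the given vocabulary.

    ET.fromstring raises ParseError if the document is not well-formed;
    validity against the dialect's DTD or Schema is a separate, earlier step.
    """
    root = ET.fromstring(xml_text)
    unknown = []
    for el in root.iter():
        # Assumption: several names may be given, separated by whitespace.
        for name in el.get(NAMES_ATTR, "").split():
            if name not in vocabulary:
                unknown.append(name)
    return unknown
```

An empty list means the document is semantically valid against that vocabulary; a non-empty list is information for the recipient, nothing more.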
This is an easily marketable, non-intrusive approach to
standards-making. It is one that does not require anyone to buy special software
to create a document which may then be deemed a "legal" XML document, opening
the door wide to many technical implementations of varying complexity. It is
intuitively graspable by technologists, attorneys, and judges, and it frees the
legal brethren in LegalXML from having to understand all the details of a
profession that is not their own - software engineering. Finally, this approach
leads straight towards establishing a javascript-like language that is
understandable to power-users in the legal profession; it simply leverages the
content of the "names" attribute.
The key is a global "names" attribute that
contains the name(s) one has assigned to the blocks, strings, and images
in the presentation document. To prevent one's "work product" from
being shared with another party would then be trivial --
just store one's assigned names in an attribute with a different
namespace than LegalXML's (for instance, the firm's own namespace). When
the document is "prepared" for transmission as a "legal document", then that
attribute would naturally be stripped out by an XSL-T stylesheet, for
example. The tool used to annotate the document would simply need to know what
namespace prefix to use for the attribute holding the name being assigned to the
block, string, or image -- is it a lgl:names, or is it myFirm:names? The
vocabulary relevant to the namespace-prefixed names attribute is pointed at
by the URI of the namespace, defined using the standard "xmlns" attribute....
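The stripping step could equally be done outside XSL-T. Here is a short Python sketch, using only the standard library, that removes every attribute belonging to a given private namespace before the document is sent. The namespace URIs in the usage note and the function name are placeholders for the example.

```python
import xml.etree.ElementTree as ET


def strip_namespace_attributes(xml_text, private_ns):
    """Return the document with all attributes in private_ns removed."""
    root = ET.fromstring(xml_text)
    marker = "{%s}" % private_ns  # ElementTree keys look like "{uri}local"
    for el in root.iter():
        for key in [k for k in el.attrib if k.startswith(marker)]:
            del el.attrib[key]
    return ET.tostring(root, encoding="unicode")
```

Given a paragraph carrying both lgl:names and myFirm:names, calling strip_namespace_attributes(text, "urn:example:myfirm") leaves only the lgl:names attribute. ElementTree writes namespace declarations only for namespaces still in use, so the firm's xmlns declaration disappears as well; it also renames prefixes (ns0, ns1, ...) unless they are registered with ET.register_namespace.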
This attribute COULD be defined by the W3C as an
XML attribute, containing "colonized" names, e.g., <span
xml:names='lgl:Contract.ReferenceDate.date'>, or it could be defined by
LegalXML itself. Personally, I'd rather see xml:names, but I think that happy
result can come about only as a result of a clear decision by LegalXML that it
is necessary for the purposes of accommodating anticipated legal requirements
that eventually would be imposed by courts venturing into this arena.
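To show how a tool might read such a "colonized" name, here is a Python sketch that resolves the prefix of each name against the xmlns declarations in scope, yielding a (vocabulary URI, dotted path) pair. It is only an illustration: xml:names is not an attribute the W3C has defined, the namespace URI in the usage note is a placeholder, the whitespace-separated list is an assumption, and resolve_names is a made-up function name.

```python
import io
import xml.etree.ElementTree as ET

XML_NAMES = "{http://www.w3.org/XML/1998/namespace}names"


def resolve_names(xml_text):
    """Return (namespace URI, dotted path) for every name in xml:names."""
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    resolved = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, ("start-ns", "end-ns", "start")):
        if event == "start-ns":
            scopes.append(payload)      # payload is a (prefix, uri) pair
        elif event == "end-ns":
            scopes.pop()
        else:
            bindings = dict(scopes)     # inner declarations override outer ones
            for name in payload.get(XML_NAMES, "").split():
                if ":" not in name:
                    resolved.append((None, name))
                    continue
                prefix, _, path = name.partition(":")
                # An undeclared prefix resolves to None.
                resolved.append((bindings.get(prefix), path))
    return resolved
```

With xmlns:lgl bound to urn:example:legalxml on an ancestor, the span above resolves to ("urn:example:legalxml", "Contract.ReferenceDate.date"); the same code handles myFirm: names once that prefix is declared.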
Thanks,
John McClure
Hypergrove Engineering
PS Incidentally, this technique applies equally to
documents encoded using XForms, just as valid a presentation dialect as XHTML in
my opinion; it's just that it's not yet a W3C Technical Recommendation. Once it
is, I would heartily support adding XForms to the list of permissible dialects
for "official records". Also, note that this technique does have the effect of
precluding DocBook, UBL, Open Office, and other OASIS dialects as permissible
for official, legal records. They can, however, just as easily define their
own names attribute for presentation elements, thereby achieving
transformability to and from their dialects (assuming, however, that they define
no attributes of consequence) -- this would accommodate the wealth
of tools created for those dialects.
Thanks,
<div
nttp:names='Posting.Author.FullName.en'>John
McClure</div>
<div
nttp:names='Posting.Author.Company.Name.en'>Hypergrove
Engineering</div>
<div
nttp:names='Posting.Author.Company.anyURI'>http://www.hypergrove.com</div>