legalxml-econtracts message

Subject: Re: [legalxml-econtracts] FW: J McClure RDF proposal for structural markup
From: "jmessing" <jmessing@law-on-line.com>
To: Legalxml-Econtracts TC <legalxml-econtracts@lists.oasis-open.org>, <pmeyer@elkera.com.au>
Date: Thu, 23 Oct 2003 20:19:25 -0400
My own feeling is that we should try to keep in sync with other OASIS TCs and position ourselves within the growing community of web services as much as possible, to keep the TC's work from becoming either too visionary or completely irrelevant.

To this end, I think RDF, while it may ultimately prove useful because of the associations it allows, is presently too far out of the mainstream for consideration in a version 1 econtracts spec.

I also wanted to point out that John Boyer of PureEdge, a noted XML security expert, has expressed the view that HTML and XHTML are poor environments for XML that needs to be secure. Rendering varies between operating systems and browsers, so what is seen in one browser may not be what is seen in another even though the markup is identical. That leaves a security hole in any document presented for signature.

I think at this stage, working with the fundamentals of mark-up and structures may be more profitable than any other expenditure of energy and I hope the TC will continue on with the agenda and schedule it previously set.

My 2 cents worth.

Best regards.

---------- Original Message ----------------------------------
From: Peter Meyer <pmeyer@elkera.com.au>
Reply-To: pmeyer@elkera.com.au
Date:  Fri, 24 Oct 2003 10:03:45 +1000

>Dear TC members,
>
>Earlier in the week I posted a message to John and the TC leadership group
>regarding John's RDF proposal for the structural markup. Daniel Greenwood
>asked me to post it to the TC list. I also think this is necessary to
>clarify some points that may be lost in the discussion:
>
>1. I do not oppose the use of RDF per se. I oppose its use for the core
>structural markup because it is highly inconvenient for authors to create
>and unnecessary at that level.
>
>2. I have not and do not in any way oppose support for linking. Clearly it
>will be required. I don't quite know what that support requires or means but
>that is another question. RDF may or may not be part of the solution. I
>don't have any preferences in that regard. The Harrop/Meyer structural
>markup proposals deliberately did not address the issue as part of
>structural markup. We defined structural markup as a basic layer of markup
>that is needed whether or not you are interested in adding linking
>facilities or many other markup requirements. It's the simplest skeleton to
>provide a platform for everything else. Linking was not part of the
>requirements adopted by the TC for structural markup many months ago. John
>now appears to be saying that the structural markup must use RDF in order to
>support linking.
>
>3. The issue to be determined at this point is not whether RDF is a good
>thing for linking and other purposes but whether it is needed or even useful
>in the basic structural markup of clause level objects in terms of the
>clause model requirements. Separate discussions on the list attempt to
>understand John's assertions and address this question.
>
>Regards
>Peter Meyer
>
>
>-------------------------------------------
>Original message sent on 22 October.
>-------------------------------------------
>
>Dear John,
>
>Last week Jason proposed that we should resolve the issue of RDF or not RDF
>for the structural clause model before moving to resolve the comparatively
>minor differences between us in the Harrop/Meyer clause model proposal.
>
>Personally, I don't claim to be an expert on RDF so I asked one of our
>developers who has done some work with it to review the materials you
>submitted and advise me on the merits of the proposal so I could better
>appreciate the issues involved. After also reviewing your materials, the
>following are our findings (written by me in a rather non-technical way).
>
>
>1. In the clause model requirements, we broke the process down into stages
>in an attempt to try to focus on discrete issues at each stage so that we
>could try to reach a consensus on each point before moving to the next. It
>is extremely difficult to discuss proposals that attempt to cover a lot of
>issues because it's hard to pick out the good from the not so good. I think
>that is part of the issue with your proposal. Your objectives regarding
>linking and so forth are fine but the proposal tries to impose an approach
>that does not fit the overall problem.
>
>
>2. The objectives of structural markup as we perceive it are:
>(a) Define the most basic markup that supports the widest needs without
>burdening the majority of users with markup that they do not need.
>Structural markup as we propose, on its own, will support all the basic
>authoring, data management and publishing needs of most users. Most will
>need to add only a very modest amount of additional metadata to support
>quite powerful automated processes. It is also worth noting that the
>structural markup is intended to be generic for a wide range of legal and
>business documents so that common systems can be used, rather than separate
>systems for each of a range of DTDs or Schema.
>
>(b) Mark up content so that the author's conceptual model for the document
>can be rendered accurately in print and online.
>
>(c) Define basic objects that can be easily reused (in a markup sense) in
>different contexts regardless of the foibles of the authors. In this
>respect, we are trying to develop a model that allows authors to flexibly
>create their content but in a way that maintains simple and consistent
>markup structures.
>
>(d) Define objects onto which additional semantic or other information
>(metadata) can be added to support more sophisticated
>processing, where it's required. The structural model should be a foundation
>for all the uses you envisage and more.
>
>
>3. To achieve these objectives:
>(a) The clause model defines a small number of objects that can be used in
>different contexts to clearly define common grammatical constructs such as
>clause, subclause, paragraph etc. Even though these terms are fuzzy, the
>structure we have used in the clause model requirements is clear and that
>model demonstrates the basic concepts that are important.
>
>(b) It relies on the strict hierarchical and sequential relationship of
>these objects to determine structure. Except, possibly, in some edge cases
>which are not relevant to the initial proposal, nothing more needs to be
>added to the markup to determine the generic document structure.
>
>(c) It relies on a fairly strict relationship between objects to limit the
>different ways authors may achieve a particular desired outcome. [Just how
>strict this needs to be is a matter to be resolved.]
>
>The key point is that an author can insert a few elements and move them
>around the draft document without renaming them and without adding any other
>information. They can create a highly structured document from which a human
>and a processing system can infer the generic structure of the document
>using whatever terminology of clauses, subclauses, paragraphs, lists etc
>they choose. You don't need to call it a clause to know it's a clause in a
>contract document. However, if it is important to someone to actually call
>it a clause rather than an article or something else, that information could
>be added via metadata. That is beyond the scope of the initial proposal but
>I am sure it needs to be addressed in a later stage.
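
The key point in 3 can be illustrated with a minimal sketch in Python. It is not the Harrop/Meyer schema: the sample document, the `contract` and `para` element names, the `TERMS` list and the `outline` helper are all assumptions made for illustration. It shows a processor inferring the generic structure from hierarchy and sequence alone, and an element being moved without being renamed or given any metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: nested <item> elements only, no type metadata.
SAMPLE = """<contract>
  <item><para>Definitions</para>
    <item><para>Goods</para></item>
    <item><para>Price</para></item>
  </item>
  <item><para>Payment</para></item>
</contract>"""

# Assumed mapping from nesting depth to a display term. Another user could
# prefer "article" or "section" without any change to the markup.
TERMS = ["clause", "subclause", "paragraph"]


def outline(parent, depth=0, prefix=()):
    """Return (position path, inferred term, heading) for each nested item."""
    rows = []
    position = 0
    for child in parent:
        if child.tag != "item":
            continue
        position += 1
        path = prefix + (position,)
        term = TERMS[min(depth, len(TERMS) - 1)]
        heading = (child.findtext("para") or "").strip()
        rows.append((".".join(map(str, path)), term, heading))
        rows.extend(outline(child, depth + 1, path))
    return rows


root = ET.fromstring(SAMPLE)
before = outline(root)

# Promote "Price" to the top level: the element is moved, not renamed,
# and nothing else in the document is edited.
definitions = root[0]
price = definitions[2]  # children are: para, item (Goods), item (Price)
definitions.remove(price)
root.append(price)
after = outline(root)

for row in before:
    print("before", row)
for row in after:
    print("after ", row)
```

In this sketch "Price" is reported as a subclause before the move and as a clause after it, purely because of where the element sits.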
>
>
>4. We expect specialist content writers who create standard form documents
>and templates or precedents to create documents in XML. We hope that others
>will also use the standard in their document authoring if its benefits are to be
>fully realised. To overcome the current barriers to the use of XML we have
>to create a DTD or Schema that is REALLY simple, flexible and reliable. It
>has to be easy for authors and provide the consistent structure needed by
>developers.
>
>
>5. We believe that your proposal confuses the issue of what is structural
>markup and its function. It mixes a loose element model and a metadata model
>at all levels of the document. As far as we can see, without the metadata,
>the collection of objects in the document would not tell a human reader or a
>processing system the information they require to know its generic
>structure. You would not know which objects are intended to be clauses and
>are to be numbered etc. The metadata is hardly optional. The problem this
>raises is discussed in more detail later.
>
>
>6. We believe that the metadata should be separated from the basic
>structural markup so that the application does not burden users with markup
>they don't require. See point 2(a) above.
>
>
>7. I am also confused about what you really see as the basic authoring
>environment for XML in law firms etc. You appear to say that XHTML will be
>adequate and that the structural model is unnecessary. If so, why propose
>another model? You may be right that many will not bother with a structural
>model. We do not oppose the use of XHTML if that is what people want to use.
>We agree that the TC should develop a semantic layer that can be mapped to
>XHTML if that is what is considered necessary.
>From our perspective, we believe that a model similar to XHTML will not meet
>the needs of **many** users. We believe that a simpler, stricter, structural
>model is needed to make the task easier for authors and developers.
>
>
>8. It is somewhat difficult to know exactly how your proposed model should
>work. The markup approaches taken in your submission on 9 July and in your
>more recent submission are difficult to reconcile. There seem to be multiple
>ways of associating captions with the text to which they relate. Your model
>creates containers and it allows them to be arranged hierarchically but they
>can also appear in any order, repeatedly, at the top level. Captions can
>float around almost anywhere. It seems to us that an author **could** easily
>create a structure that completely avoids representation of the true
>hierarchical structure of the document and would make the data no more
>useful than HTML.
>
>
>9. Further, there is no way of knowing what function a block is to perform.
>To know what these containers are, one has to add rdf:type elements to
>blocks everywhere. Users can effectively call any object anything they like
>from the dictionary. We believe this will completely frustrate document
>authoring. This will occur at two levels:
>
>(a) The proposal throws all the burden for producing consistent data onto
>the drafting application. You say: "...its relatively foolish to pretend
>that the vast majority of attorneys will ever want to enter XML directly --
>they'll use products that'll choose the correct XML elements for them, that
>will then write the markup for them."
>This is a massive oversimplification of the problem. None of us is expecting
>or advocating that anyone should enter XML markup directly, as in a text
>editor. In our experience, it is possible to provide a very easy to use
>interface for XML authoring so that it can be significantly simpler than
>using a word processor such as Word for most operations.
>Under your model, it will be necessary for the drafting interface to define
>the type of document you want to create so that the user can pick objects
>and apply the correct rdf:type values. In effect, the drafting interface
>becomes the Schema. This potentially means that the interface is much more
>complicated and less flexible than under a true structural model.
>
>(b) Unfortunately, the initial, sequential authoring is only the start of
>most document creation. Authors need to re-order content both in sequence
>and in hierarchical relationship as they edit their work. As soon as this
>happens, the author comes face to face with the markup. The application can
>help with common operations but it gets harder to shield the author. We have
>not seen any XML authoring application which can avoid this with any kind of
>structural model. Your RDF proposal will make it extremely difficult to edit
>content because many such operations will require changes to the metadata.
>We know of no XML authoring tools that could provide authors with a simple
>way of doing this based on the RDF syntax. In other words, there is no
>current application support for your approach. RDF authoring tools that are
>available are not designed as general content editors that could be used for
>preparation of contracts or other office documents.
>
>
>10. Most users cannot consistently tell you whether an object is a clause, a
>subclause, a paragraph, a list item or whatever. What's more, they don't care.
>This was explained by reference to the example in the clause model
>requirements. As a general proposition, there is no reason to ask them to
>use these terms in the markup. The hierarchical and sequential relationship
>of objects is all that is needed for basic processing. It would not matter
>if item was changed to "gorp", it would work just as well. It's just that
>"item" is not as ugly a word as gorp and it distinguishes the element from
>block or para (whichever is used).
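
The "gorp" remark in point 10 can be checked with a small sketch in Python. The two sample documents and the `shape` and `max_depth` helpers are hypothetical, written only to show that processing driven by hierarchy and sequence gives the same answer whatever the element is called.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: identical nesting, different element name.
ITEM_DOC = (
    "<contract>"
    "<item><para/><item><para/></item></item>"
    "<item><para/></item>"
    "</contract>"
)
GORP_DOC = ITEM_DOC.replace("item", "gorp")


def shape(element):
    """Return nesting and order as nested tuples, ignoring element names."""
    return tuple(shape(child) for child in element)


def max_depth(element):
    """Return the depth of the deepest descendant below `element`."""
    return max((1 + max_depth(child) for child in element), default=0)


item_root = ET.fromstring(ITEM_DOC)
gorp_root = ET.fromstring(GORP_DOC)

print(shape(item_root) == shape(gorp_root))
print(max_depth(item_root), max_depth(gorp_root))
```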
>
>
>11. Despite your assertion to the contrary, the Harrop/Meyer item model
>avoids presentation information entirely in its basic element structure.
>Component numbering is not presentational and cannot be equated with
>pagination. Component numbering is part of the content and part of the
>structure. Its omission from your model is a serious problem. It is not
>realistic to expect a sender and receiver's applications to generate the
>same numbering in all but the simplest of cases. It is not a scalable
>approach to build a model on the assumption this will work.
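
The numbering concern in point 11 can be sketched in Python under stated assumptions. The sample document, the `number_items` helper and the two numbering styles are hypothetical; they stand in for a sender's and a receiver's application that each generate numbers from the same unnumbered structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical unnumbered document: two clauses, the first with two subclauses.
DOC = "<contract><item><item/><item/></item><item/></contract>"


def number_items(parent, style, prefix=()):
    """Generate a label for every nested <item> using the given style."""
    labels = []
    position = 0
    for child in parent:
        if child.tag != "item":
            continue
        position += 1
        path = prefix + (position,)
        labels.append(style(path))
        labels.extend(number_items(child, style, path))
    return labels


def decimal_style(path):
    """Assumed sender convention: 1, 1.1, 1.2."""
    return ".".join(str(n) for n in path)


def legal_style(path):
    """Assumed receiver convention: 1, 1(a), 1(b); deeper levels use digits.

    Letters only cover 26 siblings, which is enough for this sketch.
    """
    label = str(path[0])
    for depth, n in enumerate(path[1:], start=1):
        label += "(" + (chr(ord("a") + n - 1) if depth == 1 else str(n)) + ")"
    return label


root = ET.fromstring(DOC)
sender_labels = number_items(root, decimal_style)
receiver_labels = number_items(root, legal_style)

print(sender_labels)
print(receiver_labels)
```

A cross-reference written by the sender against one set of labels would not resolve against the other, which is the case for treating component numbering as part of the content.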
>
>
>12. Now to the real problem. You promote the RDF model as something that is
>the way of the future for document markup. We do not believe there is any
>evidence of this. Fundamentally, we do not see how the use of RDF for
>document markup is within the anticipated use of the standard. RDF (Resource
>Description Framework) is intended to describe information objects,
>documents, people & organisations etc. It's about marking up data, not
>narrative text content. The use of bag, seq and alt is consistent with this
>approach. They provide a way of typing data that is not particularly helpful
>when marking up text content.
>
>The RDF standard states:
>    "The development of RDF has been motivated by the following uses, among
>    others:
>    * Web metadata: providing information about Web resources and the
>      systems that use them (e.g. content rating, capability descriptions,
>      privacy preferences, etc.)
>    * Applications that require open rather than constrained information
>      models (e.g. scheduling activities, describing organizational
>      processes, annotation of Web resources, etc.)
>    * To do for machine processable information (application data) what the
>      World Wide Web has done for hypertext: to allow data to be processed
>      outside the particular environment in which it was created, in a
>      fashion that can work at Internet scale.
>    * Interworking among applications: combining data from several
>      applications to arrive at new information.
>    * Automated processing of Web information by software agents: the Web
>      is moving from having just human-readable information to being a
>      world-wide network of cooperating processes. RDF provides a
>      world-wide lingua franca for these processes."
>    Source: http://www.w3.org/TR/rdf-concepts/#section-motivation
>
>The overwhelming flavour of all this is data markup, not document content
>markup.
>RDF should have a long and fruitful life as a data description language.
>
>
>13. The item based structural model is not intended to be a complete model
>at this stage. The process clearly laid out in the clause model requirements
>is to separate issues and to concentrate at first on the basic structural
>model for the data and to then deal with other issues. It is clear that much
>needs to be added to make it complete. One of those things is metadata
>discussed in point 2 above. It may well be the case that RDF will provide a
>very good basis for metadata structures in the model. We see no reason why
>it should not be added (This will require full namespace support so will
>necessitate use of Schema or a very complex DTD implementation). However,
>this is an issue yet to be addressed.
>
>
>14. You have raised the issue of linking as one of the justifications for
>use of RDF. Linking is out of scope for the structural markup requirements.
>There are several approaches that may be taken to linking. These should be
>considered in due course. There is no reason why your requirements should
>not be included at that stage. Fundamentally, there is no reason why any
>desired linking functionality cannot be provided without using RDF for core
>structural markup.
>
>
>15. In summary, we believe the RDF proposal is not the way of the future:
>(a) It's an inappropriate use of the RDF standard. It was never envisaged
>that the standard would be used to mark up document content in this way. The
>RDF approach offers no advantages for basic structural markup. There is no
>reason to believe that developers will support your approach to RDF markup
>in their document authoring applications. Without such support no one will
>use the model. By contrast, the structural approach of the Harrop/Meyer
>proposal is tried and true.
>
>(b) It adds only complexity to a simple problem. It mingles metadata and
>structural concepts in a way that burdens authors and application
>developers. The proposal clearly fails under requirement 8.
>
>(c) It is doubtful that it will produce sufficiently consistent markup that
>it will meet the needs of application developers who need markup of distinct
>components for content re-use and sharing. There are too many ways different
>authors can mark up the same content. It is seriously doubtful that it will
>satisfy requirements 2 and 9.
>
>
>Regards
>Peter Meyer
>
>
>---------------------------------------------------------------------------
>Elkera Pty Limited (ACN 092 447 428) - Knowledge management
>Email: pmeyer@elkera.com.au
>Ph: +61 2 8440 6900 * Fax: +61 2 8440 6988
>http://www.elkera.com.au
>
>