RE: [legalxml-econtracts] Development steps for TC specification

Peter:

Very good starting list. My comments are below. Thanks again to Elkera for allowing us to use the BNML schema and to you for moving this along.

Rolly Chambers

-----Original Message-----
From: Peter Meyer [mailto:pmeyer@elkera.com.au]
Sent: Thursday, October 27, 2005 10:46 AM
To: Legalxml-Econtracts TC
Subject: [legalxml-econtracts] Development steps for TC specification - revised

Dear TC members,

At our last meeting, I was asked to provide an updated list of steps we need

to take to continue our development. This is to include the original list in

my email of 22 September, the issues raised by Rolly Chambers on 18 October

and any other issues that have arisen. I have made a few minor changes to

the previous issues for the purpose of clarification.

Shortly, I will be re-issuing the schema with some minor technical

corrections to deal with validation problems encountered by Rolly. This

should not affect consideration of the listed issues.

Elkera is preparing XSLTs to transform BNML markup into HTML / XHTML. A CSS

will be included. These will be released as open source around the end of

November. I will make them available as soon as they are sufficiently

complete.

1. Schema architecture

We need to decide the high level issues about the TC's proposed schema. Some

key issues are set out in the following points.

1.1 Is it a development platform or smorgasbord?

1.1.1 Background

The issue is whether to create a basic, development platform or a

smorgasbord schema that tries to cover all possibilities.

The BNML Schema provides a basic model that can be easily adapted by

particular users to their needs while retaining the core content models. The

idea is to not overload the schema with things that a lot of users may not

need but to provide all the basics so that applications developed around the

schema can be easily adapted to deal with changes. This is also the approach

taken by DITA, although it uses a different specialization model to that in

BNML.

Schema such as DocBook take the smorgasbord approach, but do allow

customization.

[RLC - I agree that less is more in our circumstances.]

1.1.2 PM recommendation

It is impracticable, particularly at this time to develop a universal schema

that will fully satisfy all needs. There is no identified need for such a

schema. The primary function of this schema at this stage will be in back

end systems (precedent systems and automated publishing on web sites). There

is no need for widespread exchange of eContracts documents between

enterprises since there is no infrastructure for recipients to deal with it.

As the need arises, particular interest groups who want to exchange data can

define standards for the purpose.

I recommend that we keep it as simple as possible and define a base model

that can be easily extended by particular users, along the lines of the BNML

approach. I don't think it is necessary for us to use the DITA approach to

specialization for contracts.

[RLC – I agree that publishing is an important function, that defining a simple and extensible base model is the ideal, and that the DITA specialization approach is not desirable. I disagree that publishing is the primary function – the OpenDocument XML format used by OpenOffice is well ahead of any publishing format we might come up with and is already being used successfully.

Besides publishing, the search and retrieval of key information items, information exchange, and compatibility with other XML formats are important functions. While we are focusing initially on the publishing function, we should remain mindful of other important functions. In focusing too closely on the publishing function, we risk making schema design decisions that will limit the schema’s usefulness for functions beyond publishing. ]

1.2 Do we include document types other than "contract"?

1.2.1 Background

You will notice in the eBay user agreement, I have added the privacy policy

as a "document" that is attached in the adjunct element. This adds

considerable flexibility to the schema but may or may not be necessary for

particular users, depending on the approach they take.

In theory, if particular users want to add "document" or other document

types, they can easily do so by borrowing from BNML or developing their own.

[RLC – we should focus on contract documents and expect that users will need or want to add other document types by borrowing from BNML or other XML vocabularies.]

1.2.2 PM recommendation

Adding the "document" document type or something similar makes the schema

very flexible and makes it easier for others to see how the schema can be

extended in other legal areas. The TC has an interest in the adoption of its

model by other legal and business users. Unless this occurs, the schema's

market penetration will be extremely limited.

On balance, I recommend that we include the "document" document type to

provide this flexibility.

[RLC – I agree with providing examples to illustrate how other XML document types can be incorporated in XML eContracts. The ability to extend an eContracts XML schema to allow this is important. However, the BNML “document” document type is only one of many possibilities and I do not think we should limit ourselves to BNML.]

1.3 Which schema syntaxes do we use and which is normative?

1.3.1 Background

We have a choice between Relax NG, XSD (XML Schema) or DTDs. We canvassed

the issues in 2003-2004. Due to the different capabilities of each, we may

need to select one syntax as the normative schema. This decision will likely

depend on the architecture of the schema. If we want to provide features

that are not supported by, say, DTDs, then we will need to use either Relax

NG or XSD schema and make it the normative version.

This does not prevent us from making the schema available in all 3 syntaxes

for the convenience of users. This is the approach taken in BNML, although a

full DTD has not been created at this time.

The BNML schema uses Relax NG compact syntax for several reasons:

* BNML re-defines the item element in the //block context. This cannot be

done in DTDs;

* Relax NG is extremely flexible and excellent for customization;

* There are tools to automatically generate XSD schema from Relax NG;

* The Relax NG compact syntax is reasonably close to DTDs and is very human

readable. XSD schema are XML documents and very difficult to read without a

schema editing application.

[RLC – I agree with using RelaxNG or XML schema as the normative version instead of DTDs. XML schema currently are more widely supported compared to RelaxNG, but XML schema are more complicated. I can live with RelaxNG as the normative version, although tools also exist to automatically generate RelaxNG schema from XML schema.]

1.3.2 PM recommendation

I believe we must provide a schema because some applications do not support

DTDs (MS Word).

The choice may depend on whether the TC retains the re-defined item in

//block in its schema. If it does, I recommend we use Relax NG compact

syntax as the normative schema syntax. Even without this, I believe it is

the most appropriate syntax.

[RLC – see preceding comment. Also, does MS Word support RelaxNG schema or only XML schema?]

1.4 Schema name and XML name space

1.4.1 Background

The TC does not have to use "BNML" and may not wish to do so. The TC can

decide on its own name space and schema name.

1.4.2 PM recommendation

Develop a new name for the eContracts schema.

[RLC – I agree with developing a new name and XML namespace for an eContracts schema. I also prefer not to use URLs for XML namespaces.]

1.5 Relationship between the schema and other schema that provide semantic

vocabularies, such as UBL.

1.5.1 Background

Rolly Chambers raised two issues:

(a) Can the BNML schema be used within these other schema?

(b) Can these other schema be included within BNML?

1.5.2 PM comment

I doubt very much that it will be useful to incorporate BNML into another

schema such as UBL, even if it can be done via name spaces. As I understand

it, UBL is not a document schema.

I see no reason why UBL or parts of it cannot be incorporated into BNML

contract. I am not familiar with UBL in any detail so I don't know all the

issues. It seems to make sense to use something like UBL if it provides a

ready made, useful vocabulary.

I believe we should leave it as an option for those who need it but not

burden anyone who does not.

[RLC – UBL does not provide a good way to mark up narrative contract provisions in a purchase order or invoice and is not intended to do so. This means that the proverbial “fine print” of a purchase order or invoice is not part of a UBL XML instance exchanged between a buyer and seller. Such routine narrative contract provisions as warranties, limitation of remedies, indemnification, dispute resolution methods, insurance obligations, force majeure, governing law, and similar provisions are missing from UBL contract documents. A simple XML “microformat,” such as the one we are considering, that can be incorporated into UBL to mark up narrative contract provisions would be a good contribution. Even a simple structural, publishing oriented XML schema of the type we are considering will help. A more specialized XML vocabulary for eContracts might come later.

UBL provides a ready-made, useful XML vocabulary for marking up information about the contracting parties and their contact information, the issue date, the base price, and similar items. UBL also is developed by an OASIS TC and recently has offered a simplified “Small Business Subset” of the more comprehensive UBL XML vocabulary. We should be able to easily borrow basic semantic XML elements from the UBL XML vocabulary and avoid the chore of creating yet another XML version of “party,” “address,” “date,” etc.]
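
As a rough illustration of the point about borrowing semantic elements, the following Python sketch mixes narrative contract markup with elements from a second vocabulary by using XML namespaces. Both namespace URIs are placeholders invented for the example (they are not the real UBL or TC namespaces), and the element names and party name are hypothetical.

```python
import xml.etree.ElementTree as ET

EC = "urn:example:econtracts"      # placeholder, not a real TC namespace
BORROWED = "urn:example:borrowed"  # placeholder standing in for a UBL namespace

# Hypothetical instance: a borrowed "Party" structure sits alongside
# narrative contract markup in one document.
SAMPLE = f"""
<ec:contract xmlns:ec="{EC}" xmlns:b="{BORROWED}">
  <b:Party><b:Name>Buyer Pty Ltd</b:Name></b:Party>
  <ec:item><ec:block>The seller warrants the goods for 12 months.</ec:block></ec:item>
</ec:contract>
"""

def count_by_namespace(root):
    """Count elements per namespace URI in a single XML instance."""
    counts = {}
    for el in root.iter():
        # ElementTree stores qualified names as "{namespace}local".
        ns = el.tag[1:].split("}")[0] if el.tag.startswith("{") else ""
        counts[ns] = counts.get(ns, 0) + 1
    return counts

counts = count_by_namespace(ET.fromstring(SAMPLE))
print(counts)
```

Because each vocabulary keeps its own namespace, a processor can handle the narrative elements and ignore (or separately process) the borrowed ones.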

1.6 File organisation of the BNML Schema

1.6.1 Background

Rolly has pointed out that the use of multiple files in the Relax NG version

of the schema makes it difficult to understand.

It is difficult initially to see how the schema works, although the file

structure is set out in some detail in the Readme file that accompanies the

schema documents.

The arrangement is part of the schema's design to make it easy for users to

modify the schema. The core elements that are essential to BNML are in one

file "bnml-core.rnc". Under Elkera's licensing model, these elements cannot

be changed if the schema is to be called "BNML". Of course, the TC can take

a different approach with its implementation.

The file "bnml-structure.rnc" includes major container elements such as body

and back that may be shared in several different document types. There is a

separate file for metadata and for xinclude to permit these to be optionally

used or omitted. There is a separate file for each document type (contract,

document etc) for document type specific elements. Finally, there is the

umbrella file "bnml-s-eContracts.rnc" that defines the

eContracts schema. This file is the only one that a user will normally need

to modify to extend the schema.

It is very common in DTD or schema architectures to do this to maximise the

flexibility of the schema. The BNML set up is much simpler than for other

major DTDs.

1.6.2 PM recommendation

I recommend that we retain this basic structure. If everything is in one

file, customisation is much more difficult to manage.

We can easily improve the explanation in the ReadMe file, if necessary.

Schema developers ought to be quite capable of working it out. They are

the only ones who need to refer to the schema files. Ordinary users should

never have to work with schema files. They should be able to gain necessary

information from the separate schema documentation.

[RLC – the workings of the separate BNML schema files have not been easy for me to figure out even after reading the associated commentary. For instance, I still do not have a good idea how to customize or tailor the BNML schema to incorporate UBL. I don’t oppose the approach of breaking up a schema in ways that make it easy to extend or modify – it will just take me some time to figure out the workings of BNML so that I can fairly evaluate its strengths and weaknesses.]

2. Detailed schema design and development

2.1 Review BNML element names

Do we want to change any of the element names for BNML elements adopted in

the schema?

[RLC – I would change at least some of the element names – “item” to “section” but leave “block/item” as-is to mark up items in a list; delete “text” or re-name it “line;” re-name “adjunct” to “attachment”.

Also, it looks like <Title> and <dc:title> are duplicative. I think we should settle on one or the other.]

2.2 Structures for contract front and parties markup

Do we need a more flexible or structured model for handling contract front

matter and parties?

2.3 Party signature markup

(a) Do we need the semantic party-signature markup?

(b) If so, is the model sufficiently flexible?

[RLC – see previous comments about borrowing “party” and “date” from UBL. I think it is a good idea to include party-signature markup. I haven’t compared the proposed party-signature markup to other possibilities and don’t yet have a view as to the flexibility of the current proposal.]

2.4 Inline lists

Do we need to mark up in-line lists?

[RLC – we need to be able to mark up items that can be published as in-line lists.]

2.5 Re-use of item as a list item element (//block/item)

(a) Rolly Chambers suggested that re-use of the item element may be

confusing to some.

[RLC – if the “item” container element is re-named to “section,” then it will not be confused with the “block/item” element used for list items. The confusion comes from using a single element name “item” for two different purposes and structures – one as a container and the second for list items within a block.]

(b) Do we need to add a list container and separate list item to provide for

separate numbering control on multiple lists in a single paragraph or block?

[RLC – I don’t think so.]

2.6 Reference schedules at front of contracts

Do we need to allow adjunct to occur at the front of the document for

reference schedules?

[RLC – I don’t think so. When the XML instance is published using stylesheets, schedules and attachments can be displayed at the front, at the back, or in the middle of the output document. Thus, their location within the XML instance is not particularly significant.]

2.7 Values for inclusion class attribute

Do we want to define specific values for the inclusion class attribute?

[RLC – not sure. What are the possible specific values – “schedule,” “attachment,” “exhibit,” etc.?]

2.8 Automatic numbering control attributes

Do we want to define specific values for these attributes on item, inclusion

and adjunct?

[RLC – I would prefer for auto-numbering to be handled by a stylesheet rather than as a content item in the XML document. My bias for radically simplifying the markup in the XML narrative document is showing.]
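
To make the stylesheet-side numbering idea concrete, here is a minimal Python sketch that derives numbers from the nesting of item elements at publication time, so that no number is stored in the content. The sample markup (including the title element) is hypothetical and only loosely modelled on the item structure discussed here; a real implementation would more likely be an XSLT stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical narrative markup with no numbering stored in the content.
SAMPLE = """
<body>
  <item><title>Definitions</title></item>
  <item><title>Payment</title>
    <item><title>Invoices</title></item>
    <item><title>Late fees</title></item>
  </item>
</body>
"""

def number_items(parent, prefix=""):
    """Yield (number, title) for nested item elements in document order."""
    # findall("item") matches direct children only, so each level is
    # numbered independently of the levels below it.
    for i, item in enumerate(parent.findall("item"), start=1):
        number = f"{prefix}{i}"
        yield number, item.findtext("title", default="")
        yield from number_items(item, prefix=number + ".")

numbered = list(number_items(ET.fromstring(SAMPLE)))
for number, title in numbered:
    print(number, title)
```

Inserting or removing an item changes the computed numbers automatically, which is the main attraction of leaving numbering to the publishing step.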

2.9 Additional elements

Do we need any other elements? Rolly has suggested we may need phrase level

and keyword elements in addition to those already provided. Perhaps the

mention element could be renamed to phrase.

[RLC – it would be useful to be able to markup phrases and keywords in an XML narrative contract because phrases and keywords are grammatical components of narrative documents.]

2.10 Choice of loose, standard or tight content models for elements using

//item and //block

Rolly has asked which model is the standard and whether we should use the

loose model at all.

The intention in BNML is that the loose model is the exchange standard but

that users would normally use the standard or tight models when creating

markup. The loose model is detrimental in contents creation and page

chunking on web sites. However, the loose model is useful to incorporate

quoted material in //inclusion. In this context it should be harmless.

[RLC – I prefer the tight model – only “item” elements (which I think of as “sections”) can occur within a “body” element. The loose model allows either block (i.e. paragraph) or item elements to occur within “body” which means paragraphs and sections can be interspersed. My bias for simplifying the markup in the XML narrative document is showing again – in this case by eliminating the choice to use either section (i.e. item) or paragraph (i.e. block) within body.

I am unclear what Peter means by saying “the loose model is the exchange standard” for BNML documents. I also am unclear about how the tight model impairs the incorporation of quoted material in //inclusion.]
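
The difference described above can be sketched as a simple check in Python. This assumes the reading given in the comment (tight means that body contains only item children) and is not taken from the BNML schema files themselves; the sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_tight(body):
    """True if every direct child of body is an item element."""
    return all(child.tag == "item" for child in body)

# Only items directly inside body: allowed under the tight reading.
tight = ET.fromstring(
    "<body><item><block>A</block></item><item><block>B</block></item></body>"
)
# A block directly inside body, mixed with an item: loose reading only.
loose = ET.fromstring(
    "<body><block>Intro</block><item><block>A</block></item></body>"
)

print(is_tight(tight), is_tight(loose))
```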

2.11 ID & IDREF for linking

Rolly asked whether we should be using ID & IDREF.

The schema permits //reference to point to objects in other files, say where

a contract is to be assembled from shared /item elements. In that case, the

ID IDREF mechanism will not work unless the shared /item elements are fully

copied into the document.

[RLC – it is correct that ID-IDREF linking only concerns references within a single XML instance. If the XML narrative contract contains <Party id="party_1">R Chambers</Party>, then the content of this <Party> element can be referred to subsequently in the same XML document as <Party idref="party_1"/>. Associations or “links” among XML elements also can be marked up using ID-IDREF attributes (e.g. <Email idref="party_1">rlchambers@smithcurrie.com</Email> “links” this content to the Party element containing my name).

I am envisioning that an assembled XML narrative contract document will include shared /item and other elements fully copied into the assembled XML document. ID-IDREF “linking” would work within the assembled XML document. Peter may be envisioning that the assembled contract will actually be the output of many separate XML files published as a single document by a stylesheet or some other means.

The possibility of dividing what is customarily the content of a single paper contract document into separate electronic XML files has come up in other contexts. I am more comfortable with the concept of collecting the content into a single XML file because it is more like collecting the content of a narrative contract in a single paper document. That said, I am open to considering alternatives and I acknowledge that even in the paper world “incorporation by reference” is used to bring separate paper documents together into a single contract.]
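
The single-instance limitation of ID-IDREF can be illustrated with a short Python sketch. It reuses the Party and Email example from the comment above and adds a hypothetical reference to "party_2", which is imagined to live in another file; the Contract wrapper element is also an assumption. A validating parser would perform this check itself; the code only mimics it.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance; "party_2" is deliberately not defined in this file.
SAMPLE = """
<Contract>
  <Party id="party_1">R Chambers</Party>
  <Email idref="party_1">rlchambers@smithcurrie.com</Email>
  <Party idref="party_2"/>
</Contract>
"""

def resolve_idrefs(root):
    """Return (resolved, dangling) for every idref in one XML instance."""
    targets = {el.get("id"): el for el in root.iter() if el.get("id")}
    resolved, dangling = [], []
    for el in root.iter():
        ref = el.get("idref")
        if ref is None:
            continue
        if ref in targets:
            resolved.append((el.tag, ref, targets[ref].text))
        else:
            # The target is not in this document, so the link cannot resolve.
            dangling.append((el.tag, ref))
    return resolved, dangling

resolved, dangling = resolve_idrefs(ET.fromstring(SAMPLE))
print(resolved)
print(dangling)
```

The dangling entry is what happens when a shared item is referenced but not copied into the assembled document.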

2.12 Metadata

Do we need to provide for any additional metadata structures or just leave

this to individual users? BNML currently provides basic Dublin Core metadata

on //contract and //item.

We can add further DC elements or we can use another metadata model or just

leave it entirely up to each user.

[RLC – I like BNML’s use of the Dublin Core metadata elements. I would add dc:type, dc:identifier, and dc:relation as additional optional metadata elements. We may find a need for a few more metadata elements not included in the Dublin Core set.]
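
For illustration, the Python sketch below reads Dublin Core elements from a contract instance. The namespace URI is the standard Dublin Core element set namespace; the contract and metadata wrapper elements and the sample values are hypothetical and do not reflect the actual BNML metadata structure.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# Hypothetical instance with three Dublin Core elements.
SAMPLE = f"""
<contract xmlns:dc="{DC}">
  <metadata>
    <dc:title>User Agreement</dc:title>
    <dc:type>Text</dc:type>
    <dc:identifier>example-0001</dc:identifier>
  </metadata>
</contract>
"""

def dublin_core(root):
    """Collect Dublin Core elements found anywhere in the instance."""
    prefix = "{" + DC + "}"
    return {
        el.tag[len(prefix):]: (el.text or "").strip()
        for el in root.iter()
        if el.tag.startswith(prefix)
    }

meta = dublin_core(ET.fromstring(SAMPLE))
print(meta)
```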

2.13 Presentation markup

Do we need to remove elements and attributes considered to be presentation

information? These may include the em element and the align attributes on

block and text. Rolly also suggested that attributes to control the

application of automatic numbering may be in this category.

[RLC – I would also include the orient attribute. I think auto-numbering, orientation, width, height, alignment, and so forth can be adequately handled by stylesheets for publishing XML content. To keep the markup of an XML narrative contract as uncluttered and simple as possible, I’d prefer to leave these attributes out of the XML content markup and let stylesheets handle such display oriented items.]

2.14 Variables markup

Rolly has pointed out that additional markup may be needed for variables. In

BNML, the autovalue element is intended to be for substitution variables. We

can consider precisely what functionality we want and whether the existing

markup is adequate.

[RLC – I probably have not understood the BNML approach to markup of variables. Existing markup may indeed be adequate.]

3. Schema and specification documentation

Once we have decided all the above issues, we can prepare the actual

specification. This will need to include a comprehensive explanation of all

elements and attributes defined by the schema.

[RLC – I agree with the importance of providing clear and comprehensive schema documentation.]

Undoubtedly, there are other issues but this list should get us started.

Regards

Peter

------------------------------------------------------------------------

Elkera Pty Limited (ACN 092 447 428)

Email: pmeyer@elkera.com

Ph: +61 2 8440 6900 * Fax: +61 2 8440 6988

http://www.elkera.com
