RE: [legalxml-courtfiling] smart docs (was (Microsoft XML Team's WebLog)

Thanks, Jim, for your comments. Yes, currently almost everything that is filed in a court case could be said to be a “form” in some respects. Just the basic caption structure is a kind of form:

IN THE SUPERIOR COURT OF THE COUNTY OF KING

John Smith

                                        Case # 12345

                                        Motion to Dismiss and

Joe Doaks                               Imposition of Sanctions

The above has these data elements, at least:

--Plaintiff name

--Defendant name

--Case title/caption (John Smith V. Joe Doaks)

--Court

--County

--Case Number

--Document Title/Description
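
As a minimal sketch of how those caption elements might be carried as data, assuming Python and element names invented here for illustration (they are not drawn from any ECF or NIEM schema):

```python
# Illustrative only: the class, field, and element names are hypothetical.
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Caption:
    plaintiff_name: str
    defendant_name: str
    court: str
    county: str
    case_number: str
    document_title: str

    @property
    def case_title(self) -> str:
        # The case title is derived from the party names, not typed a second time.
        return f"{self.plaintiff_name} v. {self.defendant_name}"

    def to_xml(self) -> str:
        """Serialize the caption as a flat set of tagged data elements."""
        root = ET.Element("Caption")
        fields = {
            "PlaintiffName": self.plaintiff_name,
            "DefendantName": self.defendant_name,
            "CaseTitle": self.case_title,
            "Court": self.court,
            "County": self.county,
            "CaseNumber": self.case_number,
            "DocumentTitle": self.document_title,
        }
        for tag, value in fields.items():
            ET.SubElement(root, tag).text = value
        return ET.tostring(root, encoding="unicode")


caption = Caption(
    plaintiff_name="John Smith",
    defendant_name="Joe Doaks",
    court="Superior Court",
    county="King",
    case_number="12345",
    document_title="Motion to Dismiss and Imposition of Sanctions",
)
caption_xml = caption.to_xml()
```

Each value is entered once by the author and can then be read by software without anyone re-keying it.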

Somewhere there will be a signature block with the name, Bar number, and perhaps other data about the attorney: even more routine data elements that attorneys already enter now, so this is not additional work shifted from the Clerk to them!

Other data elements can be picked up without attorney effort or clerk re-entry: Date/Time stamps, Document Type Code (if any), Case Type (usually indicated in case number), Letterhead Information/Contact Information that comes with the firm template.

Most courts have strict format requirements relating to how to set up pleadings and other filings. These address more than just margins and structure of captions. In our state, for example, a Judgment must have a “Judgment Summary” on the first page. This was done to 1) summarize the judgment in one place and 2) give the Clerk one place to look for data elements needed for the state judgment tracking program (rather than require the data-entry clerk to read the whole document and try to understand and INTERPRET it (a no-no!) in order to figure out what data to enter). The attorney gets a positive benefit from “obeying” this rule: the information is in one place and much confusion is eliminated for all concerned.

Our state has two kinds of pattern forms: 1) mandatory and 2) optional/for your convenience. Both of those are amenable to hidden tagging: Put the data element into the box where it so indicates and the word processing or other application on which it’s based will apply the appropriate tags. These data element entry points could occur anywhere in the document – designers of the “form” (really, a semi-form) would have the task of working out what is best for all concerned. In between there can be pages and pages of free-form text and discussion and argumentation and citation, etc. While such material is also full of “data elements,” none would be tagged because the elements really needed by the court for automation and data entry/tracking purposes would have been called out for special attention.
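
A rough sketch of that "hidden tagging" idea, assuming Python and a made-up mapping from form boxes to tag names (the field names and tags are hypothetical, not any court's actual pattern form):

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the boxes on a pattern form to XML tag names.
FIELD_TAGS = {
    "case_number": "CaseNumber",
    "judgment_creditor": "JudgmentCreditor",
    "judgment_amount": "JudgmentAmount",
}


def build_filing(field_values: dict[str, str], free_text: str) -> ET.Element:
    """Wrap each form-field value in its tag; leave the narrative untagged."""
    filing = ET.Element("Filing")
    tagged = ET.SubElement(filing, "TaggedFields")
    for field, value in field_values.items():
        tag = FIELD_TAGS[field]  # raises KeyError if the form has no such box
        ET.SubElement(tagged, tag).text = value
    # Argument, citation, and discussion travel along as plain text.
    ET.SubElement(filing, "Narrative").text = free_text
    return filing


filing = build_filing(
    {"case_number": "12345", "judgment_amount": "1000.00"},
    "Defendant respectfully moves the court ...",
)
```

The author only fills in boxes and writes prose; the association between a box and its tag lives in the template, out of sight.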

Another level of this business is that XML-tagged legal documents give the law firm the opportunity to create its own XML schemas and dictionaries, for they could use XML to track dates, amounts, client files, authorship, typists, and whatever other data elements the law firm routinely uses! No one, except perhaps here, in our envisioning the potential released by E-Filing systems, has clearly seen that the attorney/firm is better off when needed data is called out for tagging – without interfering with the attorney’s ability to write as she will in order to persuade the jury or court. The law firm’s own tags (uniquely theirs) would be meaningless to the court and the clerk’s systems would ignore them.
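
One way the court's systems could ignore a firm's private tags is by namespace. The following is only a sketch, assuming Python; both namespace URIs and all tag names are invented for illustration:

```python
import xml.etree.ElementTree as ET

COURT_NS = "urn:example:court"   # hypothetical namespace for court-required tags
FIRM_NS = "urn:example:lawfirm"  # hypothetical namespace for the firm's own tags

DOCUMENT = f"""
<Filing xmlns:court="{COURT_NS}" xmlns:firm="{FIRM_NS}">
  <court:CaseNumber>12345</court:CaseNumber>
  <firm:ClientFile>CF-0042</firm:ClientFile>
  <firm:Typist>jd</firm:Typist>
  <court:DocumentTitle>Motion to Dismiss</court:DocumentTitle>
</Filing>
"""


def court_elements(xml_text: str) -> dict[str, str]:
    """Return only elements in the court namespace; everything else is skipped."""
    prefix = "{" + COURT_NS + "}"
    root = ET.fromstring(xml_text)
    return {
        el.tag[len(prefix):]: (el.text or "").strip()
        for el in root.iter()
        if el.tag.startswith(prefix)
    }


docket_data = court_elements(DOCUMENT)
```

The firm's client-file and typist tags remain in the document for the firm's own use, but the clerk's software never looks at them.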

The idea of whipping out one’s word processor and just free-form writing the persuasive argument to win the case is an interesting one, but that would be no different at all from pulling one’s typewriter out! What will be filed but a printout from the word-processor or typewriter? So long as normal format is observed, people know what to expect. However, there would be no possibility of automated processing of the document as it flows into the DMS and as it is referenced in the course of docketing and handling of the case. (Some have dreamed of super-smart OCR/ICR software that could review printouts of originally digital documents, each one just a picture of a page displayed on a piece of paper, and, in a kind of reverse parsing process, convert the text back to digital form, recognize the data elements needed for docketing, and “automatically” pull them out and enter them in target systems. It’s a nice dream, but any OCR/ICR system is imperfect enough that a person needs to double-check how the software interpreted the pixels it analyzed from the paper.)

I have not advocated for adding to the number of data elements that need to be tagged in order for everyone involved to do their respective jobs. Some have imagined such an increase as a natural consequence of using tagging, seeing clerks as data-hungry tag-makers, and then have opposed data tagging because of that imagined consequence. Attorneys are used to following forms and formats when preparing materials for the case file. This technology ought to make it easier on them. (Indeed, within a given case, couldn’t the tagged key data elements become part of the templates made available for use by authors of briefs and pleadings in the case?)

I have been imagining here a scenario for “how things could work,” so I offer this as one of the business cases that the Documents subcommittee would do well to analyze.

Sincerely,

Roger

Roger Winters

Program and Project Manager

and

Continuing Legal Education (CLE) Coordinator

King County

Department of Judicial Administration

516 Third Ave. E-609 MS:KCC-JA-0609

Seattle, WA 98104

V: (206) 296-7838

F: (206) 296-0906

roger.winters@metrokc.gov

From: Jim Harris [mailto:jharris@ncsc.dni.us]
Sent: Thursday, January 18, 2007 6:16 AM
To: Winters, Roger; 'John Messing'
Cc: legalxml-courtfiling@lists.oasis-open.org; 'O'Brien,Robert'; 'Hickman,Brian'
Subject: RE: [legalxml-courtfiling] smart docs (was (Microsoft XML Team's WebLog) : Mixing structured and unstructured content in MS Word)

Roger:

You have a unique ability to break this issue down and relate it to the real-world problems we are trying to address. Your explanation highlights the primary objectives of the standards we are asking the justice and legal communities to adopt. I think the key part of this (which started this whole “smart documents” dialog) is the challenge associated with creating forms in a way that is as unobtrusive and intuitive as possible.

I interpreted (perhaps incorrectly) from one of John’s comments that attorneys would like to be able to draft documents as they do now without any concern to form or tagging of data. This suggests that electronic systems must be able to parse and interpret the content in order to identify relevant metadata that we all agree is essential to realize the benefits of electronic filing. That level of intelligence is just not feasible (at least with today’s technology) without the data being tagged in some way. Hence the need for some type of “form” so that data can be tagged as (or after) it’s collected.

Aren’t forms routinely a part of filing, even in the paper world? Why should it be any different in the electronic world? Ultimately, the systems and vendors who support attorneys and law firms need to put in place mechanisms to support the standards in a way that leverages use of their systems to feed the electronic filing process. That will most certainly involve some sort of form or process to identify and tag information. Further, electronic court policy should be part of that process so that those systems know what data must be tagged for that particular type of filing/document in the court to which they are filing.
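
To make the court-policy point concrete, here is a small sketch, assuming Python, of a filer-side check against a policy table. The document types and tag names are hypothetical and not taken from any published court policy:

```python
# Hypothetical court policy: which data elements must be tagged per document type.
COURT_POLICY = {
    "motion": {"CaseNumber", "DocumentTitle", "FilerBarNumber"},
    "judgment": {"CaseNumber", "DocumentTitle", "JudgmentSummary"},
}


def missing_tags(document_type: str, tagged: dict[str, str]) -> set[str]:
    """Return the required tags that are absent or empty for this filing."""
    required = COURT_POLICY[document_type]
    present = {tag for tag, value in tagged.items() if value.strip()}
    return required - present


problems = missing_tags(
    "judgment",
    {"CaseNumber": "12345", "DocumentTitle": "Judgment"},
)
```

A filing application could run a check like this before submission, so the attorney learns what is missing before the clerk does.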

Jim

-----Original Message-----
From: Winters, Roger [mailto:Roger.Winters@METROKC.GOV]
Sent: Wednesday, January 17, 2007 12:58 PM
To: John Messing
Cc: legalxml-courtfiling@lists.oasis-open.org; O'Brien,Robert; Hickman,Brian
Subject: RE: [legalxml-courtfiling] smart docs (was (Microsoft XML Team's WebLog) : Mixing structured and unstructured content in MS Word)

Dear John:

Thank you for this reply and, yes, it is very helpful.

I understand the importance within a court environment of having common references in documents so court proceedings can avoid delays and confusion while everyone compares different printouts of the same information. I agree with you that PDF seems to have conquered that problem better than others (so far) and it is certainly clear that PDF has credibility with many of our stakeholders.

In my presentations about the idea of "smart documents" and "automating docketing," I have shared the idea that electronic documents must be understood as existing in several "layers" (for lack of a better term):

The "Top" one is the "human-readable," and this is where it is important that the page breaks, etc., should all be the same. Any electronic document in a court file, to be useful, must offer this human readable surface that looks like "words on paper." It is the resemblance to paper that is to be preserved, not necessarily all of the concepts and practices we have built up over years of working with hard copy. The consumers of documents at this level would not necessarily even know that any XML markup has been done.

The "Next" level is the "XML markup" or "software-readable" level. This could also be human-readable (although not formatted for conventional reading) because the data and information in the document remain an exact match for the "human readable" version (above). The XML tags (labels) used must be the same if there is to be standardization, interoperability, and the potential to leverage XML powers for other uses besides docketing filings for court cases. I see the work that has produced NIEM and GJXDM and such as directed to this "layer." Here, a "data element tag" must fit a standard (or be a standard extension). Often, the local term used may vary from the term used in the XML tag: this is fine so long as the "thing described" is the same in each place.

The ideal regarding XML markup would be for the typical author of a document to be filed in court not to be aware of XML markup. Here is where I would think of any filed document having to be, necessarily, set up as a form in some, but not all, ways. Where a court needs a data element to be routinely marked up, it would include a spot in the "pattern form" where that data element needs to be placed (example: the case number always appears at a certain place in the first page caption). Unseen to the user would be the process of associating the appropriate XML data element tag with the location in the form where the respective data elements get their markup. The "software-readable" layer of the document interacts with applications searching for specified data element tags so the tagged information can be re-used in targets where it is needed (such as a CMS, DMS, calendar, etc.).
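
The two layers described above can be sketched in a few lines, assuming Python. The tag names, the sample pleading, and the list of target systems are all hypothetical:

```python
import xml.etree.ElementTree as ET

# A document whose prose carries a few inline tags (tag names are hypothetical).
PLEADING = (
    "<Pleading>"
    "<Para>Comes now the defendant in cause number "
    "<CaseNumber>12345</CaseNumber> and moves to dismiss.</Para>"
    "<Para>A hearing is requested for <HearingDate>2007-02-15</HearingDate>.</Para>"
    "</Pleading>"
)

STRUCTURAL = {"Pleading", "Para"}  # layout elements, not data elements

# Which tagged elements each downstream system wants (also hypothetical).
TARGETS = {
    "cms": ["CaseNumber"],
    "calendar": ["CaseNumber", "HearingDate"],
}


def tagged_elements(xml_text: str) -> dict[str, str]:
    """The 'software-readable' layer: collect every data element tag."""
    root = ET.fromstring(xml_text)
    return {el.tag: (el.text or "") for el in root.iter() if el.tag not in STRUCTURAL}


def route_to_targets(tagged: dict[str, str]) -> dict[str, dict[str, str]]:
    """Hand each target system only the elements it asks for."""
    return {
        target: {tag: tagged[tag] for tag in wanted if tag in tagged}
        for target, wanted in TARGETS.items()
    }


def human_readable(xml_text: str) -> str:
    """The 'top' layer: the same words with the markup stripped away."""
    root = ET.fromstring(xml_text)
    return "\n".join("".join(para.itertext()) for para in root.iter("Para"))


tagged = tagged_elements(PLEADING)
routed = route_to_targets(tagged)
page_text = human_readable(PLEADING)
```

Both views come from the same file, so the words a person reads and the data a program extracts cannot drift apart.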

Other "layers" of the electronic document can be identified and discussed, as well: there's a machine-readable layer, I presume, and even the "envelope" might be seen as a "layer" where the metadata and other information necessary for the Electronic Court Filing action can be located and activated.

This "layering" seemed to me to solve a number of problems. A document can be seen as both something for human beings to read, from which they would be persuaded through argumentation, legal citation, etc., of what is fact, what is law, etc., via the prose created by the (litigant/attorney) author. At the same time, the document could be parsed by a software application that searches out the appropriate, needed data element tags that point to data that will be used in automated docketing, data re-use, etc. Just as one might check out a court file and read it years later, software might still be able to locate needed data elements for years to come by applying the standard in effect at the time of filing.
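
A sketch of "applying the standard in effect at the time of filing," assuming Python. The history table, its dates, and both tag names are invented purely for illustration:

```python
from bisect import bisect_right
from datetime import date

# Hypothetical history of tagging standards:
# (effective date, name of the case-number tag under that standard).
STANDARD_HISTORY = [
    (date(2005, 1, 1), "CaseNo"),
    (date(2007, 1, 1), "CaseNumber"),
]


def case_number_tag(filed_on: date) -> str:
    """Pick the tag name from the standard in effect on the filing date."""
    effective_dates = [effective for effective, _ in STANDARD_HISTORY]
    index = bisect_right(effective_dates, filed_on) - 1
    if index < 0:
        raise ValueError(f"no tagging standard was in effect on {filed_on}")
    return STANDARD_HISTORY[index][1]
```

Software reading an old file would look up the filing date first and then search for the tag names that were standard on that date.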

If this line of reasoning is off base or, worse, grounded in impossibilities, it is important to learn that now. Otherwise, I would like to see more people thinking about what it would take to support automated docketing, ultimately yielding substantial savings (reduced human data entry, reduced errors, etc.) and potential for improved performance (e.g., quicker access to needed information for the court and others).

I am finding this dialogue to be helpful and I hope my comments are contributing to a constructive resolution which should show us, I believe, that we do not have to force a choice among different approaches.

I'll look forward to your and others' reactions to this essay o' mine.

Regards,

Roger

Roger Winters

Program and Project Manager

King County

Department of Judicial Administration

516 Third Ave. E-609 MS:KCC-JA-0609

Seattle, WA 98104

V: (206) 296-7838

F: (206) 296-0906

roger.winters@metrokc.gov

-----Original Message-----

From: John Messing [mailto:jmessing@law-on-line.com]
Sent: Saturday, January 13, 2007 6:33 AM
To: Winters, Roger
Cc: legalxml-courtfiling@lists.oasis-open.org; O'Brien,Robert; Hickman,Brian
Subject: RE: [legalxml-courtfiling] smart docs (was (Microsoft XML Team's WebLog) : Mixing structured and unstructured content in MS Word)

Roger:

I think many of us are groping our way through the issues, with different levels of technical understanding and familiarity with jargon. Frustration with my own lack of understanding is quite familiar to me, I can assure you.

My take on where we are is this:

ECF has developed tools to send documents between places in order to file them. These documents have been PDF documents for the most part that are "dumb" compared to the XML that carries them.

The mortgage industry and the land recorders have been working on the same problem and have sophisticated tools that are different from ECF to accomplish many of the same purposes, also using PDF's and XML documents.

NIEM is a government-sponsored effort for the same transporting purpose but at a greater level of sophistication because the tagged terms can be associated with other tagged terms, combining the power of them, using a technology called RDF, and with other powerful features.

I believe it is useful to settle on one basic set of building blocks, and perhaps NIEM should be it for transport purposes. ECF seems to take that approach. So does eNotary, at least in principle.

That still leaves the problem you have wisely directed attention to, which is having the documents act in a smart fashion so that the content of the PDF's doesn't have to be extracted and the data entered by hand when they arrive at the destination.

The electronic document vendors have not been idle and have retooled their products to make them "smart"; this is done by making the products out of XML, instead of the proprietary formats previously used, and then displaying them using the legacy proprietary programs. This has required turning the programs inside out, in a manner of speaking. One consequence is that they can natively be incorporated by XML schemas, or can communicate with the schemas, to accomplish goals without having the awkwardness ECF has become accustomed to, for example, of embedding the PDF's in XML. Brian just reported on how Microsoft has retooled Office products, including MS-Word, to accomplish this purpose. Adobe has been doing the same thing.

Adobe has two advantages over MS as I see it, apart from document security features, which are not germane to this discussion.

1. The current XML offerings can be displayed by the Adobe Reader and Acrobat programs in both the old and new formats, making the Reader and Acrobat programs backwards compatible. The new Office products may not be fully backwards compatible according to what I have read.

2. Adobe PDF documents preserve pagination regardless of screen resolution. MS-Word has not up until now. That means multiple copies of an MS document may show different pagination on different computers even though the same file is displayed on all of them. This shortcoming makes discussion of a document's content, as in a court hearing with multiple attorneys, a judge and a clerk, very difficult, because passages will not necessarily appear on the same page as displayed on the computer even though the identical file is used. PDF does not suffer from this limitation.

With regard to your question about eNotary, I think the short answer is "none of the above." The membership of eNotary is largely from the mortgage and land recording groups who are trying to solve a bottleneck in their workflow represented by enotarization. They are confronted by many of the same issues but until now have been solving them in different ways from ECF. They seem to have much more knowledge about the PDF document aspects, in large part through John Jones, who is a representative from the land recorder industry to eNotary. He works closely with top Adobe engineers.

I think each group, ECF and eNotary, can learn much from each other about the practical solutions to common problems that cut across legal domains, provided there is a willingness to do so, which may require letting go of legacy notions of turf.

I hope this is helpful.

> -------- Original Message --------

> Subject: RE: [legalxml-courtfiling] FW: (Microsoft XML Team's WebLog) Mixing structured and unstructured content in MS Word

> From: "Winters, Roger" <Roger.Winters@METROKC.GOV>

> Date: Fri, January 12, 2007 5:40 pm

> To: "John Messing" <jmessing@law-on-line.com>, "Hickman,Brian" <Brian.Hickman@wolterskluwer.com>

> Cc: <legalxml-courtfiling@lists.oasis-open.org>, "O'Brien,Robert" <Robert.OBrien@cas-satj.gc.ca>

> TO: All

> It will be interesting to explore these possibilities. I have trouble deciphering acronyms I've never heard of and I keep hoping that sometime they will come to light and the basic ideas will be expressed so I can understand them. What are the pros and cons of one approach vs. another? What are the business consequences or assumptions behind using one or another? For example, is the idea about using this in E-Notarization based on assumptions about notaries, attorneys, people in general, or is it coming entirely from scientific technical reasons that can't be controversial? Are there religious issues behind the approaches? (e.g., in Notarization is there a religious conservatism that calls for things to resemble "traditional" approaches, or is that not an issue?) Will this flavor of PDF work until Adobe clamps down in some future time and takes free PDF reading away? A zillion questions come to mind for those of us who are not adept at acronym-eze or tech-speak.

> I don't ask these questions seeking specific answers or defenses of positions - I am too ignorant of all of this to have a position yet. I ask them only to illustrate that we continue to have among us all a language barrier. It does not seem to be a problem for those who are technically advanced that acronyms spill from their tongues as they enthrall other technically advanced folks with brilliant new possibilities. It is a problem when those of us who would love to understand the possibilities find ourselves hopelessly lost because they seem only to be speaking in tongues in which we have no experience. It is not at all insulting to "dumb it down" for others.

> That raises another consideration - if ECFTC does not make a certain attainment of technical expertise a requirement for participation, is there therefore a requirement that the rest engage in "dumbing it down" for the rest? Who must take pains to bridge the communication gap? And it is a gap and there is pain involved in trying to guess, beg for answers, preach against acronyms, etc. How can we work together if we do not find the bridges and translations needed to understand one another? Could XML come to our rescue by some "schema" (still not sure I can explain that idea to others) that processes the terms somehow so that a "dummies" version is generated as well?

> Did I miss the prerequisite classes that everyone else took and aced?

> Happy MLK Weekend!

> Roger

> Roger Winters

> Program and Project Manager

> and

> Continuing Legal Education (CLE) Coordinator

> King County

> Department of Judicial Administration

> 516 Third Ave. E-609 MS:KCC-JA-0609

> Seattle, WA 98104

> V: (206) 296-7838

> F: (206) 296-0906

> roger.winters@metrokc.gov

> -----Original Message-----

> From: John Messing [mailto:jmessing@law-on-line.com]

> Sent: Friday, January 12, 2007 4:24 PM

> To: Hickman,Brian

> Cc: legalxml-courtfiling@lists.oasis-open.org; O'Brien,Robert

> Subject: RE: [legalxml-courtfiling] FW: (Microsoft XML Team's WebLog)

> Mixing structured and unstructured content in MS Word

> An alternative is Adobe's XFA format which enables XML schemas to generate PDF layout documents. It is ideal for form-based documents. This likely will be the document format structure that eNotary will use for the layout of its form-based notary certificates, jurats and acknowledgements.

> > -------- Original Message --------

> > Subject: [legalxml-courtfiling] FW: (Microsoft XML Team's WebLog) :

> > Mixing structured and unstructured content in MS Word

> > From: "Hickman, Brian" <Brian.Hickman@wolterskluwer.com>

> > Date: Fri, January 12, 2007 4:53 pm

> > To: <legalxml-courtfiling@lists.oasis-open.org>, "O'Brien,Robert"

> > <Robert.OBrien@cas-satj.gc.ca>

> >

> > After reading Roger Winters and John Messing's posts on embedding structured and unstructured content in a pleading I thought I would ask Microsoft's XML team to recommend a method to add structured / machine readable content to an MS Word document that also contains unstructured / narrative content.

> >

> > I am forwarding Microsoft's response for your review.

> >

> > Brian Hickman

> > Attorney

> > Government Relations

> > CT

> >

> > 520 Pike Street, Suite 2610

> > Seattle, WA 98101

> > 206 622 4511 (tel)

> > 206 437 1766 (mobile)

> > brian.hickman@wolterskluwer.com

> >

> > -----Original Message-----

> > From: Brian Jones (OFFICE) [mailto:brijones@exchange.microsoft.com]

> > Sent: Friday, January 12, 2007 1:30 PM

> > To: Adam Wiener; Michael Champion; Hickman, Brian; Steven Goulet; Doug Mahugh; Gray Knowlton

> > Subject: RE: (Microsoft XML Team's WebLog) : Mixing structured and

> > unstructured content in MS Word

> >

> > Hi Brian,

> > The model in both Word 2003 and 2007 is to allow you to add your custom XML markup to a Word document so that it lives alongside the formatting and layout information.

> > The validation occurs on your schema on its own, even though there is also WordprocessingML whenever you save the file.

> >

> > It's recommended that you leverage the Word structures as much as possible, and only add your own XML markup for persisting semantics that can't be captured with the Word model.

> > I would also suggest learning more about the new content controls feature in Word 2007. This allows you to add more structure on top of your Word documents. There is a series of blog posts on the Word blog that cover this, and I just recently blogged about the post that covers mapping custom XML to content controls:

> >

> > http://blogs.msdn.com/brian_jones/archive/2007/01/10/the-power-of-data-view-separation-in-your-documents.aspx

> >

> > -Brian

> >

> > -----Original Message-----

> > From: Adam Wiener

> > Sent: Friday, January 12, 2007 12:13 PM

> > To: Adam Wiener; Michael Champion; brian.hickman@wolterskluwer.com;

> > Brian Jones (OFFICE); Steven Goulet; Doug Mahugh; Gray Knowlton

> > Subject: RE: (Microsoft XML Team's WebLog) : Mixing structured and

> > unstructured content in MS Word

> >

> > Adding Doug and Gray as well... XML Bloggers on bcc...

> >

> > Thanks,

> > Adam

> >

> > -----Original Message-----

> > From: Adam Wiener

> > Sent: Friday, January 12, 2007 10:32 AM

> > To: Michael Champion; brian.hickman@wolterskluwer.com; Xml Team

> > Bloggers; Brian Jones (OFFICE); Steven Goulet

> > Subject: RE: (Microsoft XML Team's WebLog) : Mixing structured and

> > unstructured content in MS Word

> >

> > Looping in Brian Jones and Steven Goulet...

> >

> > Can you please take a look at Mr. Hickman's question below?

> >

> > Thanks,

> > Adam

> >

> > -----Original Message-----

> > From: Michael Champion

> > Sent: Thursday, January 11, 2007 8:29 PM

> > To: brian.hickman@wolterskluwer.com; Xml Team Bloggers

> > Subject: RE: (Microsoft XML Team's WebLog) : Mixing structured and

> > unstructured content in MS Word

> >

> > Thanks for your inquiry. The people on this list are not Word experts, so I'll try to find someone in the Office team who can answer. (Or, if one of you on the XML team does know the answer, feel free to chime in!)

> >

> > I know that you can edit documents that conform to a custom schema in Word 2003 and 2007.

> > http://blogs.msdn.com/brian_jones/archive/2006/01/25/517739.aspx

> > http://msdn.microsoft.com/msdnmag/issues/03/11/XMLFiles/

> >

> > I don't know about mixing structured (custom schema) and unstructured (default Word schema) in one doc, however, if that is what you are asking. Please let me know if you don't hear back from someone in Office in a timely manner and I'll try to follow up.

> >

> > Mike Champion

> >

> > > -----Original Message-----

> > > From: brian.hickman@wolterskluwer.com

> > [mailto:brian.hickman@wolterskluwer.com]

> > > Sent: Thursday, January 11, 2007 5:42 PM

> > > To: Xml Team Bloggers

> > > Subject: (Microsoft XML Team's WebLog) : Mixing structured and

> > unstructured

> > > content in MS Word

> > > Importance: High

> > >

> > > I am a member of OASIS LegalXML's Electronic Court Filing Technical Committee and an attorney with CT Corporation. The goal of the technical committee is to develop standards to file documents electronically with courts. Today, most documents produced by the legal industry are produced in MS Word. Unfortunately, today, a human must read the document at the courthouse to extract data from the document to populate the court's case management system. My question is: Can we integrate content that conforms to a custom data model into MS Word such that structured content and unstructured content can reside in the same document? If the case management system could extract content from an MS Word file that conformed to a customized data model (I'm thinking along the lines of adding an MS schema that matched the court's requirements) then an automated process could extract data directly from the MS Word file.

> > >

> > > If you look at a legal pleading you will see that some sections of the document are structured and conform to a data model that conforms to a set of rules expressed by the court in narrative format and some parts of the document are almost unstructured, such as a paragraph of narrative.

> > >

> > > What approach would you recommend to allow attorneys to use the tool they are familiar with, MS Word, and still embed some machine readable content within the MS Word document?

> > >

> > > Thank you

> > >

> > > Brian Hickman

> > > ----------------------------------

> > > This message was generated from a contact form at:

> > > http://blogs.msdn.com/xmlteam/default.aspx

> > > It was submitted by Brian Hickman (brian.hickman@wolterskluwer.com)

> > >

> > > Your contact information was not shared with the user.
