OASIS Mailing List Archives

 



election-services-comment message



Subject: Additions to EML to support vote tabulation audits


EML could be of enormous help in improving election auditing, but it does not yet define standard, interoperable ways to present all the necessary information.  Here is some background on auditing, followed by specific auditing-related feedback on the first committee draft of EML 6.0 (23 October 2009,
http://lists.oasis-open.org/archives/tc-announce/200910/msg00011.html).

My comments are in the context of post-election tabulation audits, as described at "Principles and Best Practices for Post-Election Audits":

 http://www.electionaudits.org/principles

That document represents the current consensus on the state of the art, and would be an appropriate reference to add to the EML specification.

Section 3.2.6 of the EML spec, "The Auditing System", mostly describes what I know in the United States as a "ballot reconciliation" process, though it does not use that term.

A different type of audit is a tabulation and ballot interpretation audit, as described in the "Principles" document.  Please also describe this kind of audit as a use case for EML.  This was one of the more commonly cited use cases at the recent NIST workshop on a Common Data Format for Electronic Voting Systems (http://vote.nist.gov/CDF-WorkshopCallForPapers.htm).

I describe this use case in more detail in my paper for that workshop:

 Obtaining Batch Reports and Audits from Election Management Systems: Election Audits and the Boulder 2008 Election - http://vote.nist.gov/cdf-workshop/neal-mcburnett-boulder-paper.pdf

I expect this would entail adding schemas 510 and 520 as inputs to the audit cross-referencing process in figure 2H and discussing their use.  This possibility is alluded to in the final paragraph of the section, but treating this critical case in much more detail would be appropriate.


EML should standardize exactly how to report all the data needed for auditing.  This includes blanks (the absence of any vote in a contest, which is important for calculating the "residual vote"), undervotes (in contests where voters can select multiple candidates, a single ballot may contribute more than one), and overvotes.  Currently, blank votes and overvotes are specified in EML, but undervotes are not.

Standard elements are also needed for the number of ballots for each ReportingUnit, the number of ballots on which this contest appears in this ReportingUnit ("contest ballots"), and the type of ballots in the ReportingUnit (absentee, in-precinct, early voting, etc).

There are ways to include much of this data via a custom CountMetrics element, but EML 6.0 should standardize exactly how to specify each item.  Otherwise, it is much more complicated to develop auditing tools, or even to aggregate data for an audit from multiple counties, each of which may produce its reports using different election software.

Here is a bit more detail on some of those elements.  Note that there are multiple definitions of "undervote" for a contest in which people can vote for more than one winner.  If someone gets to vote for 5 members of a city council, and only votes for 3 of them, I think different states count that as either one undervote, two undervotes or no undervotes.  For "residual vote" calculations (a measure of election quality) we want to know how many ballots have no vote at all for council.  Such a ballot is often called "blank" for that contest.  For auditing purposes we want to know how many ballots have that contest on them, and for that calculation it is useful to have the ballot reported with two undervotes.
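The competing definitions can be made concrete with a minimal Python sketch.  The function names and the three convention labels ("per_selection", "per_ballot", "none") are hypothetical and are not taken from EML or from any state's rules.  The contest-ballot calculation additionally assumes that an overvoted ballot contributes no votes and is reported as votes_allowed overvotes; jurisdictions that report overvotes differently would need a different formula.

```python
def undervotes(marks: int, votes_allowed: int, convention: str) -> int:
    """Undervotes recorded for one ballot in one contest under a convention."""
    if marks > votes_allowed:
        return 0  # overvoted ballot: reported as an overvote instead
    if convention == "per_selection":  # every unused selection counts
        return votes_allowed - marks
    if convention == "per_ballot":     # at most one undervote per ballot
        return 1 if marks < votes_allowed else 0
    if convention == "none":           # undervotes are not reported
        return 0
    raise ValueError(f"unknown convention: {convention}")


def is_blank(marks: int) -> bool:
    """A ballot is blank for a contest when it has no marks in that contest."""
    return marks == 0


def contest_ballots(votes: int, under: int, over: int, votes_allowed: int) -> int:
    """Recover the number of ballots carrying the contest.

    Assumes per-selection undervotes, and that each overvoted ballot adds
    zero votes and votes_allowed overvotes (an assumption, see above).
    """
    total = votes + under + over
    if total % votes_allowed:
        raise ValueError("counts are inconsistent with the assumed conventions")
    return total // votes_allowed


# Four ballots in a vote-for-5 contest: full, partial, blank, overvoted.
marks_per_ballot = [5, 3, 0, 6]
allowed = 5
votes = sum(m for m in marks_per_ballot if m <= allowed)
under = sum(undervotes(m, allowed, "per_selection") for m in marks_per_ballot)
over = sum(allowed for m in marks_per_ballot if m > allowed)
blanks = sum(is_blank(m) for m in marks_per_ballot)
print(contest_ballots(votes, under, over, allowed), blanks)
```

Only the per-selection convention lets the number of contest ballots be recovered from the totals; the blank count has to be reported separately in every case.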

It appears that the CountMetrics element can only hold a number, so it can't be used for reporting the type of ballots in each ReportingUnit (absentee, etc).  Another method is needed.


I made an initial cut at adding EML 510 output to the open source ElectionAudits software.  ElectionAudits already parses a number of existing election reporting data formats, so it can now convert to EML 510 from Sequoia, Hart, California's "SWDB" (dbf) and two other formats.

The EML 510 output is based on the California Election Night Data-Feed in XML.  It validates against EML 510 in version 6.0, thanks to generous help from David Webber.

An example EML 510 report from ElectionAudits, with an XSLT file which displays it as an HTML table, is referenced here:

 http://bcn.boulder.co.us/~neal/electionaudits/

In this example I'm currently reporting undervotes and overvotes as if they were actual candidates, since I'm not very familiar with XML Schema or the template syntax used in EML.  If someone can show me a good example of an EML 510 report with proper syntax for blank votes and overvotes, I'd appreciate it, and will update the software.


A real-life example of a vote tabulation audit, with all the information needed for public verification of the procedures and results in HTML format, can be found at the web site for the Boulder 2008 audit:

 http://bcn.boulder.co.us/~neal/elections/boulder-audit-08-11/

As an example of the utility of EML for auditing, and as a simple way to test specific EML 510 files for their coverage of the elements necessary for auditing, it would be helpful to have an official example program (in XSLT or some other language) to aggregate EML 510 from multiple jurisdictions into a single comprehensive EML 510 file, suitable for re-tabulating the preliminary results and selecting batches to be audited.
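The aggregation step described above can be sketched in Python as follows.  This is only an illustration of the logic: it works on plain dictionaries rather than real EML 510 documents, and the field names ("unit", "contest", "choice", "votes"), the county names and the vote counts are all made up for the example.  Parsing and emitting actual EML 510 is left out.

```python
from collections import defaultdict


def aggregate(reports):
    """Combine per-jurisdiction rows into contest totals and a batch list.

    `reports` maps a jurisdiction name to a list of rows, each row being a
    dict with hypothetical keys: unit, contest, choice, votes.
    """
    totals = defaultdict(lambda: defaultdict(int))
    batches = set()
    for jurisdiction, rows in reports.items():
        for row in rows:
            totals[row["contest"]][row["choice"]] += row["votes"]
            # A batch is identified by jurisdiction plus reporting unit,
            # since unit identifiers may repeat across jurisdictions.
            batches.add((jurisdiction, row["unit"]))
    plain_totals = {contest: dict(choices) for contest, choices in totals.items()}
    return plain_totals, sorted(batches)


# Made-up sample data for two counties.
reports = {
    "County A": [
        {"unit": "P-001", "contest": "Governor", "choice": "Smith", "votes": 120},
        {"unit": "P-001", "contest": "Governor", "choice": "Jones", "votes": 95},
        {"unit": "P-002", "contest": "Governor", "choice": "Smith", "votes": 80},
        {"unit": "P-002", "contest": "Governor", "choice": "Jones", "votes": 101},
    ],
    "County B": [
        {"unit": "P-001", "contest": "Governor", "choice": "Smith", "votes": 60},
        {"unit": "P-001", "contest": "Governor", "choice": "Jones", "votes": 70},
    ],
}

totals, batches = aggregate(reports)
print(totals)   # re-tabulated preliminary results
print(batches)  # candidate batches from which an audit sample is drawn
```

The same structure would need extra fields (contest ballots, undervotes, overvotes, ballot type) before it could support a real audit.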


In the vote tabulation auditing community we often calculate the probability that the audit would have found evidence that the wrong winner was declared (if such were the case).  But there are multiple statistical methods and assumptions in use, and election observers want to be able to check the reported audit results according to their own assumptions and data.  In order to do this, it would be very helpful for them to have the results of the tabulation audit available in a standard format.  It would include which precincts or batches of paper were selected for hand interpretation and counting, how many discrepancies were found, what statistical methods were used (e.g. SAFE, NEGEXP, PPEBWR, Kaplan-Markov), what random selection procedure was used (pseudo-random number algorithms and random seeds used), etc.  This would also allow the public to verify the selection of audit units.

Please add an audit result schema to EML to provide that sort of standardized data.
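To show why publishing the selection procedure and seed matters, here is a minimal Python sketch of a reproducible selection.  It assumes simple random sampling without replacement using Python's built-in generator, which is an illustrative choice only: it does not implement any of the statistical methods named above, and the function name, batch identifiers and seed string are hypothetical.

```python
import random


def select_audit_units(units, sample_size, seed):
    """Pick `sample_size` audit units reproducibly from a published seed.

    Units are sorted first so that the result depends only on the set of
    units and the seed, not on the order in which they were listed.
    """
    rng = random.Random(seed)  # a dedicated generator seeded with the public seed
    return rng.sample(sorted(units), sample_size)


# Hypothetical batch identifiers and seed.
units = ["batch-%03d" % n for n in range(1, 51)]
official = select_audit_units(units, 5, "seed published before the audit")

# An observer who shuffles or re-orders the list still gets the same sample.
observer = select_audit_units(list(reversed(units)), 5, "seed published before the audit")
print(official == observer)
```

Anyone who knows the list of audit units, the algorithm and the seed can rerun the selection and compare it with the reported one, which is why an audit result format would need to carry all three.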


Another feature needed for auditing contests that span multiple jurisdictions is a standard way to record standard names for contests and candidates, since these names often vary in subtle ways between jurisdictions, as documented extensively at the NIST workshop.  These standard names may differ from the way each jurisdiction wants the names to appear for local reporting purposes, so I'm picturing two elements - the local name and the standard name, for both contests and candidates.
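A small Python sketch of the two-name idea follows.  The record type, its field names and the sample names and counts are hypothetical and are not EML elements; the point is only that totals are keyed on the standard name while each jurisdiction keeps its own local spelling.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoiceResult:
    """One candidate's result in one jurisdiction (hypothetical structure)."""
    jurisdiction: str
    local_name: str     # as the jurisdiction prints it locally
    standard_name: str  # shared identifier used for cross-jurisdiction totals
    votes: int


def totals_by_standard_name(results):
    """Sum votes across jurisdictions, keyed on the standard name."""
    totals = Counter()
    for result in results:
        totals[result.standard_name] += result.votes
    return dict(totals)


# Made-up sample: the same candidate spelled differently in two counties.
results = [
    ChoiceResult("County A", "SMITH, John Q.", "John Q. Smith", 120),
    ChoiceResult("County B", "John Smith", "John Q. Smith", 60),
]
print(totals_by_standard_name(results))
```

Keying on the local name instead would split this candidate's votes into two separate totals.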

Finally, EML would be easier to use if more examples were provided, and if clearer documentation on the elements and attributes was available.  Many have no documentation at all.

Thank you for your work on EML to date - it is already enormously helpful.

I would be happy to work with you further on defining the exact standard, and can call on my auditing colleagues for help.

--
Neal McBurnett                 http://neal.mcburnett.org/


