Wednesday, March 28, 2007

H.R.811: Fact & Friction – Part III

By Mark Lindeman, Ph.D. and Howard Stanislevic, Research Consultant

In Part I and Part II of this series, we pointed out that the post-election audits provided for in H.R.811 are too small to achieve a high level of statistical confidence that close races for US House seats will be correctly decided. This observation is in agreement with a letter signed by nine prominent experts and circulated by Rep. Holt’s office, but we also pointed out that the letter overstates the confidence level of Holt’s audits by not taking variations in precinct size into account. The same letter also stated:

“In truth, it may be that attempting to prevent an ‘unacceptable’ level of error on electronic voting machines through audits is too administratively burdensome.”


Put that way, frankly, audits seem pretty useless: if attempting to prevent unacceptable error is just too much of a burden, then what is the point? But on a close reading, the intended point here is subtler and more useful: while it would be ideal to confirm that every vote count is accurate within a fraction of a percentage point, it is much easier to confirm that at least every outcome is correct. Unfortunately, the letter stops short of recommending an audit protocol that actually confirms outcomes. We think that election audits should not only indicate the general accuracy of the counts, but provide a solid basis for believing that the winners really won. Getting the winner wrong is the “unacceptable” error, in most people’s minds – and we believe that it can be prevented without undue administrative burden.

In a nutshell, we believe – and will show below – that the basic problem with the H.R.811 audit is simply the misallocation of resources. By making smarter choices, the country can achieve high confidence in the outcomes of almost all federal races with about the same amount of count-auditing effort. (We say “almost” because in extremely close races, there can always be grounds for controversy about what votes should be counted. Verifying the substantial accuracy of the count isn’t the only task – but it is a very important one.)

Looking too hard in all the wrong places

First, we examined all federal elections in the last three cycles (2002 through 2006) – the presidential race, elections for all 100 Senate seats, and almost 1,300 contested House races (about 575,000,000 total votes cast) – to explore the consequences of H.R.811’s quirky allocation of audit resources. For each race, we estimated the size of an H.R.811 audit, measured in hand-counted votes, and the probability that this audit would detect hypothetical outcome-altering miscount, using the precinct-size adjustment and the other assumptions we made in Part II of this series. We chose to measure audit size in votes rather than precincts because precincts come in many different shapes and sizes. Even under Holt’s methodology, some precincts will require only one race on their ballots to be audited, while others may require audits of all three federal contests. The cost of an audit is therefore not necessarily proportional to the number of precincts audited; it is much more closely related to the number of votes that must be accurately hand-counted.
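For readers who want to follow the arithmetic, here is a minimal sketch (in Python) of the standard calculation behind these detection probabilities. It assumes equal-sized precincts and a simple random sample, so it does not reproduce the precinct-size adjustment or the other Part II assumptions mentioned above; the function names are ours, purely for illustration.

    def miss_probability(total_precincts, corrupted_precincts, audited_precincts):
        """Chance that a simple random sample of precincts contains NONE of the
        corrupted precincts: C(N - b, a) / C(N, a), computed as a running product
        to avoid huge intermediate integers."""
        p = 1.0
        for i in range(audited_precincts):
            p *= (total_precincts - corrupted_precincts - i) / (total_precincts - i)
        return p

    def detection_probability(total_precincts, corrupted_precincts, audited_precincts):
        """Chance that the sample catches at least one corrupted precinct."""
        return 1.0 - miss_probability(total_precincts, corrupted_precincts, audited_precincts)

The confidence figures discussed below are, at bottom, variations on this calculation, with refinements such as the precinct-size adjustment layered on top.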

We then calculated the size of what we call a 95% SecureAudit in each race: an audit large enough to achieve 95% confidence of detecting outcome-altering miscount under the same assumptions we used in evaluating the H.R.811 audits. The closer the race, the larger the necessary audit. (Some folks call this idea a “probability-based audit,” as in, “based on yielding at least an ‘X’% probability of detecting outcome-altering miscount.” We don’t care much what the audits are called, as long as the idea is clear.) We did the same calculations for a 99% SecureAudit.

(The fine print: these estimates are crude because we do not use actual precinct-level data, but they do consider important variables such as the total number of precincts in each state and the average number of precincts per Congressional District, based on the statewide totals. We don’t think the estimates are biased for or against any particular audit approach. For all of these audits, we assume that at least one precinct per county is randomly audited, as H.R.811 requires, to help detect widespread systemic corruption.)
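To make the SecureAudit sizing concrete under the same simplified model, one can simply search for the smallest random sample that reaches the target detection probability. The sketch below uses a 20% within-precinct vote-switching bound (a 40-point swing in a corrupted precinct), the same figure that appears in our examples later in this piece; the exact parameters behind our tables also include the precinct-size adjustment and the one-precinct-per-county minimum, which are not reproduced here.

    from math import ceil

    def min_corrupted_precincts(total_precincts, margin, max_switch=0.20):
        """Fewest equal-sized precincts an attacker must corrupt to overturn a race
        won by `margin` (a vote-share fraction), if at most `max_switch` of each
        corrupted precinct's votes can be switched (a 2 * max_switch swing there)."""
        return ceil(total_precincts * margin / (2.0 * max_switch))

    def secure_audit_size(total_precincts, margin, confidence=0.95, max_switch=0.20):
        """Smallest simple random sample of precincts giving at least `confidence`
        probability of catching an outcome-altering pattern of miscount."""
        bad = min(min_corrupted_precincts(total_precincts, margin, max_switch),
                  total_precincts)
        miss = 1.0
        for audited in range(1, total_precincts + 1):
            # probability that the next precinct drawn is also clean
            miss *= (total_precincts - bad - (audited - 1)) / (total_precincts - (audited - 1))
            if 1.0 - miss >= confidence:
                return audited
        return total_precincts

For example, secure_audit_size(400, 0.032) asks how many of a hypothetical 400 equal-sized precincts would have to be audited to reach 95% confidence in a race with a 3.2-point margin; the closer the race, the larger the answer, exactly as described above.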

Over these elections, a 95% SecureAudit would support much greater confidence in election outcomes than H.R.811 audits, at somewhat lower cost. Across all three election cycles, H.R.811 audits would mandate manually auditing about 20.3 million votes (about 3.5% of the total vote in these elections).

Here is a table showing the number of races with outcomes that would NOT be confirmed with the H.R.811 audit for various confidence levels:


Unconfirmed Outcomes w/H.R.811 Audit

Confidence Level    # of Races
99%                 238
95%                 162
90%                 135
50%                 49

In 135 of the races, the estimated probability of detecting outcome-altering miscount would be less than 90%. In 49 races, the probability would be less than 50%. These low-confidence races include not only the very closest races, but some in which the winning margin is over 4%. They even include at least one race (NH-CD1) in which the winning margin is over 7%, but the number of precincts – and, therefore, the audit size – is unusually small. (We’re not saying that these races are likely to be decisively miscounted. We are saying that if one of them were decisively miscounted, an H.R.811 audit very well might miss the problem.)

By comparison, for an estimated total SecureAudit size of about 18.7 million ballots, over 7% less than in H.R.811 audits, we could have attained 95% confidence of detecting outcome-altering miscount for all of these races, as shown in the following table:

SUMMARY OF AUDITED VOTES

Races         H.R.811       95% Conf. SecureAudit
PRES '04      4,426,293     3,681,772
SENATE '02    1,835,416     2,063,447
HOUSE '02     2,521,626     2,556,846
SENATE '04    2,964,954     1,691,639
HOUSE '04     3,661,657     2,925,976
SENATE '06    2,098,375     1,725,986
HOUSE '06     2,774,209     4,103,394
TOTAL         20,282,530    18,749,060


Thus, high confidence levels can be achieved not by mandating “burdensome” audits across the board, but by shifting resources used in the H.R.811 audit from races and precincts where they aren’t needed to confirm the outcome, to those where they are.

Furthermore, we could attain 99% confidence for every audit in every race by auditing a total of about 22.7 million ballots, about 12% more than under H.R.811:

SUMMARY OF AUDITED VOTES

Races         H.R.811       99% Conf. SecureAudit
PRES '04      4,426,293     4,449,975
SENATE '02    1,835,416     2,457,639
HOUSE '02     2,521,626     3,142,558
SENATE '04    2,964,954     1,933,563
HOUSE '04     3,661,657     3,507,100
SENATE '06    2,098,375     2,146,380
HOUSE '06     2,774,209     5,100,798
TOTAL         20,282,530    22,738,013


If counting those additional votes to get to 99% confidence seems a bit excessive, consider that the numbers above do not include any of the various optimizations that could be used to achieve higher confidence levels, if the States were permitted to use them. We will say more about this issue below.

Why are 95% or 99% SecureAudits about the same total size as the hit-or-miss H.R.811 audits? Because, as we’ve said, the H.R.811 audit throws audit resources into races where they aren’t especially needed. Consider the 2006 California Senate race, which Dianne Feinstein won by about 24 points over Republican challenger Dick Mountjoy. Assuming for a moment that Mountjoy actually won this election, there would have to have been 20% miscount favoring Feinstein in almost half of California’s 21,000+ precincts. In principle, a truly random sample of just 10 precincts would let us detect such massive miscount with about 99.8% confidence. Our SecureAudits use 58 precincts because California has 58 counties, so they achieve a detection probability better than 99.9999999999999%. California law mandates a 1% audit (about 210 precincts) – let’s just say that would be a lot of 9s! But H.R.811 says, in effect, that even a 2% audit is too lax – we “need” a 3% audit – presumably to bolster public confidence. This mandate would hamstring California and other states by misallocating resources that should be used to confirm other races in need of stringent audits.
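To see where numbers like these come from, here is the earlier calculation applied to this example. We take “almost half” of California’s roughly 21,000 precincts to be 48%, purely for illustration; nothing below depends on actual precinct-level data.

    def miss_probability(total, corrupted, audited):
        """Chance a simple random sample of precincts contains none of the corrupted ones."""
        p = 1.0
        for i in range(audited):
            p *= (total - corrupted - i) / (total - i)
        return p

    N = 21000                        # roughly California's precinct count
    bad = int(0.48 * N)              # "almost half", taken as 48% for illustration
    for audited in (10, 58, 210):    # 10 precincts, one per county, a 1% audit
        print(audited, 1.0 - miss_probability(N, bad, audited))
    # about 0.9986 for 10 precincts; the 58- and 210-precinct samples leave miss
    # probabilities on the order of 1e-17 and 1e-60 respectively, i.e. many more 9s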

For instance, H.R.811 would settle for a 3% audit in small, competitive races such as the one in California’s 4th district. (We have no reason to think that this race was actually miscounted; our question is what sort of audit would be needed to justify confidence that it wasn’t.) The race in CA-CD4 wasn’t even all that close, by H.R.811 standards: the winning margin was 3.2 points, qualifying only for a 3% audit. However, a 3% audit in a California congressional race amounts to about 12 precincts – an audit size that would confer only about 45% confidence of detecting outcome-altering miscount. In Connecticut’s 4th district, which had a similar margin, a 3% audit would have counted only five precincts – yielding about 21% confidence. So, in the interest of public confidence, should the federal government pay to audit hundreds more precincts in an uncompetitive statewide race, or should it put some of that money into the small and close races? Which is a better use of money? Which is a better use of election officials’ time? And which is more administratively burdensome?

Notice that in 2006, the SecureAudits end up being larger overall than the H.R.811 audits. That is a good thing: it means that more races were competitive in 2006, and SecureAudits would have done a much better job of confirming the results. We don’t advocate auditing on the cheap; we advocate auditing intelligently in order to achieve high confidence across the country.

Choices, we need choices…

Another way to increase the confidence level (without increasing cost) is to randomly audit smaller units, such as individual DREs or optical scanners, instead of entire precincts. This approach can be especially helpful in states with fewer, larger precincts, such as New Hampshire. For example, if there were two scanners per precinct and 200 precincts, randomly auditing 16 scanners instead of 8 precincts in a race with a 10% margin would achieve 99% confidence, as opposed to only the 90% confidence achieved by selecting whole precincts. In each case, about 4% of the votes would be counted, but the larger number of smaller audit units results in a higher confidence level. (Of course no particular ballot type should be excluded from an audit, and this is another area where H.R.811 could stand some improvement. It’s very hard to understand how the bill deals with absentee ballots!)
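The arithmetic behind that comparison can be checked with the same kind of calculation used earlier. The sketch below assumes equal-sized audit units and the 20% vote-switching bound (a 40-point swing per corrupted unit), so roughly a quarter of the units would have to be corrupted to overturn a 10-point margin; those modeling choices are ours, for illustration.

    def miss_probability(total_units, corrupted_units, audited_units):
        """Chance a simple random sample of audit units contains none of the corrupted ones."""
        p = 1.0
        for i in range(audited_units):
            p *= (total_units - corrupted_units - i) / (total_units - i)
        return p

    # 10-point margin, at most a 40-point swing per corrupted unit, so about 25%
    # of the units must be corrupted to change the winner.
    whole_precincts = 1 - miss_probability(200, 50, 8)    # roughly 0.90
    single_scanners = 1 - miss_probability(400, 100, 16)  # roughly 0.99
    print(whole_precincts, single_scanners)

The same 4% of ballots gets counted either way; spreading the sample over more, smaller units is what buys the extra confidence.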

A purely random audit has some limitations. It doesn’t “care” whether a precinct is large or small, or whether the apparent winner did surprisingly well or poorly there. Doing away with random audits would be a terrible mistake: properly done, they provide an excellent check on the overall accuracy of the system. However, to detect possible concentrated miscount in particular races, it may be useful to add a “challenge” element to the audit. Losing campaigns could draw up lists of precincts whose results they deem most suspicious; immediately after the random sample is selected, campaigns could choose additional precincts to be audited. If some attacker did attempt to switch 20% of the votes in 5% of the precincts, any savvy analyst would probably choose at least one of those precincts. The attacker might respond by stealing fewer votes in each of a larger number of precincts, but that would increase the attack’s visibility to the random part of the audit. In simulations, we have found that adding even as few as 5 or 10 “challenge” precincts often does as much to improve detection as quadrupling the random audit size! For instance, in Connecticut-CD4 in 2006, we estimate that auditing about 10% of precincts – evenly divided between random and challenge precincts – could yield the same 99% confidence as randomly auditing about 48% of precincts. Take that result with plenty of salt, but still, it’s interesting. It also appears that auditing larger precincts more heavily than smaller precincts could be substantially more efficient than treating them all equally.
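We will not reproduce our simulations here, but the flavor of such an experiment is easy to convey. The toy model below is purely illustrative: it simply assumes that each campaign-chosen challenge precinct lands on a corrupted precinct with some fixed probability (how well analysts can actually spot suspicious returns is precisely the open question), so its output should not be read as our results.

    import random

    def detection_rate(total, corrupted, random_picks, challenge_picks,
                       challenge_hit_rate, trials=20000):
        """Toy Monte Carlo: fraction of trials in which at least one corrupted
        precinct ends up in the audit. Each challenge pick is crudely modeled as
        hitting a corrupted precinct with probability `challenge_hit_rate`."""
        bad = set(range(corrupted))          # label precincts 0..corrupted-1 as corrupted
        hits = 0
        for _ in range(trials):
            sample = set(random.sample(range(total), random_picks))
            found = bool(sample & bad) or any(
                random.random() < challenge_hit_rate for _ in range(challenge_picks))
            if found:
                hits += 1
        return hits / trials

    # Hypothetical comparison: 200 precincts, 10 of them corrupted.  A purely
    # random 10-precinct audit versus 5 random + 5 challenge precincts, with the
    # made-up assumption that challengers flag a corrupted precinct 30% of the time.
    print(detection_rate(200, 10, 10, 0, 0.0))
    print(detection_rate(200, 10, 5, 5, 0.3))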

There are other suggestions about how to improve upon the H.R.811 audits, and we won’t try to do justice to all of them here. Suffice it to say, we see no reason to settle for H.R.811’s 3%/5%/10% audits. Unfortunately, H.R.811 not only fails to implement such suggestions; it isn’t entirely clear whether it would even allow individual states to implement them. The bill does contain language that allows the National Institute of Standards and Technology (NIST) to approve alternative audit methods that are “at least as effective.” Obviously we think that a SecureAudit is much more effective than the H.R.811 audit overall – but we can imagine the lawsuits over whether it is ever acceptable to audit less than 3% of precincts in any federal race, no matter how large or how uncompetitive. Let us be clear that we support vote-count audits even in uncompetitive races, and we don’t have a fixed position on what the minimum audits should be. But we certainly don’t think that large minimums in some races should be allowed to discourage efforts to confirm the outcomes of other races. Above all, we don’t think that Congress should endorse the premise that audits that often yield less than 50% confidence – sometimes less than 20% confidence! – of detecting outcome-altering miscount are good enough. Instead of crossing our fingers that activists around the country will be allowed to fix the audits state by state, wouldn’t it be smart to do better at the federal level right now?

Click here to learn how to tell the House Administration Committee to allow the states to conduct high-assurance, probability-based audits like the SecureAudits proposed above.
