Last month I discussed the allegation that the results of the Russian parliamentary elections violate mathematics. The figure below shows the distribution of each party's percent of the vote among election precincts. It has travelled across hundreds of blogs during the past couple of weeks, accompanied by comments that the distribution for the United Russia party violates a fundamental law of nature because it is non-Gaussian. In that previous article I argued that there is no reason for the distribution to be Gaussian. However, a commenter challenged me to show non-Gaussian distributions in US elections, so I took up the challenge.
I decided to look at the 2008 Republican primaries, mainly because this was the last election I voted in. The primaries differ from national elections in that different states vote on different dates, and some candidates drop out during the race, which complicates the analysis. However, 21 states voted on the same day, known as Super Tuesday. Since almost half of the nation votes on that day, it resembles a national primary. The most complete election results database I could find is Dave Leip's Atlas of U.S. Presidential Elections (http://uselectionatlas.org/). It does not have precinct-level results for the election in question, but it has results by county for 19 of the 21 Super Tuesday states (all but Alaska and North Dakota). I computed the distribution of the percent of the vote for the four major candidates among 1,162 counties. See Table 1 and Figure 2.
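For readers who want to try the same kind of tally, here is a minimal sketch of how such a distribution can be computed from county-level counts. It is not the script I used: the function name `vote_share_histogram`, the 5-point bin width, and the sample numbers are all assumptions made for illustration, and the sample pairs are invented, not real county results.

```python
from collections import Counter

def vote_share_histogram(counties, bin_width=5):
    """Count counties by a candidate's percent of the vote.

    `counties` is a list of (candidate_votes, total_votes) pairs.
    Returns {bin_start_percent: number_of_counties}.
    """
    counts = Counter()
    for candidate_votes, total_votes in counties:
        if total_votes == 0:
            continue  # no votes cast, so the percent is undefined
        percent = 100.0 * candidate_votes / total_votes
        # A 100% result goes into the last bin instead of a bin of its own.
        bin_start = min(int(percent // bin_width) * bin_width, 100 - bin_width)
        counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Invented (candidate votes, total votes) pairs, NOT real county data.
example_counties = [(120, 400), (90, 300), (45, 300), (800, 1000), (10, 10)]
histogram = vote_share_histogram(example_counties)
print(histogram)
```

Each county counts once regardless of its size, so this is a distribution over counties, not over voters.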
As you can see in Figure 2, the Huckabee distribution has two equal peaks, at 15 and 35 percent. Between them the distribution dips to half the peak height. The McCain distribution has one peak at 35% and another at 80%; between these peaks it drops almost to zero. Romney has one peak at 25% and another at 90%. Paul's distribution is exponential and resembles that of “Yabloko” in the Russian elections. There is little that is Gaussian about these distributions. Apparently, American elections also “violate Gauss’s groundbreaking work on statistics.”
Another issue brought to light by bloggers is that there are spurious peaks at 50% and other multiples of 10 (see Figure 1). However, when you look at the precinct-level results, you notice that in many precincts very few people voted, as few as one in some of them. When 2, 4, 6, 8, or 10 people vote, you can get a 50% result but never a 49% result. Can this explain all of the peaks? I plan to address this question in a future article.
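The arithmetic behind the small-precinct argument is easy to check. The sketch below is only an illustration: the function name `precinct_sizes_reaching` is made up for this example, and it tests for an exact percentage using integer arithmetic, with no rounding. It lists the precinct sizes for which some whole number of votes gives exactly the requested percent.

```python
def precinct_sizes_reaching(percent, max_voters):
    """Return the precinct sizes (1..max_voters) for which some whole
    number of votes yields exactly `percent` percent of the vote."""
    return [
        voters
        for voters in range(1, max_voters + 1)
        # Integer test for votes / voters == percent / 100, avoiding floats.
        if any(100 * votes == percent * voters for votes in range(voters + 1))
    ]

sizes_for_50 = precinct_sizes_reaching(50, 10)
sizes_for_49 = precinct_sizes_reaching(49, 10)
print(sizes_for_50)  # precinct sizes where exactly 50% is possible
print(sizes_for_49)  # precinct sizes where exactly 49% is possible
```

With at most 10 voters, an exact 50% is possible for every even precinct size, while an exact 49% is impossible: it requires the number of voters to be a multiple of 100.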