By predicting the outcome of the US elections correctly in 50 out of 50 states, Nate Silver of the NY Times' FiveThirtyEight blog has managed to convince even the most sceptical data deniers of the power of his prediction models. So much so that his perfect prediction started a Twitter trend (#natesilverfacts) and led to him being labelled a witch. So how impressive was this feat really? Is Nate Silver really a wizard from the future aiming for world domination through the power of numbers? Let's use some stats to assess his stats!
Let's start by toning down Silver's amazing feat of predicting the election outcomes in 50 separate states. In most US states, no complex prediction model was needed to come to a reliable estimate of the outcome: some results, such as in the District of Columbia where over 90% of the population voted for Obama, were effectively uncontested. The same goes for other blue Obama-voting states such as California (59.1%), Hawaii (70.6%), Maryland (61.7%) or New York (62.6%), or red Romney states such as Oklahoma (66.8% voted GOP), Utah (72.8%), Alabama (60.7%) or Kansas (60.0%).
Only in the swing states, which could go either way, would Nate Silver have needed his number crunching to call a winner. If we go by the NY Times' numbers, only 7 states were a toss-up between the Democrats and Republicans: Colorado, Florida, Iowa, New Hampshire, Ohio, Virginia and Wisconsin. Treating those 7 states as coin tosses, where each outcome has an equal 50% probability, we can test the hypothesis that Nate Silver is a witch, Hwitch, against the competing hypothesis that he is a completely non-magical human being, Hmuggle. If Nate is a witch, we assume he predicts each state's election result correctly, witches having perfect knowledge of all future events. The probability of this happening is expressed in a fancy maths equation like this: p(7 right|Hwitch), read as the probability of Nate getting 7 right, given that he is a witch. The probability in this case is 100%, or 1. But even if Nate is devoid of magical abilities, there is still a small chance he would guess all 7 election results correctly. We can calculate this probability: p(7 right|Hmuggle) = 1/2^7 = 1/128. If we take the ratio of the two, 1/(1/128), it is about 128 times more likely that Nate is a witch than a muggle.
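For the sceptics who want to check the arithmetic themselves, here is the likelihood ratio as a quick back-of-the-envelope Python sketch (illustrative numbers from the coin-toss argument above, not anything from Silver's actual model):

```python
# Likelihood of a perfect 7-state call under each hypothesis.
p_perfect_given_witch = 1.0        # a witch foresees all 7 swing states
p_perfect_given_muggle = 0.5 ** 7  # 7 independent 50/50 guesses

# Ratio of the two likelihoods: how much better "witch" explains the data.
likelihood_ratio = p_perfect_given_witch / p_perfect_given_muggle
print(likelihood_ratio)  # 128.0
```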
Whatever the truth is about Nate Silver, it appears he's pulled off something pretty extraordinary. Unfortunately for him, he's still one step removed from being the world's best predictor, as Paul the psychic octopus managed to correctly predict the outcomes of 8 football matches at the 2010 World Cup. World's best human predictor will have to do for now then.
However, as with Paul, Nate wasn't the only person making predictions. Paul only gained the street cred necessary to be taken seriously as a clairvoyant cephalopod after a bout of predicting Eurocup results (and getting one wrong), and the same could be said for Nate Silver. If he hadn't pulled off a similar feat in the previous elections, no one would have paid much attention to his blog this time round. His 2008 prediction was perhaps even more impressive than his latest one: he might have missed Indiana, but got the results for the remaining 10 swing states right.
As polls get about the same amount of coverage (if not more) as the actual elections, there are a lot of people who try to pitch in. Let's take a guess and say there were 50 people trying to predict the state-by-state 2008 election outcomes. The chance that at least one of them would get at least 8 of the 11 swing states correct (assuming this would be the threshold to attract the attention of witch hunters) is 1 - (255/256)^50 ≈ 0.18 (for the reasoning behind this, read David Spiegelhalter's blog on the numbers behind Paul being a completely normal, if lucky, octopus). So there was about a 1 in 5 chance of at least someone coming up with some remarkably correct predictions.
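The calculation above can be sketched in a few lines of Python. The 1/256 figure is the same shortcut used in the Paul-the-octopus analysis: the chance that a single lucky guesser clears the bar.

```python
# Probability that at least one of 50 forecasters matches Nate's 2008 feat.
n_forecasters = 50
p_single = 1 / 256  # chance a single muggle forecaster clears the bar

# Complement trick: 1 minus the chance that *everyone* falls short.
p_at_least_one = 1 - (1 - p_single) ** n_forecasters
print(round(p_at_least_one, 2))  # 0.18
```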
So we now know that frequentist statisticians would label Silver as a witch, but what about the much cooler Bayesians (no bias at all here…)? Bayesian statistics differs from frequentist statistics in that it takes prior knowledge into account when putting a probability on an event. Or, put differently: Bayesian statistics is probably a cool branch of stats, but if you know XKCD thinks so too, it suddenly becomes a lot more probable to be true (the coolness of a specific branch of statistics is conditional on XKCD endorsement).
To calculate the posterior probability of Nate Silver being a witch, we need to know a few things:
- p(W), or the prior probability that Nate Silver is a witch, regardless of any other information. This will depend on the prevalence of witches in Silver's hometown, New York. According to this NY Meetup page, there are 3,023 witches in NY. Considering the population of the whole city (8,244,910 according to the US census), the prior probability of a random person in NY being a witch is 0.0004.
- p(W'), or the probability that Nate is a muggle regardless of any other information, and that's 1 – 0.0004 or 0.9996 in this case.
- p(P|W), the probability of Nate making a perfect prediction, given that he's a witch: 100%, or 1.
- p(P|W'), the probability of Nate making a perfect prediction as a muggle, which we put at 1/128 or 0.008 earlier.
- p(P), the probability of making a perfect prediction, regardless of any other information. Using the law of total probability (summing over both hypotheses, each weighted by its prior), this is 1 × 0.0004 + 0.008 × 0.9996 = 0.0084.
Now that we know all this, we can fill out Bayes' formula for the posterior probability: p(W|P) = p(P|W) × p(W) / p(P) = (1 × 0.0004) / 0.0084 ≈ 0.048, or about 5%.
That's pretty slim, though at 5%, we can't be sure he isn't a witch. However, going back to the 2008 elections, there were already some suspicions of Nate Silver's potential Wiccan background. If we instead use the 0.18 probability we arrived at earlier as our prior, the posterior probability of Nate Silver being a witch rises to 0.96, or 96%.
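Both posteriors drop out of the same small Python function, using the numbers from the bullet list above (the 0.008 muggle likelihood and the two priors, 0.0004 and 0.18):

```python
def posterior_witch(prior, p_given_witch=1.0, p_given_muggle=0.008):
    """Bayes' theorem: p(W|P) = p(P|W) * p(W) / p(P)."""
    # p(P) via the law of total probability, weighted by the priors.
    evidence = p_given_witch * prior + p_given_muggle * (1 - prior)
    return p_given_witch * prior / evidence

print(round(posterior_witch(0.0004), 3))  # 0.048 -- the NY witch-prevalence prior
print(round(posterior_witch(0.18), 2))    # 0.96  -- the post-2008 suspicion prior
```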
So yes, Nate Silver is probably a witch. Alternatively, you could of course exchange ‘witch’ with ‘statistician’ and conclude with 96% confidence that he’s just very good at his job.