
Viewpoint: Did Nate Silver beat the tortoise?

January/February 2013

By Arnold Barnett

In making election forecasts for the FiveThirtyEight blog (538) at the New York Times, Nate Silver uses a statistical model that is subtle, sophisticated and comprehensive. Real Clear Politics uses a shallow approach to forecasting that could have been devised by a statistical Forrest Gump. But which forecaster better predicted the results in the 2012 presidential election? Did the intellectual tortoise hold its own against the hare?

From a conceptual standpoint, it should have been no contest. In an approach that would make statisticians shudder, Real Clear Politics (RCP) estimated the Obama/Romney difference in a given state by the simple average of differences in recent polls. Differences in sample sizes were ignored, the word “recent” was defined differently in different states, undecided voters were simply excluded, and evidence that some polls skew toward Republicans and others toward Democrats got no weight. The 538 model, in contrast, avoided all these limitations, and took account of correlations among outcomes in similar states and the demographic makeup of each.
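To make the contrast concrete, here is a minimal Python sketch of the RCP-style calculation: an unweighted average of the Obama-minus-Romney margins in recent polls. The poll figures below are hypothetical, chosen only to illustrate the mechanics.

```python
# Minimal sketch of the RCP-style estimate: the unweighted mean of the
# Obama-minus-Romney margins (in percentage points) from "recent" polls.
# Sample sizes, undecided voters and pollster house effects are all ignored,
# exactly the limitations noted above. The poll figures are hypothetical.

def rcp_margin(poll_margins):
    """Average the raw poll margins, giving every poll equal weight."""
    return sum(poll_margins) / len(poll_margins)

recent_polls = [2.0, -1.0, 0.5]  # three hypothetical recent polls in one state
print(f"RCP-style estimate: Obama by {rcp_margin(recent_polls):+.2f} points")
```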

From Theory to Practice

But how did the final state-by-state predictions under the two approaches compare in accuracy? RCP made forecasts in only 30 of the 51 states (including the District of Columbia), but those 30 included all the swing states and all the large states. And, at first blush, it might appear that the race between the two methodologies in the 30 states was (in the familiar phrase) too close to call.

The most obvious dimension for comparison is the bottom line: Did the forecast in a given state correctly identify the winner there? By that standard, both methods did very well: In 29 of the 30 states, they agreed who the winner would be and that candidate actually won. (Complete data tables will appear in a longer version of this article in the February 2013 issue of OR/MS Today.) In Florida, neither forecaster called the state correctly: RCP erroneously projected a narrow Romney victory (1.5 percentage points), while 538 projected an exact tie (and thus abstained from naming a winner). Obama carried Florida by 0.9 percentage points. We can say, therefore, that 538 scored a partial victory over RCP in one of the 30 states, but that is hardly a decisive advantage.

As for the absolute forecast errors in the various states, the results were once again similar. The mean absolute error over the 30 states was 2.87 percentage points for RCP and 2.25 for 538. However, there is a “blue state bias” among the 30 states: Romney carried only 27 percent of them (eight out of 30), while he captured 47 percent (24 out of 51) in the entire nation. When an adjustment is made for this bias, the mean absolute error becomes 2.57 points for RCP and 2.33 for 538. This revised difference of one-quarter of one percentage point is hardly decisive.
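The article does not spell out the mechanics of the adjustment, but a natural reading is that the red-state and blue-state errors are reweighted to match the national mix of states. The sketch below implements that assumption with placeholder error figures, not the actual state-level data.

```python
# Hedged sketch of the blue-state-bias adjustment, ASSUMING it reweights the
# mean absolute errors of red and blue states to the national mix (24 of 51
# states for Romney, 27 of 51 for Obama) instead of the 8-of-30 mix in the
# compared sample. The error lists are placeholders, not the actual data.

def adjusted_mae(red_errors, blue_errors, red_share=24/51, blue_share=27/51):
    """MAE with red/blue state groups reweighted to their national shares."""
    red_mae = sum(abs(e) for e in red_errors) / len(red_errors)
    blue_mae = sum(abs(e) for e in blue_errors) / len(blue_errors)
    return red_share * red_mae + blue_share * blue_mae

red = [1.5, 3.0, 2.0]   # hypothetical errors in states Romney carried
blue = [2.5, 3.5, 1.0]  # hypothetical errors in states Obama carried
print(f"Adjusted MAE: {adjusted_mae(red, blue):.2f} points")
```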

On the Other Hand

Yet this aggregate analysis is oblivious to the central dynamic of the 2012 election. Given the realities of the Electoral College, the candidates and everyone else recognized that the outcome would be determined by what happened in about a dozen “swing states” that either candidate could plausibly win. In the other states, the winner was a foregone conclusion so there was little campaigning and little interest in polling results.

Under the circumstances, a comparison between RCP and 538 should focus primarily if not exclusively on their accuracy in swing states. RCP identified 11 states as “toss up” just before the election: Colorado, Florida, Iowa, Michigan, New Hampshire, Nevada, North Carolina, Ohio, Pennsylvania, Virginia and Wisconsin.

Within these states, the two approaches differed markedly in performance. 538 outperformed RCP in absolute forecast accuracy in all but one of the 11 swing states (Ohio). Both forecasters were on average more favorable to Romney than the actual voters, but the net “bias” was only 0.76 percentage points for 538 over the 11 states, as opposed to 2.44 points for RCP. That difference of 1.68 (2.44 - 0.76) points is especially noteworthy because regression analysis makes clear that 538’s estimates of Obama’s performance ran consistently about 1.5 points higher than RCP’s. Again and again, this adjustment was vindicated by the swing-state results: RCP underestimated Obama’s actual vote share, while 538 eliminated roughly 75 percent of the underestimation.
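A minimal sketch of the “net bias” calculation, i.e., the mean signed forecast error across the swing states, follows. The margins are placeholders chosen for illustration, not the actual 2012 figures.

```python
# Sketch of the net-bias comparison: the mean signed error (forecast margin
# minus actual margin, Obama minus Romney, in percentage points). A negative
# bias means the forecasts favored Romney relative to the voters. All margins
# below are hypothetical placeholders, not the actual 2012 data.

def net_bias(forecasts, actuals):
    """Mean signed forecast error in percentage points."""
    return sum(f - a for f, a in zip(forecasts, actuals)) / len(actuals)

actual = [3.0, 5.4, 0.9]    # hypothetical actual Obama margins in three states
rcp    = [0.5, 3.0, -1.5]   # hypothetical RCP forecasts (leaning Romney)
p538   = [2.5, 4.8,  0.0]   # hypothetical 538 forecasts (less Romney-leaning)
print(f"RCP bias: {net_bias(rcp, actual):+.2f} points")
print(f"538 bias: {net_bias(p538, actual):+.2f} points")
```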

In the 19 of the 30 originally compared states that were not swing states, 538 and RCP performed about equally well, which is why statistics based on all 30 states show less disparity between the two approaches than the swing states alone. It could be that Obama outperformed the swing-state polls on which RCP relied because of the major voter-turnout drives his campaign undertook in those states, drives that brought to the voting booths many people whom pollsters had not counted among “likely” voters. In the other states, the Obama campaign may not have waged such efforts, so no comparable “surge” occurred.

Nate Silver would be the first to agree that his state-by-state forecasts were correlated, and that circumstance stymies any assessment of whether his swing-state victory over RCP was statistically significant. In effect, he made an all-or-nothing bet on the premise that the polls underestimated Obama’s strength in swing states: Had this premise been wrong, his 10-1 victory over RCP could easily have been an 11-0 defeat. Yet uncertainties about how to define statistical significance cannot obscure the fundamental point: 538 did extremely well in 2012 in those states where accuracy was most important.
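To see why the correlation caveat matters, consider what independence would imply. If the 11 swing-state comparisons were independent 50/50 contests, a record of 10 or more wins would arise by chance only about 0.6 percent of the time; the sketch below computes that binomial tail. Because the forecasts were in fact strongly correlated, the calculation illustrates only what cannot legitimately be concluded.

```python
# If the 11 swing-state comparisons were independent 50/50 coin flips, how
# often would one forecaster win 10 or more of them by luck alone? Because
# the state-by-state forecasts are correlated, this binomial calculation does
# NOT apply to the actual data; it only shows what independence would imply.

from math import comb

def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(f"P(10+ wins of 11, if independent): {binomial_tail(11, 10):.4f}")  # ~0.0059
```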

Final Remarks

So how does it all add up? Under the Occam’s Razor principle, there is a clear starting preference for simple models over more complicated formulations. A complex model must justify its intricacy by offering more accurate information than a simpler counterpart; moreover, this added information should arise in places where it is most needed. In the present setting, the question is whether Nate Silver’s 538 model outperformed the straightforward RCP method to an extent that makes 538 the wiser choice, even if the less transparent one.

Readers can reach their own judgments, but because of the results in the swing states, the author believes that 538 met the test for superiority just posed. While the tortoise catches up with the hare in the nursery stories, it seems here that the hare won hands down. But the outcome does not contradict Aesop’s fable because, far from being lazy, the 538 hare ran the race as hard as it could. And, if the evidence is any guide, it is very much a world-class runner.


Arnold Barnett (abarnett@mit.edu) is the George Eastman Professor of Management Science at the MIT Sloan School of Management. His research specialty is applied mathematical modeling with a focus on problems of health and safety. Barnett is a senior member of INFORMS.
