Nate Silver: Rock Star

Nate Silver, the data-wonk-cum-blogger-cum-NYT-contributor-cum-statistical-demi-god-cum-media-darling of this week’s election…well, he got it right (compare his “forecast” with his “nowcast”). But what, exactly, did he get right, and how did he do it? The media and various pundits are enamored with Silver’s moxie and uncanny accuracy in predicting the election’s outcome. But it appears to me that the big winner of this story is cold, hard, sober data analytics. His blog is a playground of interesting, practical, well-founded analysis of data, data, and more data. This is big data, huge data, culled from multiple sources and giving specific state-by-state snapshots of the situation on the ground over time. Synthesizing all this information into a set of predictions is no trivial task.

Why are we so enamored when someone uses math productively?  Think of it this way:  in popular culture, when someone is good at math, people say: “wow, you must be really smart”.  But when someone is good at, say, history, people say: “wow, you must really like history, and you probably studied a lot to get to be so knowledgeable about history.”  It’s a bias, plain and simple, against the kind of basic quantitative literacy that will only become more important to this nation and the world over time.  How can we evaluate election results, pollution data, SOL outcomes, or any other quantitative information without a basic foundation in, and respect for, general quantitative literacy?

So what did Silver get right?  He looked at the whole range of polling data available over time, and worked to evaluate the quality of that data by examining potential error/bias embedded in it.  The key was that he aggregated data from multiple sources and finessed an understanding of the sources of (and magnitude of) the error in the aggregated dataset.  This is quite contrary to John Oliver’s Twitter-based, real-time approach to prognostication (starts at about 2:45 of the clip). Silver’s brand of sober analysis leads to forceful predictions like this (from his blog, 11/3/12):

To be exceptionally clear: I do not mean to imply that the polls are biased in Mr. Obama’s favor. But there is the chance that they could be biased in either direction. If they are biased in Mr. Obama’s favor, then Mr. Romney could still win; the race is close enough. If they are biased in Mr. Romney’s favor, then Mr. Obama will win by a wider-than-expected margin, but since Mr. Obama is the favorite anyway, this will not change who sleeps in the White House on Jan. 20.

My argument, rather, is this: we’ve about reached the point where if Mr. Romney wins, it can only be because the polls have been biased against him. Almost all of the chance that Mr. Romney has in the FiveThirtyEight forecast, about 16 percent to win the Electoral College, reflects this possibility.

Yes, of course: most of the arguments that the polls are necessarily biased against Mr. Romney reflect little more than wishful thinking.
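The reasoning in that passage can be made concrete with a toy calculation. What follows is only a rough sketch under stated assumptions, not Silver's actual model: the polls are made-up numbers, sampling error is approximated as 100/sqrt(n) points for a near-even two-way race, and `shared_bias_sd` is a hypothetical parameter for an error common to every poll. Averaging more polls shrinks the sampling error, but a shared bias does not average away, and that is where the trailing candidate's remaining chance comes from.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical polls: (leader's margin in percentage points, sample size).
# These numbers are invented for illustration only.
POLLS = [(3.0, 800), (1.0, 1200), (2.5, 600), (2.0, 1000)]


def pooled_margin(polls):
    """Sample-size-weighted average margin and its sampling standard error.

    Assumes independent polls of a near 50/50 two-way race, so the margin's
    sampling error for n respondents is roughly 100 / sqrt(n) points.
    """
    total_n = sum(n for _, n in polls)
    mean = sum(margin * n for margin, n in polls) / total_n
    sampling_se = 100.0 / sqrt(total_n)
    return mean, sampling_se


def trailing_win_probability(polls, shared_bias_sd):
    """Chance the true margin is below zero, i.e. the trailing candidate wins.

    shared_bias_sd (hypothetical) is the standard deviation, in points, of an
    error shared by all polls; it adds to the variance no matter how many
    polls are averaged.
    """
    mean, sampling_se = pooled_margin(polls)
    total_sd = sqrt(sampling_se ** 2 + shared_bias_sd ** 2)
    return NormalDist(mu=mean, sigma=total_sd).cdf(0.0)


if __name__ == "__main__":
    for bias_sd in (0.0, 2.0):
        p = trailing_win_probability(POLLS, bias_sd)
        print(f"shared bias sd = {bias_sd} pts -> trailing candidate wins with p = {p:.3f}")
```

With any set of polls showing a lead, the trailing candidate's probability rises as soon as a shared bias term is allowed, and adding more polls to the list does nothing to reduce that part of the uncertainty.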

It is both unfortunate and energizing to think that the general public (and the pundits in particular) might not fully appreciate how math works, how practical it can be, and why a systematic consideration of not just the mathematical operations, but also the quality of the input data, can lead to better predictions, or better policies, or better profits, or better quality of life. Unfortunate and frustrating, perhaps. But it’s also an important opportunity for us, as academics and people in the science/technology/mathematics literacy world (we are, after all, in higher education), one that should energize us with the challenge that lies ahead.

A Bold Proposal: Let’s develop a course on information literacy, required for every student at UVa, and continuously measure the outcomes and impact of that course on how students approach their academics and their lives. An educated, global citizenry requires nothing less.

