[Image courtesy of funandfunction]
Ethan Roeder, data director for Obama for America, has a terrific op-ed in the New York Times entitled "I'm Not Big Brother" responding to various misconceptions about the role of big data in campaigns.
There's lots of good stuff in there - he had me at "malarkey" - but I want to focus on two grafs about midway through that neatly describe what "big data" is all about:
There are two categories of online data: information users provide explicitly, and stuff they communicate implicitly through their behavior. The explicit data includes e-mails and comments that users share directly. The implicit data comes from "click tracking," which tells a campaign what buttons are getting pressed and how often. Combined, these two categories of data allow a campaign to put together an online experience that will resonate with as many people as possible, but also to customize the experience so that you are more likely to encounter content that's relevant to you.
At times it might seem like sorcery to the recipient of a targeted e-mail, but it's just a product of two simple factors: remembering who you are and remembering what you like.
Roeder then goes on to talk about how this data is married to the age-old science of modeling to give campaigns an educated guess about who will turn out to vote:
Cheaper and more plentiful computing power allows campaigns to process far more information than ever before to look for patterns, trends and correlations ... campaigns use statistical techniques to apply these assumptions to individual records in the voter file rather than stopping short and simply assuming that entire sections of the electorate will behave identically.
Obviously, this kind of information - properly used - is a powerful tool. I wonder if election officials shouldn't be doing some of this as well: not to predict for whom a voter will cast a ballot, but in what form and when. As more and more jurisdictions move to multi-modal election systems that blend in-person and mail voting, both on and before Election Day, having some intelligence about voter demand for election services will be invaluable in allocating ballots, machines and materials in a way that serves the greatest number of voters in the most cost-effective manner.
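To make that a little more concrete, here is a minimal sketch in Python of what such a demand estimate might look like. Everything in it is an assumption for illustration: the record layout, the per-voter probabilities (which would come from whatever model an election office trusts), and the function names `expected_demand` and `allocate` are all hypothetical, not anything Roeder or any campaign describes. The idea is simply to add up individual-level likelihoods of voting early, by mail, or on Election Day, and then split a fixed supply of machines across sites in proportion to the expected turnout.

```python
from collections import defaultdict

# Hypothetical voting modes; a real jurisdiction may track others.
MODES = ("early", "mail", "election_day")


def expected_demand(records):
    """Sum per-voter probabilities into expected voters per precinct and mode.

    Each record is assumed to be a dict with a "precinct" key plus one
    modeled probability per mode, e.g. {"precinct": "A", "early": 0.2, ...}.
    """
    demand = defaultdict(lambda: dict.fromkeys(MODES, 0.0))
    for record in records:
        for mode in MODES:
            demand[record["precinct"]][mode] += record[mode]
    return dict(demand)


def allocate(total_units, demand_by_site):
    """Split a fixed number of units (machines, say) across sites in
    proportion to expected demand, using largest-remainder rounding so
    the whole-number allocations still add up to total_units."""
    total = sum(demand_by_site.values())
    if total <= 0:
        return {site: 0 for site in demand_by_site}
    quotas = {s: total_units * d / total for s, d in demand_by_site.items()}
    allocation = {s: int(q) for s, q in quotas.items()}
    leftover = total_units - sum(allocation.values())
    # Hand out the remaining units to the sites with the largest fractions.
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - allocation[s], reverse=True)
    for site in by_remainder[:leftover]:
        allocation[site] += 1
    return allocation


if __name__ == "__main__":
    # Made-up records, purely to show the shape of the data.
    voter_file = [
        {"precinct": "A", "early": 0.25, "mail": 0.25, "election_day": 0.5},
        {"precinct": "A", "early": 0.0, "mail": 0.5, "election_day": 0.5},
        {"precinct": "B", "early": 0.5, "mail": 0.25, "election_day": 0.25},
    ]
    demand = expected_demand(voter_file)
    election_day = {p: d["election_day"] for p, d in demand.items()}
    print(allocate(10, election_day))
```

Proportional allocation is only one possible choice here; an office might instead set minimums per site or weight by expected wait times. The point is that the inputs are the same kind of record-level estimates campaigns already work with.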
Ethan's right that campaigns don't have a "giant blue computer sitting on the 101st floor of a sleek skyscraper, surrounded by bubbling tubes of illuminated liquid, spitting out the manifest destiny of America's voters" (the above image will have to do) - but they do have access to data and analysis that could (and should) be invaluable to the next generation of election officials.