To predict an election win for Donald Trump you need three pieces of information:
- How people plan to vote – for Donald Trump or somebody else.
- Where they plan to vote – because only the swing states matter in the US electoral college.
- Whether they actually do vote – because around 40% of people don’t vote – either because of lack of motivation, not enough information on where and how to vote, or because it’s been made hard or impossible by voting laws (e.g. requiring a specific type of ID to vote).
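To see why all three pieces are needed together, here is a minimal sketch in Python. Everything in it is a hypothetical illustration: the state names, electoral vote counts, intention shares and turnout rates are made up, and it assumes a simple two-way, winner-take-all contest in every state.

```python
# Hypothetical inputs per state (illustrative numbers only, not real data):
# (electoral votes, share intending to vote for the candidate,
#  turnout among the candidate's supporters, turnout among everyone else)
states = {
    "State X": (10, 0.52, 0.55, 0.62),
    "State Y": (20, 0.48, 0.65, 0.55),
}

def electoral_votes_won(states):
    """Sum electoral votes for states where the candidate's *actual* votes
    (intention x turnout) exceed the opposition's. Assumes winner-take-all."""
    won = 0
    for ev, intent, turnout_for, turnout_other in states.values():
        votes_for = intent * turnout_for
        votes_other = (1 - intent) * turnout_other
        if votes_for > votes_other:
            won += ev
    return won

print(electoral_votes_won(states))
```

In this made-up example the candidate leads on intention in State X but loses it on turnout, and trails on intention in State Y but wins it. Knowing how people plan to vote, without where and whether, gives the wrong answer in both states.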
I’m a great fan of social listening – in fact I’ve been using it daily for 10 years for scores of clients. But it’s simply not up to these three challenges.
Who are you listening to?
Firstly, we can’t tell anything about whether people are able to vote. To do that you’d need a database of everyone in the USA who is registered to vote – and a reliable way to cross-reference them to social media accounts.
There are ways to do this, and doubtless the Democrats (who are years ahead of the Republicans in this field) have done some of this. Political parties have access to large voter databases, lots of data on how people have voted in the past, and models of how different groups are thinking about the election. But it’s hard, messy, and nobody has published results in the public domain.
It’s worth noting at this stage that turnout is probably where the state-level polls got the election wrong. Trump supporters were more enthusiastic, and Clinton supporters less enthusiastic, than expected. This is the best explanation for why both campaigns were surprised that Clinton lost.
Where: On Twitter, nobody knows if you’re a dog (or in North Dakota)
Secondly, social media geographical data isn’t very accurate outside Facebook, and Facebook’s data is largely private. So while we can broadly tell if, for instance, most Trump-supporting discussion is in a swing state like New Hampshire or in neighbouring, safely Democratic Massachusetts, we can’t do it with enough accuracy to predict results in a close election, where a shift of roughly 1 in every 1,000 voters (in the right states) could have changed the result. Again, if you combine social data with an external database, you can get more accurate answers.
Twitter isn’t everything
Thirdly, Twitter isn’t representative of the broader population. Virtually all the data used for these social analyses comes from Twitter – which is used by only 67m Americans, roughly 21% of the population. Its users skew young and generally wealthier, both of which are correlated with voting Democrat. And while we don’t know much about how passive Twitter users (i.e. people who never tweet and just read other people’s tweets) differ from active users, it would be surprising if they were identical.
There are ways to get more representative data (though still a long way from perfect), not least through Facebook’s excellent Pylon product and various proprietary techniques which we use for clients at Ogilvy (which allow re-weighting to the total population). But again, none of these analyses use them.
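As a rough illustration of what re-weighting to the total population means, here is a minimal sketch of textbook post-stratification in Python. It is not the proprietary Ogilvy technique or anything Pylon does. The age bands, sample sizes, support levels and population shares are all hypothetical numbers chosen for the example.

```python
def poststratify(sample_support, sample_counts, population_shares):
    """Return (raw, weighted) support estimates.

    raw      -- support averaged over the sample as collected
    weighted -- support with each group counted by its share of the
                real population instead of its share of the sample
    """
    total = sum(sample_counts.values())
    raw = sum(sample_support[g] * sample_counts[g] for g in sample_counts) / total
    weighted = sum(sample_support[g] * population_shares[g] for g in population_shares)
    return raw, weighted

# Hypothetical, illustrative numbers only: a sample that over-represents the young.
sample_counts = {"18-29": 500, "30-49": 300, "50+": 200}
sample_support = {"18-29": 0.30, "30-49": 0.45, "50+": 0.60}
population_shares = {"18-29": 0.20, "30-49": 0.35, "50+": 0.45}

raw, weighted = poststratify(sample_support, sample_counts, population_shares)
print(f"raw: {raw:.4f}, weighted: {weighted:.4f}")
```

With these made-up inputs the raw sample puts support at about 40%, while the re-weighted figure is closer to 49%. A sample that skews young can move the headline number by several points, far more than the margin in a close election. Re-weighting only corrects for the characteristics you can measure, which is one reason it remains a long way from perfect.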
So, yes, social analytics can be very powerful in understanding what people are saying. It can also be very powerful in testing messages, honing them, and understanding how people hear and react to them. But back-of-the-envelope analyses, without even the most rudimentary checks for robustness, don’t cut it.