By Pete Holley, Head of Measurement, Realtime
The oldest cliché in advertising is “Half of my advertising spend is wasted, but I don’t know which half.” When social media advertising first appeared, a big part of its appeal was that it provided an answer to that question, even if that answer was cautious and incomplete.
Social media advertising has always generated a huge amount of data, and we have devised increasingly sophisticated ways of looking at it. We have gone from numbers in spreadsheets – with a graph if you’re lucky – to dashboards that can slice, filter and plot the data in endless permutations. However, being able to look at the data in many different ways doesn’t necessarily tell you what it means. Social media data is noisy because it’s derived from human behaviour, which is influenced by many factors that vary from place to place, person to person, and time to time.
In order to answer our fundamental question, we have to differentiate the signal from the noise. We can look at differences in performance and compare them with differences in advertising. Many people look to “data science” for these answers, but data science is a young discipline, and many data scientists are more concerned with preparing and presenting data than with understanding what it really means and what we can infer from it.
We might see that if we advertise in London, but not in Birmingham, we get more sales in London. You can easily see that in a graph, or in a GIS map if you’re being clever. But is the difference in sales because of advertising, or because of some other difference between London and Birmingham? Or is it just noise and not a real difference at all?
There are ways of answering these questions (and of course, some data scientists already use them). Disciplines like psychology and sociology, amongst others, have been addressing related issues for years using a scientific approach.
In science, we don’t just try to find just-so stories that give us post-hoc explanations for things, like “Sales went up, we spent more money on Facebook, that must be why.” Instead, we measure things, we create models of how they’re related, we use the models to make predictions, and then we check if the predictions are correct. If the predictions are wrong, we need to change our model.
The discipline of statistics has evolved alongside this approach as a way of carrying out the measuring and checking steps of the process. Typically, you’d be checking for a relationship between the things you’re measuring.
For many years, the checking was usually done using a range of techniques based on the idea of looking at your results and working out the probability of getting results at least as extreme as those if there were really no relationship between the things you’re measuring. If that probability is low, we infer that there is a relationship after all, which counts as evidence in favour of our model. This rather roundabout approach is called null hypothesis significance testing (NHST). It’s counter-intuitive and confusing if you’re not used to it – and many professionals get confused by it – but it’s still useful if done properly.
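To make the NHST idea concrete, here is a minimal sketch of one such technique, a permutation test, applied to the London and Birmingham question. The daily sales figures are invented purely for illustration, and the function name is our own. The test shuffles the city labels many times to see how often a difference at least as large as the observed one turns up when the labels are meaningless.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: the group labels are irrelevant, so any split of the
    pooled values is as likely as the one we actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Count the observed labelling itself so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical daily sales: London was advertised to, Birmingham was not.
london = [132, 141, 128, 150, 137, 145, 139, 148]
birmingham = [120, 126, 131, 118, 125, 129, 122, 127]

p_value = permutation_p_value(london, birmingham)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value says the gap is unlikely to be noise alone. It does not say the gap was caused by advertising rather than by some other difference between the two cities; that still depends on how the comparison was set up.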
But other approaches are available. In the 18th century, a church minister called Thomas Bayes worked out a way of calculating probability based on belief rather than frequency. His work is the foundation of Bayesian statistics. Modern Bayesian approaches use the computing power now available, together with machine learning techniques, to evaluate the probability of our models being correct more directly, without using null hypotheses, often by running thousands of simulations and using each iteration to fine-tune the model.
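As a rough illustration of the Bayesian way of asking the question, the sketch below estimates the probability that the conversion rate in an advertised region is higher than in a comparison region. It is deliberately simpler than the iterative model-fitting described above: it assumes a uniform Beta(1, 1) prior and draws directly from the resulting Beta posteriors. The visit and conversion counts are invented, and the function name is our own.

```python
import random


def prob_rate_a_exceeds_b(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_a > rate_b | data).

    Assumes a uniform Beta(1, 1) prior on each conversion rate, which
    gives a Beta(1 + conversions, 1 + non-conversions) posterior.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = 0
    for _ in range(draws):
        # One plausible value for each rate, drawn from its posterior.
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_a > rate_b:
            wins += 1
    return wins / draws


# Hypothetical figures: 10,000 visits in each region.
advertised_conversions, advertised_visits = 260, 10_000
comparison_conversions, comparison_visits = 200, 10_000

probability = prob_rate_a_exceeds_b(
    advertised_conversions, advertised_visits,
    comparison_conversions, comparison_visits,
)
print(f"P(advertised rate > comparison rate) is roughly {probability:.3f}")
```

The result is a direct statement about the thing we care about (how likely it is that one rate is higher), with no null hypothesis involved. As with any comparison between regions, it says nothing on its own about why the rates differ.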
These statistical approaches can help us separate the signal from the noise and work out whether differences are because of marketing or something else. And because they work on aggregate data rather than tracking individual customer journeys, they don’t impinge on people’s privacy. Privacy is an issue of increasing concern for people and governments, and working with aggregate data gives you some reassurance that your analysis won’t be banned by law or invalidated by changes in consumer behaviour.
Of course, statistical approaches won’t give you perfect answers that account for every link clicked and every dollar spent. But they will help you understand how your marketing affects user behaviour, and what other factors have an effect. By using these techniques, we can get closer than ever before to answering the perennial question: “Which half of my advertising spend is wasted?”