Predictive analytics: A blend of art and science?

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

I have just been reading a book called “Super Crunchers” by Ian Ayres. It’s an interesting book on how data mining and predictive analytics are becoming more widespread across all aspects of our societies and are increasingly shaping our lives. He cites a number of examples where these empirical approaches outperform human experts at predicting likely outcomes.

I particularly liked his story of an econometrician who was able to predict the quality of Bordeaux wine from a simple regression analysis of weather data. He could predict the expected quality of a particular vintage from just three variables: the amount of rainfall in the winter, the amount of rainfall during the harvest and the average temperature during the growing season. What was interesting for me was not the fact that he was able to make these predictions, but the accounts of the resistance and even hostility that he got from the “wine establishment” for his predictions. The wine experts of the time felt threatened and affronted by the idea that their “art” and “expertise” could be reduced to a simple equation.
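To make the idea of a "simple equation" concrete, here is a minimal sketch of a three-variable regression in plain Python. It is not the econometrician's actual model or data: the vintages, the units and the "true" weights are all invented for illustration, and the quality scores are generated from those weights so that the fit has something to recover.

```python
# Illustration only: every number below is invented for this sketch.
# Each row is (winter rainfall in cm, harvest rainfall in cm,
# average growing-season temperature in degrees C).
VINTAGES = [
    (60.0, 12.0, 17.1),
    (69.0, 8.0, 16.7),
    (58.0, 18.0, 15.4),
    (83.0, 4.0, 17.0),
    (71.0, 16.0, 16.2),
    (55.0, 10.0, 17.5),
    (64.0, 5.0, 16.0),
]

# Hypothetical relationship used to generate the quality scores:
# an intercept followed by one weight per variable above.
TRUE_COEFS = [-3.0, 0.02, -0.05, 0.6]


def predict(coefs, row):
    """Apply the equation: intercept plus a weighted sum of the inputs."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Generate synthetic scores, then recover the weights from the data alone.
scores = [predict(TRUE_COEFS, row) for row in VINTAGES]
coefs = fit_ols(VINTAGES, scores)
print([round(c, 4) for c in coefs])
```

Because the scores here contain no noise, the fit recovers the invented weights up to floating-point error. Real vintages would not be so tidy, and deciding which three variables to measure in the first place is not something the arithmetic can do for you.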

Ayres goes on to give a number of other examples in various industries where the growth of data and technology has allowed data mining and predictive analytical techniques to change the rules of the game, from baseball scouting to social policy development and medicine. Quite often in each of these fields there has been resistance to the rise of these techniques from the established experts, such as baseball scouts, policy makers, doctors and so on. They would not, or could not, accept that such empirical methods could be better than the expertise they had developed over years of training and experience. However, numerous studies cited by Ayres have shown that predictive analytics outperforms “experts” in the ability to predict an outcome correctly. That doesn’t mean that predictive techniques always get it right, just that they get it right more often than the experts.

In the digital marketing field Ayres uses the example of A/B and Multi-Variate Testing (MVT). The point he makes is that the volume of data and the technology now allow people to run repeated tests and trials to predict which version of each element on a page is most likely to be successful in driving the desired outcome. Those of you familiar with the MVT technologies will know that the marketing stance behind them is often that they eliminate the need for subjectivity in the design process. You just come up with some alternative versions and see which one works best. It’s the ultimate tool for overcoming the bias and subjectivity of the various stakeholders involved in site development. Who needs usability testing, right?

Ayres’s background is not as a statistician or an analyst but as a lawyer. You don’t immediately think of lawyers as being masters of the empirical universe, so why would a lawyer be an expert in number crunching? The interesting point is that being a lawyer can be similar to being an analyst. Often you are trying to prove or disprove a hypothesis, looking for the appropriate evidence to support your theory or disprove somebody else’s. For me, this exposes one of the fallacies about econometrics and predictive analytics: that it is purely a scientific discipline.

Predictive analytics is often as much about art as it is about science. To build a good model you need a good understanding of the way that the “system” you are trying to model works. More often than not, at the beginning of the model building process, there is some subjective opinion about which factors are likely to influence the thing that you are trying to predict. So where do these opinions come from? They usually come from the people who are knowledgeable or experts in that particular field. We sometimes call this “domain expertise”. If we take the example of the econometrician predicting the quality of wine, the econometrician was also a wine buff, so he had some previous knowledge about which factors could potentially affect the quality of a particular vintage. His skill was in quantifying it.

In the same way, some domain expertise is needed in the development of good tests. If we look at MVT, the technology can help you determine which page design is the best one to use. If you test 4 different versions of an element (say a call to action), then you will get a winner. That “winner” may be the one that you started out with, but it’s still the winner. That doesn’t mean it’s the best one, though; it’s just the best of the options that you looked at. There may be a much better option out there which you haven’t tested. Usability experts can potentially provide better insights into which versions are the best ones to test in the first place, and also help to explain why the results have come out the way that they have.
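The limitation described above shows up clearly in a minimal sketch of how a test picks its winner. The variant names and the visitor and conversion counts below are invented for illustration, and the two-proportion z-test is just one common way of comparing two variants, not necessarily what any particular MVT tool does.

```python
import math

# Invented counts for illustration: variant -> (visitors, conversions).
# "A" stands in for the version you started out with.
RESULTS = {
    "A": (10000, 300),
    "B": (10000, 345),
    "C": (10000, 290),
    "D": (10000, 310),
}


def conversion_rate(visitors, conversions):
    """Share of visitors who completed the desired action."""
    return conversions / visitors


def pick_winner(results):
    """Return the best variant among those tested, and only those."""
    return max(results, key=lambda name: conversion_rate(*results[name]))


def two_proportion_z(n1, x1, n2, x2):
    """Two-sided z-test for a difference between two conversion rates."""
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x2 / n2 - x1 / n1) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


winner = pick_winner(RESULTS)
z, p_value = two_proportion_z(*RESULTS["A"], *RESULTS[winner])
print(f"winner: {winner}, z = {z:.2f}, p = {p_value:.3f} versus A")
```

Note that `pick_winner` can only ever return a key that is already in `RESULTS`. A fifth design that nobody thought to include never gets a score, which is exactly where the choice of what to test matters. A single pairwise comparison like this also ignores the fact that several variants were tested at once, so treat the p-value as a rough guide.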

So we need the experts to help us build better models. That expertise may come from years of experience or knowledge gained from understanding the effectiveness of previous models. In either case, there’s room for both the science and the art.
