Predictive Analytics

Seeing… or not seeing

This post originally appeared on Applied Insights’ blog. Foviance acquired Applied Insights in November 2008, with Neil Mason joining us as Director of Analytical Consulting. As part of this acquisition, we’ve incorporated Applied Insights’ blog into our own.

When we think about how to evaluate a predictive model, the first thing we typically consider is how accurately it predicts against the (unseen) test data. More often than not, though, our business/research customers want more than that. They want to know how the algorithm arrived at its predictions, i.e. they want to understand the model.

The more transparent predictive methods don’t just predict; they also reveal the patterns that underlie their predictions. This has two main benefits:

  1. Subject Matter Experts (SMEs), typically on the business/research side, can assess the model’s validity by viewing these patterns, for example as rules or formulae. This way they can see if the inherent relationships make sense. Do they see any potential anomalies in the data that we didn’t pick up when we previously explored it?
  2. And of course the patterns themselves may reveal useful insights. We often find specific segments of interest: demographic groups who have a higher propensity to convert through a given channel, or re-purchasers who have short, but potentially interesting and valuable, buying cycles.

The bottom line is that when we can see what a model is doing we can glean much more from it than the likelihood that the outcome of interest (convert, attrite, default, etc.) will happen.

To be frank, most of our projects are like this. This is where Decision Tree methods often win out, because the output lets us visually explore the data, both to understand the model and to examine other potential patterns of interest. They may not necessarily give us the most accurate predictions, but often the SMEs care more about understanding than predicting. This is a classic trade-off in Predictive Analytics (PA).
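To make the idea of "seeing" a model concrete, here is a minimal sketch of the kind of readable rule a tree-style method produces. Everything in it is an assumption for illustration: the toy data, the field name `days_since_last_visit` and the helper names are hypothetical, and a single Gini-based split stands in for a full decision tree algorithm.

```python
# Hypothetical toy data: (days_since_last_visit, converted) pairs.
records = [(1, 1), (2, 1), (3, 1), (4, 0), (10, 0), (14, 0), (21, 0), (2, 1)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    best = None
    for threshold in sorted({value for value, _ in rows}):
        left = [label for value, label in rows if value <= threshold]
        right = [label for value, label in rows if value > threshold]
        if not left or not right:
            continue  # a split must leave records on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


def describe(rows, threshold):
    """Render the split as a rule an SME can read and challenge."""
    left = [label for value, label in rows if value <= threshold]
    right = [label for value, label in rows if value > threshold]
    return (
        f"IF days_since_last_visit <= {threshold} THEN conversion rate = "
        f"{sum(left) / len(left):.0%} (n={len(left)}) ELSE "
        f"{sum(right) / len(right):.0%} (n={len(right)})"
    )


threshold, impurity = best_split(records)
rule = describe(records, threshold)
print(rule)
```

The point is the output format rather than the algorithm: a rule with a condition, a rate and a segment size is something an SME can sanity-check, which an opaque score is not.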

There are exceptions to this. The alternative view is that accuracy is paramount, and it could be that the winning model is opaque. Neural Network models are a case in point. Depending on the software you are using, you might see a ranked list of fields which contribute to the prediction, along with the prediction itself and perhaps an associated confidence level. Even if the final network is displayed, it doesn’t necessarily explain much more.

For the most part these are the two typical scenarios. However, we are currently designing a third type, where opaqueness is the main objective (together with an acceptable level of predictive accuracy, of course). We’re talking to a government department that doesn’t want to send sensitive data out and doesn’t want our models to reveal any of that information either. So the gist of our approach is that we’ll develop black-box models on our data and let the department deploy them on its own database. They’ll give us addresses and predictive scores in return, but in so doing we won’t know why a particular address was selected.

Anyone living in the UK will understand the political backdrop to this as there have been various high profile cases of data going AWOL (here is the latest one). We are hoping that a somewhat unorthodox application of Predictive Analytics might help the UK government provide a valuable public service without further compromising the confidentiality of its citizens. There’s many a slip twixt the cup and the lip mind you - we’ll keep you posted…

An Introduction to Predictive Analytics, London, 22nd May 2008

This post originally appeared in Applied Insights’ events section. Foviance acquired Applied Insights in November 2008, with Neil Mason joining us as Director of Analytical Consulting. As part of this acquisition, we’ve incorporated Applied Insights’ events list into our own.

Applied Insights ran a one day workshop in Predictive Analytics in association with the Emetrics Marketing Optimisation summit on 22nd May at the Hotel Russell in London. A course outline is below.

Please contact us if you would be interested in joining one of our courses or developing a customised in-house training session on predictive analytics.

Predictive Analytics - course outline

An Introduction to Data Mining and Predictive Analytics is a one day workshop covering the foundations of this innovative marketing analytics discipline. During the course of the day you will gain a thorough familiarisation with some of the key principles and methodologies of data mining and predictive analytics and learn how to apply them to common marketing problems such as:

  • How can I predict campaign response?
  • How do I segment my website visitors or customers?
  • How can I anticipate possible customer defections?

In this one day interactive course we will cover the following topics:

Introduction:

  • What is data mining and how is that different to predictive analytics?
  • How organisations are currently using data mining and predictive analytics across their businesses and to solve particular marketing problems

Processes and implementation

  • How to go about a data mining/predictive analytics project
  • An overview of a standard industry process (CRISP-DM)

Methods and applications

  • An overview of the main types of data mining and predictive analytics applications:
    • Forecasting
    • Segmentation
    • Classification
  • An introduction to main methodologies such as:
    • Time-series forecasting
    • Regression analysis
    • Decision trees (CHAID, CART and so on)
    • Cluster analysis
    • Neural networks
  • Case studies and examples of how these techniques are used and deployed in both online and offline marketing in areas such as:
    • Retention modelling
    • Conversion propensity modelling
    • Visitor segmentation

Web Analytics Congress, Maarssen, The Netherlands, May 2008

At this year’s annual Web Analytics Congress in Holland, Neil delivered a keynote presentation on Marketing Optimisation and Predictive Analytics.

Emetrics Marketing Optimization Summit, San Francisco, May 2008

At this year’s Emetrics Summit in San Francisco, Neil will be presenting a session in the “Advanced Analytics Track” entitled ‘Cutting through the NOISE: Applications of data mining and predictive analytics’.

The presentation will be looking at the application of techniques such as segmentation and propensity modelling to better understand website visitor behaviour.

Internet Marketing Conference, Stockholm, November 2007

A return visit to Stockholm by Applied Insights this year. This time we’ll be giving a presentation at the Internet Marketing Conference on “Predictive Analytics - Why Bother?”. We’ve also been asked to be on the panel on the subject of Testing and Analysis and have been roped in to moderating a panel session on Web Analytics. Should be interesting…

Emetrics Marketing Optimization Summit, Washington DC, October 2007

At this year’s Emetrics Summit in Washington DC, Neil presented a paper entitled “Cutting through the NOISE: Applications of data mining and predictive analytics”.

Predictive analytics Part 2

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

In part one of this series, I examined visitor segmentation, a data-mining technique. Now, let’s look at how data mining can be used to understand important visitor behavior over time.

Quite often when we use Web analytics systems, we focus on what visitors do during a particular visit. The classic conversion funnel is a good example of this trend. Most Web analytic systems look at the conversion funnel in the context of a single visit; that is, they report on how people got to page A, then B, then C, and so on within a single visit. This information is useful because it helps identify potential process areas that need improvement. But if we think about those times when a visitor might make multiple visits to a site before a conversion, the classic conversion funnel might not give you a true perspective on what’s happening.

Take the example of buying car insurance online. In the U.K., it’s a very competitive business. Consumers typically shop around for quotes and go for the best value proposition. As a result, it’s very unlikely people will arrive on a site and buy car insurance on their first visit. Maybe they’ll arrive from a search engine, check out the proposition, and bookmark the site for future reference. Maybe later they’ll come back, get a quote, and leave to compare it to other quotes. Hopefully they’ll return to complete the policy application process, and a sale is made.

A generic conversion funnel analysis will contain an amalgam of all three types of behavior: research, quote, purchase. As a result, you’re not seeing a true reflection of your ability to convert opportunity into value unless you analyze visitor behavior over sequences of visits, rather than just within the single visit.
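A small sketch can show how far apart the two views can be. The visit log below is invented for illustration (the visitor ids and stage labels are assumptions, not data from a real site); it simply compares a conversion rate computed per visit with one computed per visitor across their sequence of visits.

```python
from collections import defaultdict

# Hypothetical visit log: (visitor_id, furthest stage reached in that visit).
visits = [
    ("a", "research"), ("a", "quote"), ("a", "purchase"),
    ("b", "research"), ("b", "quote"),
    ("c", "research"),
    ("d", "quote"), ("d", "purchase"),
]

# Visit-level view: every visit is treated as an independent funnel attempt.
visit_rate = sum(stage == "purchase" for _, stage in visits) / len(visits)

# Visitor-level view: group visits by visitor and ask whether the
# sequence as a whole ended in a purchase.
stages_by_visitor = defaultdict(list)
for visitor_id, stage in visits:
    stages_by_visitor[visitor_id].append(stage)

visitor_rate = sum(
    "purchase" in stages for stages in stages_by_visitor.values()
) / len(stages_by_visitor)

print(f"visit-level conversion:   {visit_rate:.0%}")
print(f"visitor-level conversion: {visitor_rate:.0%}")
```

In this toy log the research and quote visits drag the visit-level rate down, even though half of the visitors eventually buy.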

If you work with Web analytics data, you know it’s hard enough to understand what’s going on when examining a person’s behavior in a single visit. Analyzing behavior over multiple visits adds complexity. Here, data mining and predictive analytical techniques come into play.

If we accept (as in the car insurance example) that conversion is often a multivisit process, we must understand the process’s key drivers over time if we are to influence that visitor’s behavior. We must find out what behaviors over multiple visits are most likely to lead to a successful outcome.

Using a decision-tree technique like CHAID can help you understand how different visitor behaviors over multiple visits may increase or decrease the likelihood of converting a browser into a buyer. CHAID, which is highly visual, shows factors that influence conversion in a tree diagram in the order they influence people.
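As a rough illustration of the idea behind CHAID, the sketch below ranks two candidate predictors by their chi-square association with conversion. It is deliberately partial: full CHAID also merges predictor categories and adjusts significance levels before choosing a split, and the data and field names here (`returned_within_week`, `entry_channel`) are hypothetical.

```python
from collections import Counter

# Hypothetical visitor-level rows:
# (returned_within_week, entry_channel, converted)
rows = [
    ("yes", "search", 1), ("yes", "email", 1),
    ("yes", "search", 1), ("yes", "email", 0),
    ("no", "search", 0), ("no", "email", 0),
    ("no", "search", 0), ("no", "email", 1),
]


def chi_square(rows, column):
    """Pearson chi-square statistic for predictor `column` versus the outcome."""
    observed = Counter((row[column], row[-1]) for row in rows)
    level_totals = Counter(row[column] for row in rows)
    outcome_totals = Counter(row[-1] for row in rows)
    n = len(rows)
    statistic = 0.0
    for level, level_n in level_totals.items():
        for outcome, outcome_n in outcome_totals.items():
            expected = level_n * outcome_n / n
            statistic += (observed[(level, outcome)] - expected) ** 2 / expected
    return statistic


# Rank predictors from strongest to weakest association with conversion.
predictors = {"returned_within_week": 0, "entry_channel": 1}
ranking = sorted(
    ((chi_square(rows, index), name) for name, index in predictors.items()),
    reverse=True,
)
for statistic, name in ranking:
    print(f"{name}: chi-square = {statistic:.2f}")
```

The predictor at the top of the ranking is the one a tree would split on first, which is why the resulting diagram reads as an ordering of influence.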

As with the segmentation approach described in part one, data must be in the right shape before an analysis is started. That requires extracting and summarizing data to key activities and events in each visit of the visitor lifecycle. I often think that data mining and predictive analytics are part art, part science. The art requires possessing the right data in the right format for algorithms to provide meaningful and useful results. In these days of automated analytics, anyone can produce a model. It’s a question of whether the model is good or not.

In working with these techniques, we commonly find there are a small number of highly influential conversion drivers over multiple visits. Naturally those drivers vary from site to site, but the importance of time is usually one thing they have in common. The time between the first and second visit, the second and third visit, and so on, is quite often a good predictor of the subsequent outcome.
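Turning visit dates into "time between visits" attributes is straightforward. The following is only a sketch with made-up visitor ids and dates; the helper name `gaps_in_days` is an assumption, and it presumes each visitor's dates are already in chronological order.

```python
from datetime import date

# Hypothetical visit dates per visitor, already sorted chronologically.
visit_dates = {
    "a": [date(2008, 5, 1), date(2008, 5, 2), date(2008, 5, 4)],
    "b": [date(2008, 5, 1), date(2008, 5, 20)],
    "c": [date(2008, 5, 3)],
}


def gaps_in_days(dates):
    """Days between consecutive visits: first to second, second to third, ..."""
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]


gap_features = {
    visitor: gaps_in_days(dates) for visitor, dates in visit_dates.items()
}
for visitor, gaps in gap_features.items():
    print(visitor, gaps)
```

Each list can then be fed to a modeling technique as candidate predictors, with single-visit visitors (an empty list) handled as their own case.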

As the need to tune the online marketing processes continues, organizations must add capabilities to their analytics tool kit. Data-mining and predictive analytical techniques are firmly established within other marketing disciplines. Perhaps their time is now coming in the online world.

Predictive Analytics Part 1

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

In my last article I outlined my belief that what we call ‘web analytics’ is becoming a more diverse and complex field. What we have traditionally considered to be web analytics has been the analysis of site behavioural data captured, processed and reported on by a proprietary system designed to do just that. But as the online channel evolves and becomes more complex, the tools to help us understand what’s happening must also evolve and become more complex. In some areas, such as social media, this may mean the development of new tools. In other areas it may mean the application of old tools to this new channel.

One of the areas that we work in a great deal is the use of data mining and predictive analytical techniques. I first got started in this area about 15 years ago at ACNielsen, using these types of methodologies to help clients figure out which half of their advertising money they were wasting. I have a book on my bookshelf that was published 25 years ago on the use of model building techniques in marketing. So the techniques aren’t new, but what is relatively new is the systematic use of these techniques in the online marketing space.

I think that there are some reasons for this. Historically our main concern has been managing the vast volumes of data and wrestling out of the web analytics systems a few numbers that told us how well we were doing and that we could do something about. Also, in the past, the natural organic growth in the channel has meant that we have not been faced with the need to scramble for market share and to fully optimise our business processes. And to some extent, we have not been asking the right questions. This is now changing. We understand our few numbers and we want to know more. The online world is far more competitive and we are beginning to ask questions that go beyond the limits of our traditional analytical tool set. Questions like:

  • How do I understand the effects different marketing channels have on generating sales?
  • What does the purchase lifecycle look like over multiple visits and how can I optimise it?
  • How should I be segmenting my audience or customers, to improve the effectiveness of my marketing activity?

To answer these types of questions we are going to have to start to organise the data in different ways and we need to bring in some different tools. First of all we need to integrate our data so that we can see different aspects of the acquisition, conversion and retention processes in one place. Secondly we need to aggregate our data so that it focuses on the visitor or customer rather than the click or the visit. Thirdly we need to cut through the noise in the data using more sophisticated analytical techniques to get at the key insights. Let me give you an example of what I mean.

We all know that different types of people come to our websites for different reasons and to do different things. If I treat everyone the same, I am being sub-optimal in my decision making about how I allocate marketing funds and about how I manage the user experience. I need to segment my audience so that I can market to these different groups more effectively. However, I can’t do that on the basis of how they behave on the website alone. I also need to understand their demographics, their intentions, their aspirations and their opinions. So I need to integrate my hard core behavioural data with profiling and attitudinal data drawn from other data sources like surveys.

Next, I am interested in the behaviour of visitors over multiple visits rather than what they do in a single visit. So I need to aggregate the data so that I have a record of the behaviour of different visitors over a period of time. I will probably also need to summarise the data and create additional attributes which describe aspects of that behaviour over time, such as number of visits made, number of conversion events, types of conversion events and so on.
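The aggregation step might look something like the sketch below. The visit rows, the column layout and the `summarise` helper are all hypothetical; a real project would read from the web analytics export and probably derive many more attributes.

```python
from collections import Counter, defaultdict

# Hypothetical visit-level rows: (visitor_id, visit_date, conversion_event or None).
visit_rows = [
    ("a", "2008-05-01", None),
    ("a", "2008-05-03", "quote"),
    ("a", "2008-05-07", "purchase"),
    ("b", "2008-05-02", None),
    ("b", "2008-05-09", "quote"),
]


def summarise(rows):
    """Roll visit-level rows up to one record per visitor."""
    grouped = defaultdict(list)
    for visitor_id, visit_date, event in rows:
        grouped[visitor_id].append((visit_date, event))

    summary = {}
    for visitor_id, visits in grouped.items():
        events = [event for _, event in visits if event is not None]
        summary[visitor_id] = {
            "visits": len(visits),
            # ISO-formatted date strings sort chronologically.
            "first_visit": min(visit_date for visit_date, _ in visits),
            "last_visit": max(visit_date for visit_date, _ in visits),
            "conversion_events": len(events),
            "event_types": dict(Counter(events)),
        }
    return summary


visitor_table = summarise(visit_rows)
for visitor_id, attributes in visitor_table.items():
    print(visitor_id, attributes)
```

The result is one row per visitor rather than one row per visit, which is the shape the analysis that follows needs.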

Finally, I need to analyse the data to identify interesting and meaningful segments of visitors. In all likelihood I will have quite a large and noisy dataset where I won’t be able to see the forest for the trees. Traditional querying and reporting techniques are unlikely to be an effective method of identifying the patterns, so I need to use something that will find the patterns in the data for me. In this case I decide to use cluster analysis. The cluster analysis process looks for groups of visitors in the data, where the people within a group have something in common but what they have in common differs from group to group. What I have to do then is interpret the output to understand what the visitor segments have been clustered on and decide whether these are meaningful and useful segments that I can do something with. This process may yield some surprising results and enable me to think about the audience in a way that I had not thought of before. I may find patterns and relationships in the data that I would never have found using traditional analysis techniques.
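Cluster analysis covers a family of methods, and the article does not say which one was used. As one common example, here is a minimal k-means sketch on invented visitor attributes (number of visits, number of conversion events). The data, the fixed starting centres and the function names are assumptions for illustration only.

```python
import math

# Hypothetical visitor-level features: (number of visits, conversion events).
points = [(1, 0), (2, 0), (1, 1), (8, 3), (9, 4), (10, 3)]


def assign(points, centres):
    """Index of the nearest centre for each point."""
    return [
        min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))
        for point in points
    ]


def update(points, labels, centres):
    """Move each centre to the mean of its members (keep it if it has none)."""
    new_centres = []
    for i, centre in enumerate(centres):
        members = [p for p, label in zip(points, labels) if label == i]
        if members:
            centre = tuple(sum(axis) / len(members) for axis in zip(*members))
        new_centres.append(centre)
    return new_centres


def k_means(points, centres, max_iterations=20):
    """Alternate assignment and update steps until the centres stop moving."""
    for _ in range(max_iterations):
        labels = assign(points, centres)
        new_centres = update(points, labels, centres)
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Seeding with the first and last point keeps the sketch deterministic;
# real tools typically use random or smarter seeding.
labels, centres = k_means(points, [points[0], points[-1]])
print(labels)
print(centres)
```

The algorithm only returns group memberships and centres. Deciding that one group is, say, "occasional browsers" and the other "frequent buyers" is the interpretation step described above, and in practice the attributes would usually be put on a comparable scale first.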

So using data mining and predictive analytical techniques will allow organisations to unlock more value from their data, but it requires a different approach to managing your data, different tools and different skills. Next time I will look at another application of data mining and predictive analytics: understanding which factors affect someone’s propensity to buy something during the purchase lifecycle.

Till then…
