Segmentation

Predictive Analytics Part 1

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

In my last article I outlined my belief that what we call ‘web analytics’ is becoming a more diverse and complex field. What we have traditionally considered to be web analytics has been the analysis of site behavioural data captured, processed and reported on by a proprietary system designed to do just that. But as the online channel evolves and becomes more complex, the tools to help us understand what’s happening must also evolve and become more complex. In some areas, such as social media, this may mean the development of new tools. In other areas it may mean the application of old tools to this new channel.

One of the areas that we work in a great deal is the use of data mining and predictive analytical techniques. I first got started in this area about 15 years ago at ACNielsen, using these types of methodologies to help clients figure out which half of their advertising money they were wasting. I have a book on my bookshelf that was published 25 years ago on the use of model building techniques in marketing. So the techniques aren’t new, but what is relatively new is the systematic use of these techniques in the online marketing space.

I think that there are some reasons for this. Historically our main concern has been managing the vast volumes of data and wrestling out of the web analytics systems a few numbers that told us how well we were doing and that we could do something about. Also, in the past, the natural organic growth in the channel has meant that we have not been faced with the need to scramble for market share and to fully optimise our business processes. And to some extent, we have not been asking the right questions. This is now changing. We understand our few numbers and we want to know more. The online world is far more competitive and we are beginning to ask questions that go beyond the limits of our traditional analytical tool set. Questions like:

  • How do I understand the effects different marketing channels have on generating sales?
  • What does the purchase lifecycle look like over multiple visits and how can I optimise it?
  • How should I be segmenting my audience or customers, to improve the effectiveness of my marketing activity?

To answer these types of questions we are going to have to start to organise the data in different ways and we need to bring in some different tools. First of all, we need to integrate our data so that we can see different aspects of the acquisition, conversion and retention processes in one place. Secondly, we need to aggregate our data so that it focuses on the visitor or customer rather than the click or the visit. Thirdly, we need to cut through the noise in the data using more sophisticated analytical techniques to get at the key insights. Let me give you an example of what I mean.

We all know that different types of people come to our websites for different reasons and to do different things. If I treat everyone the same, I am being sub-optimal in my decision making about how I allocate marketing funds and about how I manage the user experience. I need to segment my audience so that I can market to these different groups more effectively. However, I can’t do that on the basis of how they behave on the website alone; I also need to understand their demographics, their intentions, their aspirations and their opinions. So I need to integrate my hard core behavioural data with profiling and attitudinal data drawn from other data sources like surveys.

Next, I am interested in the behaviour of visitors over multiple visits rather than what they do in a single visit. So I need to aggregate the data so that I have a record of the behaviour of different visitors over a period of time. I will probably also need to summarise the data and create additional attributes which describe aspects of that behaviour over time, such as number of visits made, number of conversion events, types of conversion events and so on.
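To make the aggregation step concrete, here is a minimal sketch in Python. The record layout, the event names and the `summarise_visitors` helper are all assumptions made for illustration; real click data will have many more fields and will usually live in a database rather than a list.

```python
from collections import defaultdict

# Hypothetical click-level records: (visitor_id, visit_id, event)
clicks = [
    ("v1", "s1", "view"),
    ("v1", "s1", "purchase"),
    ("v1", "s2", "view"),
    ("v2", "s3", "view"),
    ("v2", "s3", "signup"),
]

CONVERSION_EVENTS = {"purchase", "signup"}  # assumed conversion types


def summarise_visitors(rows):
    """Roll click-level rows up to one summary record per visitor."""
    visits = defaultdict(set)
    conversions = defaultdict(list)
    for visitor_id, visit_id, event in rows:
        visits[visitor_id].add(visit_id)
        if event in CONVERSION_EVENTS:
            conversions[visitor_id].append(event)
    return {
        visitor_id: {
            "visits": len(visit_ids),
            "conversions": len(conversions[visitor_id]),
            "conversion_types": sorted(set(conversions[visitor_id])),
        }
        for visitor_id, visit_ids in visits.items()
    }


visitor_table = summarise_visitors(clicks)
```

Each entry in `visitor_table` is now one row per visitor with derived attributes, which is the shape the later segmentation work needs.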

Finally, I need to analyse the data to identify interesting and meaningful segments of visitors. In all likelihood I will have quite a large and noisy dataset where I won’t be able to see the forest for all the trees. Traditional querying and reporting techniques are unlikely to be an effective method of identifying the patterns; I need to use something that will find the patterns in the data for me. In this case I decide to use cluster analysis. The cluster analysis process looks for groups of visitors in the data, where the people within the groups have something in common but what they have in common is different from group to group. What I have to do then is interpret that data to understand what it is the visitor segments have been clustered on and decide whether these are meaningful and useful segments that I can do something with. This process may yield some surprising results and enable me to think about the audience in a way that I had not thought of before. I may find patterns and relationships in the data that I would never have found using traditional analysis techniques.

So using data mining and predictive analytical techniques will allow organisations to unlock more value from their data, but it requires a different approach to managing your data, different tools and different skills. Next time I will look at another application of data mining and predictive analytics: understanding the important factors that affect someone’s propensity to buy something during the purchase lifecycle.

Till then…

Customer loyalty management

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

Last time in this series I looked at a number of different ways you might think about and measure customer loyalty. My view was that it’s not realistic to think about and measure customer loyalty as if it is a single entity; it is better to create a loyalty measurement dashboard consisting of a number of appropriate and relevant indicators. These indicators might be behavioural, attitudinal or financial. To do this you will need to look at a number of different data sources such as your web analytics data, surveys and other customer feedback data and any market or context data that may be available.

Following on from the tricky issue of looking to measure customer loyalty comes the issue of what to do about it. If you can look at the different aspects of customer loyalty through different metrics, then the question is: what do you do with this information? How do you act on it in a way that positively impacts on customers’ loyalty? How can you accelerate the building of loyalty when it’s in its ascendancy and how can you manage it when it’s beginning to decline?

On my customer loyalty dashboard I’m going to have a mixture of metrics. Some of them are going to be more strategic in nature, potentially even Key Performance Indicators (for example, a customer satisfaction index) and some of them are going to be more operational or tactical (such as recency or frequency measures). The strategic measures are going to be telling me how I am doing over the longer haul and the tactical measures are telling me what I need to do in the shorter term. The tactical measures are more likely to be behavioural metrics as, generally speaking, it’s easier to observe, react to and influence customer behaviour than customer attitudes.

RFM (Recency, Frequency, Monetary Value) analysis is classically used to manage retention programmes. Customers are segmented according to how recently they have transacted, how frequently they have transacted and their value to the business. These segments can form the basis of differentiated retention and communication programmes depending on which segment the customer sits in. Customers who are in the top segment for recency, frequency and monetary value display loyal behaviour and are the ones that you don’t want to lose, and will probably deserve some special treatment.

A particular case of the RFM approach, I think, is the new customer, i.e. the customer who has just transacted for the first time. They’re a special case. It’s possible or even probable that you may not have made any money on them; you need to get them to transact again before you start to recoup your marketing costs. They are also at the steepest point on the “friction curve”, which is the amount of effort required to get them to transact again. Retention is like momentum: once you get them started it’s easier to keep them going. In the case of the new customer, if you can get them to transact again, then they are more likely to transact a third time, and then a fourth and so on. So, customer retention, like conversion, is not one process but a series of mini-events designed to move a customer from one state to the next.

The key advantage of RFM is its simplicity. It’s easy to do the analysis, create the segments and put together some specific customer communication. However, there are a couple of issues with it in my opinion. First of all, it assumes that people who behave the same on these dimensions will respond the same to particular communications. On its own it doesn’t help with the crafting of the retention marketing message. If you think of a multi-category retailer for example, different types of people will be buying different types of products. They may have similar shopping profiles but be interested in completely different things. So, as well as knowing when to intervene, it’s also important to know how to intervene – what’s the trigger going to be?

The other issue is around recency. If you have a regular interaction in some way with your customers then by the time that you notice they’ve not been around for a while it may be too late. By the time they cancel the service, or stop visiting the site or whatever it is that means that they have stopped doing business with you, they could already be a lost cause. They might have stopped being attitudinally loyal some time earlier but it has taken time to get to the point of being behaviourally disloyal.

So, we need to be able to anticipate changes in customer loyalty rather than just react to them. In many cases, customers can give off signals or clues that their loyalty is shifting for the worse. They may change their patterns of behaviour, they may start calling customer services more often, and they may stop returning your calls. These are all indicators that changes are happening.

The role of predictive analytics in customer retention marketing is to give the marketer a heads up warning that something might be up with a customer. Predictive models look to identify customers who may be at risk based on the changes in other data. As with all predictive models, they will never be 100% accurate, but if they are good enough they can at least reduce the risk of customers taking their business elsewhere. The inputs that go into these models will of course be specific to the individual business and the data that is available.

So, as markets become more competitive and retention becomes a more important facet of the digital marketer’s job description, it’s time to start thinking about customer loyalty seriously. What does loyalty mean in your business? Does it mean anything at all? If it does, how are you going to know if you’ve got it? What are the relevant measures? How can you impact those measures positively?

Lots of questions but they’re not necessarily difficult ones. The key thing, I believe, is to think them through carefully and build your customer loyalty dashboard accordingly. As the saying goes “Be careful what you measure, because what you measure is what you will get”.

How to do Predictive Analytics – Part 4

This post originally appeared on Applied Insights’ blog. Foviance acquired Applied Insights in November 2008, with Neil Mason joining us as Director of Analytical Consulting. As part of this acquisition, we’ve incorporated Applied Insights’ blog into our own.

Step 3 – Data Preparation

Anyone who has ever analysed data knows what a nuisance it can be. Whenever we want to analyse it in a new style we often have to manipulate it in some way before we can do so. The more “raw” the data, or the more fundamentally different the analysis, the more work we typically have to do to get it into the shape we need for the analysis we want to perform.

As I mentioned in an earlier blog, if the primary data source is a data warehouse which contains well structured, rigorously cleaned and de-duped data, then this is usually the best starting point. But it is only that. The shape of the data tables in the warehouse will inevitably have been defined with a certain type of analysis in mind, most often to produce the standard business intelligence style of reports. You might get lucky and find you can use that data as-is for the kinds of predictive analysis you have in mind. The chances are that you won’t, and that you will have to re-structure the data in preparation for that analysis.

Furthermore it may well contain aggregated data, perhaps an OLAP structure of some sort, which may allow you to produce time series forecasts but which will most likely contain data which is too summarised for most other kinds of predictive analysis. If this is the case then you’ll probably need to go back and locate the sources of the summary data. That might not be a trivial exercise.

How did we get here?

In the previous steps, discussed in other blog entries, we effectively designed the analyses which we intend to perform at the next step: Data Modelling. In Data Understanding we learnt all about the existing data structures, formats and sources and we started to look for patterns in those sources which are pertinent to the analytical objectives we defined at the start of the process. The truth is that, to perform the exploration, we would have had to prepare the data to some extent. But this is the point where we get serious and apply the necessary data management steps to get the data into the shape(s) required for the main task: predictive modelling.

At the top level this means we end up doing one or more of the following:

  • Cleaning data
    This may not be necessary depending on how “clean” the original source is (though it is not unusual to find data problems when we start to analyse it in an unfamiliar way). Our previous exploration should have revealed any errors, or inconsistencies, which need to be corrected, or excluded.
  • Merging data from multiple sources
    If you are lucky the data will be in a single data file, or a single table in a database. If you are unlucky it will be in a variety of disparate sources with different formats in various locations.
  • Shaping it for the analysis
    Often the most time consuming element. A classic example is where we have data with a sequence to it; typical if we are looking to predict the likelihood of an event given a set of previous events. The starting point is typically data in a database which often contains all event transactions. In order to model it in a way which mimics how we will look to apply (deploy) the model we have to define an appropriate point in history as the baseline, e.g. if we are interested to know what will happen after March 2007 we might use March 2006 as that anchor point. We then have to restructure the incoming data to derive all the interesting predictors, e.g. transaction frequency, transaction value in previous months, years, etc., from March 2006 backwards. We also need to have a separate data partition which contains the “what happened next” data for a period after March 2006 that corresponds to the period we want to predict into in 2007; so if we are interested to see which customers are likely to churn in April 2007, then April 2006 is likely to be the best month to look at in 2006. NB. Modelling and Evaluation (see later) will help test that hypothesis.
  • Deriving new data elements
    Typically new fields (variables). In our exploration, for example, we may have found that there appears to be a strong relationship between the rate at which a customer buys products and the likelihood that they will churn. In many cases that rate will not exist as a separate measure in the current data, so we create it in this step.
  • Describing it
    Labelling, formatting and generally documenting the data in a way which helps the analyst, or other viewers of the data, to understand its meaning.
    The outcome of the above is a set of tables, or data files, which are in the shape we believe we need for the modelling effort we have in mind.
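
The anchor-point idea in the shaping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction layout, the `build_modelling_table` helper and the churn definition (no transaction in the outcome month) are hypothetical, not a description of any particular project.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, transaction_date, value)
transactions = [
    ("c1", date(2005, 11, 3), 40.0),
    ("c1", date(2006, 2, 14), 25.0),
    ("c1", date(2006, 4, 9), 30.0),
    ("c2", date(2005, 12, 20), 80.0),
    ("c2", date(2006, 1, 5), 15.0),
]

ANCHOR = date(2006, 3, 31)        # end of the baseline month (March 2006)
OUTCOME_START = date(2006, 4, 1)  # start of the "what happened next" window
OUTCOME_END = date(2006, 4, 30)   # end of the "what happened next" window


def build_modelling_table(rows):
    """One row per customer: predictors up to the anchor, outcome after it."""
    table = {}
    for customer_id, when, value in rows:
        record = table.setdefault(
            customer_id,
            {"frequency": 0, "total_value": 0.0, "active_in_outcome": False},
        )
        if when <= ANCHOR:
            # Predictors only ever use data from the anchor backwards.
            record["frequency"] += 1
            record["total_value"] += value
        elif OUTCOME_START <= when <= OUTCOME_END:
            record["active_in_outcome"] = True
    for record in table.values():
        # Assumed churn definition: no transaction in the outcome window.
        record["churned"] = not record["active_in_outcome"]
    return table


modelling_table = build_modelling_table(transactions)
```

Keeping the predictors and the outcome on opposite sides of the anchor date is what stops information from the period being predicted leaking into the model inputs.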

You [almost always] never get it right first time

We’ve mentioned it before but it is worth re-stating that much of the CRISP process is iterative. Quite often we will get into the modelling step, for example, and discover a potential relationship that looks interesting but which we have to go back to the preparation process to derive. Frequently, because we are often building complex data handling processes from scratch, we just make mistakes which need to be corrected.

With large datasets the preparation time can be significant: it can take hours, sometimes days, so mistakes and re-runs can be costly. Hence wherever possible it is a good idea to test the process using data subsets, ideally random, or at least representative, samples. Samples can also be used to boost productivity when we get into the analysis – more on that next time.
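
As a rough sketch of drawing such a test subset, the Python below takes a reproducible simple random sample. The `sample_visitors` helper, the visitor id format and the 5% fraction are assumptions for illustration; sampling whole visitors rather than individual clicks is a design choice so that each sampled visitor keeps a complete history.

```python
import random


def sample_visitors(visitor_ids, fraction, seed=42):
    """Return a reproducible simple random sample of visitor ids."""
    unique_ids = sorted(set(visitor_ids))
    if not unique_ids:
        return []
    k = max(1, round(len(unique_ids) * fraction))
    # A fixed seed means a re-run of the pipeline sees the same sample.
    return random.Random(seed).sample(unique_ids, k)


all_ids = [f"v{i}" for i in range(1000)]  # hypothetical visitor ids
test_ids = sample_visitors(all_ids, fraction=0.05)
```

The preparation process can then be run against only the records belonging to `test_ids` before committing to a full run.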

An example

Data collected in the web channel is a great illustration of this point. We work with a lot of this kind of data, typically for web sites with large numbers of visitors, usually millions per week. These sites inevitably have a web analytics tool which they use to analyse key metrics of site performance. Most often we are interested in applying predictive and/or segmentation methods to the site data. This typically involves:

  • Extracting behavioural data from the data warehouse (underlying the web analytics tool) or via a data feed that the analytics vendor provides. More often than not we extract this data to a number of text format files.
  • For our Customer Journey Framework we usually have an additional data source in the form of on-line surveys. Depending on the analytics tool that the client is using we have developed a number of ways of linking the data that the visitor provides as a respondent in the survey to the behavioural data which maps that visitor’s journey through the site.
  • The data we end up with can be at various levels but more often than not it is at the individual page or individual click level (remember these sites have millions of visitors so the number of records gets multiplied up). We take this data and aggregate it over a period of time to end up with tables for analysis which are at the visit and/or visitor level. Each of the resulting records will contain fields of interest; e.g. site content viewed, visit intentions and conversion goals which we will use for analysis.

For a typical site, processing a week’s worth of data into the shape needed for analysis can take 4-6 hours.

Which tools?

As is often the case the choice of tool for data management comes down to those that the analyst/data is familiar with. Database tools are all about this type of work and often the best approach is to aim to construct data mining tables inside a relational database. This can be achieved using a combination of SQL, ETL tools and other database utilities.

Generally speaking, the more sophisticated the predictive tool itself, the greater the data management capabilities which are built in. So SPSS, SPSS Clementine, SAS and SAS Enterprise Miner offer a broad range of data handling procedures.

So much for progress

Even though we have more and better tools, and faster hardware, with which to manipulate data these days, this is offset by the increasing volume of data, complexity of structures and number of sources. Hence the old adage that data management consumes more of a data analysis effort than the analysis itself typically holds as much today as it ever did. But it is a necessary pain to get us to the next step, which is at the core of the predictive process: Data Modelling.

The Analyst’s Toolbox: Segmentation (2)

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

In my last article I started to take a look at the subject of segmentation, something that is all the rage at the moment in the wacky world of web analytics. I outlined two key considerations in segmentation analyses: the approach you use to segment and the basis on which you segment. The approaches you use to segment may be deterministic or discovery based and I looked at those in the last article. This time we’re going to take a look at the basis on which you might segment your website visitors or customers.

There are three ways that you might choose to segment your users or customers. You might segment them on the basis of their demographics, their behaviour or their attitudes. Attitudinal segmentation is where people are segmented according to what they think about the brand or related issues. Often these attitudinal segments are developed from market research techniques and can be useful for brand development work. However, attitudinal segments are difficult to apply directly back into outbound marketing programmes. How do you recognise a “brand advocate” when they arrive on the site or from what they download or buy?

Demographics such as gender, age, household composition, income and the like may be a useful way to segment customers and visitors. Other lifestyle or geo-demographic classifications such as Mosaic in the UK may add a richer dimension to standard demographic segmentation. Demographic segments may be useful for understanding the difference in browsing or shopping behaviour between men and women or between the young and the old. For business to business activities, demographic segmentation translates into business sector classification. This could be through industry standard definitions such as SIC codes or your own customer definitions such as SMEs, strategic accounts etc.

Whilst demographic segmentation can be interesting and sometimes quite useful, it does assume similarities in underlying behaviour within the different segments. Do all men aged 18 to 34 think and behave the same when it comes to buying/using products and services? Probably not.

We often find that one of the most useful approaches to segmenting users and customers is to look at their behaviour rather than just who they are. Behaviour can be easily observed either in your web analytics tool or in your customer database and those observations can then be used as the basis of outbound marketing programmes.

One of the classic behavioural segmentation approaches is Recency, Frequency, Monetary (RFM) analysis. RFM analysis is a deterministic approach where customers are divided into segments on the basis of how recently they transacted, how frequently they have transacted in the past and what the value of those transactions has been. Typically there are up to 5 segments ranked from 1 (low) to 5 (high) on each dimension, giving 125 segments in total. The top segment (number 5 on each dimension) is your most valuable customers: they transact a lot, they spend a lot and they have done it recently. They are the ones you don’t want to lose. For more information on RFM analysis you should check out Jim Novo’s site at www.jimnovo.com.
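
The scoring described above can be sketched in Python as follows. The customer data, the `quintile_scores` and `rfm_segments` helpers and the rank-based way of cutting each dimension into five bands are assumptions for illustration; in practice the band boundaries are often set by business rules, and tied values need a more careful treatment than the simple ordering used here.

```python
from datetime import date

# Hypothetical per-customer summary: (customer_id, last_purchase, orders, spend)
customers = [
    ("c1", date(2008, 6, 1), 12, 900.0),
    ("c2", date(2008, 1, 15), 3, 120.0),
    ("c3", date(2007, 9, 30), 1, 40.0),
    ("c4", date(2008, 5, 20), 8, 1000.0),
    ("c5", date(2008, 3, 2), 5, 300.0),
]


def quintile_scores(values):
    """Score each value 1 (low) to 5 (high) by its rank position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, index in enumerate(order):
        scores[index] = rank * 5 // len(values) + 1
    return scores


def rfm_segments(rows):
    """Return {customer_id: (R, F, M)} with each score from 1 to 5."""
    # A later last-purchase date means a more recent customer, so a higher R.
    recency = quintile_scores([row[1].toordinal() for row in rows])
    frequency = quintile_scores([row[2] for row in rows])
    monetary = quintile_scores([row[3] for row in rows])
    return {
        row[0]: (recency[i], frequency[i], monetary[i])
        for i, row in enumerate(rows)
    }


segments = rfm_segments(customers)
```

A customer scoring `(5, 5, 5)` would sit in the top segment on all three dimensions; with three scores of 1 to 5 there are 5 × 5 × 5 = 125 possible combinations.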

Whilst RFM analysis allows you to segment your customers based on their transactional behaviour in aggregate, it doesn’t give a perspective on what people are buying, downloading, reading and so on. Other segmentation approaches are based upon an analysis of what it is that people are buying over time and look for commonalities and patterns in this behaviour. Are there groups of people that tend to buy the same sorts of products?

Discovery based techniques such as cluster analysis (which I discussed last time) are used for this type of segmentation approach. The purchasing behaviour of the individual customers is run through the algorithms to create distinct segments of customers with similar purchasing profiles. The segments are then profiled to understand what those commonalities are in the purchasing behaviour. These behavioural segments should then also be profiled with other data such as demographic and attitudinal data if possible. This additional data may come from the database, lifestyle profiling data or from survey work.

These purchasing segments can be used to improve the effectiveness of direct marketing programmes by adding insight into the type of message that might be relevant for each individual. We had an example of how it can add to a classic RFM approach in a project we worked on with a retail client. The client was using shopping behaviour as a way of segmenting their customer base using measures such as average order value, number of orders in the last year and so on. Using a product purchasing approach we identified two segments with very similar shopping behaviour but who were buying completely different products. Further profiling work showed they also had very different demographic profiles. One segment was older men and the other was younger females. The opportunity was now there to target and communicate to these two different groups in a much more relevant way.

So, segmentation encompasses a wide range of analytical approaches and techniques from the simple to the more complex. The trick is to start gently and build up your understanding of your customers by gradually breaking them down into meaningful and actionable segments, giving a sharper edge to your marketing communications.

The Analyst’s Toolbox: Segmentation (1)

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

Segmentation. There’s a word. It’s a word that quite often means different things to different people and it’s all the rage in web analytics. Everybody is doing segmentation; all the web analytic tools are offering segmentation. But what is it, what does it mean and how can it be used? In simplest terms segmentation is the process of dividing a group into sub-groups. The idea in marketing segmentation is that there are some meaningful differences between the sub-groups which can be useful for marketing purposes.

I think that there are probably two main things to think about when doing segmentation: the approach you use to segment and the basis upon which you segment. There are two main approaches to segmentation I believe:

  • Deterministic approaches
  • Discovery approaches

The basis on which you segment might be along the lines of:

  • Demographics and lifestyle
  • Behaviour
  • Attitudes

Deterministic approaches are where you create your segments based on some pre-defined or pre-determined classification. It might be a relatively simple classification like “Male vs Female” or it may be more complex like “First time visitors with abandoned shopping carts containing yellow socks”. With deterministic approaches you have some hypothesis that the segment is interesting, important or valuable and you may then test that hypothesis. Most web analytic tools now offer what I call this deterministic approach to segmentation. They offer the ability (to varying degrees) to divide or extract visitors into different groups and run reports comparing different groups against each other. In addition you may be able to extract email lists and other details from the segments for outbound marketing purposes.

The ability to segment and analyse different sub-groups of the visitor base is increasingly important. You can’t continue to run the site as a “one size fits all” business. Deterministic approaches are useful to try and identify some meaningful differences or to understand underlying behaviour in more detail. However, you have to go hunting and you may not always go hunting in the right direction. This is where discovery based techniques can come into play.

By discovery based techniques I mean statistical and other data mining techniques such as cluster analysis and neural networks. Having spent some time in the past in the market research industry, I often think of the use of these techniques when talking about segmentation. Cluster analysis is a statistical technique that segments the population into sub-groups that display some commonality. There are many different cluster analysis algorithms that vary in their application and complexity. The overall objective of cluster analysis though remains the same: to maximise the similarity of the members within each of the sub-groups and to also maximise the differences between the sub-groups. In other words, you want each member of the sub-group to look as similar to each other as possible (all part of the same club) and for each sub-group to have distinct and meaningful differences from each other (all the clubs are different).
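
To show the "similar within, different between" objective in action, here is a minimal sketch of k-means, one common clustering algorithm among the many mentioned above. It is not necessarily the algorithm any given tool uses; the visitor features, the naive choice of starting centroids and the `kmeans` helper are assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means: returns (centroids, labels)."""
    centroids = [list(p) for p in points[:k]]  # naive start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical visitor features: (visits per month, pages per visit)
visitors = [(1, 2), (9, 11), (2, 1), (10, 12), (1, 1), (11, 10)]
centroids, labels = kmeans(visitors, k=2)
```

Here the light users and the heavy users end up in separate groups. On real data the result depends on the starting points, the number of groups chosen and how the features are scaled, which is part of why the output still needs interpreting.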

Cluster analysis is an iterative statistical process and therein lies the rub. The statistical process can create segments that are distinct but it doesn’t necessarily result in segments that are meaningful! So, the use of these types of techniques is as much an art as it is a science. Just because the analysis software reaches a result that is statistically correct, it’s not necessarily a useful result and these techniques also are dependent on the data that you start with. As the old saying goes “Garbage in, garbage out”.

Neural networks are a more “black box” kind of technique, based on the way that the brain works. They use artificial intelligence algorithms, such as Kohonen Networks, to find relationships or patterns in data. With classical statistical analysis techniques such as cluster analysis, the analyst has more control over the analysis process and can more easily interpret the findings and the outputs. Data mining techniques such as neural networks can be more powerful but also can be more difficult to handle (a bit like driving a Ferrari, or so I imagine).

In either case, getting some segments out is only half the battle. The other half is about understanding what they mean and what can be done with them. Typically the output of a cluster analysis will tell you that this person belongs to this segment. You then have to work out what it is that characterises the individual segments and what the differences are between the various segments. This is the profiling stage. The way the segments are constructed will be based upon the data that goes into the analysis. So if you use some behavioural data to create the segments, then the differences will be based on those behaviours and that’s the first place to look. However, you will also usually want to pull in other data to help explain what the segments mean. This can be demographic or attitudinal data for example.
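
A simple way to start the profiling stage is to compare averages across segments. The sketch below assumes each record already carries a segment label from the clustering run plus some joined-on descriptive data; the field names and the `profile_segments` helper are hypothetical.

```python
from collections import defaultdict


def profile_segments(records, segment_key, measures):
    """Average each measure within each segment for side-by-side comparison."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[segment_key]].append(record)
    return {
        segment: {
            "size": len(members),
            **{m: sum(r[m] for r in members) / len(members) for m in measures},
        }
        for segment, members in grouped.items()
    }


# Hypothetical output of a clustering run, joined to demographic data.
records = [
    {"segment": 0, "visits": 1, "age": 52},
    {"segment": 0, "visits": 2, "age": 48},
    {"segment": 1, "visits": 10, "age": 27},
    {"segment": 1, "visits": 12, "age": 31},
]
profiles = profile_segments(records, "segment", ["visits", "age"])
```

Reading the profiles side by side shows what separates the segments on the clustering variables (visits) and on the extra explanatory data (age).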

So, segmentation can mean different things to different people, from simple classification through to more complex pattern discovery approaches. In this article we’ve looked at some of the approaches and techniques that can be used. In the next article in this series I will take a look at the different types of data that you may want to segment on and how they may be useful to the internet marketer. Till then…

The Analyst’s Toolbox: Introduction

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

There is a tendency when we talk about analysing web data to focus on the use of so-called web analytics tools such as Google Analytics, Omniture, Coremetrics and the like. These analysis tools were developed specifically to meet the challenges of managing the reporting and analysis of data collected from web sites but they aren’t necessarily the only tools we might have in our toolbox.

There are a variety of other reporting and analysis tools that we might want to use on the data from our web sites to get a better understanding of online business performance and customer behaviour. It is fair to say that web analytics systems have significantly improved their analytic capabilities over the past few years and will no doubt continue to do so. These days a number of the systems offer a far greater ability to filter and segment data on the fly to look at the behaviour or characteristics of particular groups.

However, as the needs of the organisation continue to develop so too might the need for different or specialist reporting or analysis tools. Other systems for reporting and analysing web and customer data can be grouped into three broad categories:

  • Business Intelligence (BI) or OLAP tools
  • Visualisation tools
  • Statistical analysis and data mining tools

BI or OLAP tools are often found in the corporate reporting environment and this class of tools includes systems such as Business Objects, Microstrategy and Cognos. Databases such as Oracle and SQL Server also either come with BI functionality or it can be bolted on. Underpinning many of these tools is the concept of a data cube that allows the analyst to drill through the data in a hierarchical manner. In a commerce environment I might start by looking at, say, total sales for a year and then drill down into product categories, then into sub-categories and then down to the product level.

Some web analytics systems do have the ability to drill through data in this way but a feature of the family of BI tools is the ability to handle multiple hierarchies across multiple dimensions. So, in addition to being able to drill through on the product dimension, you can also drill through the data say in terms of geography and also time. BI tools could also be used to report on web data in the context of other channels, for example comparing the profile of leads or enquiries generated online against those generated in the call centre.
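The drill-through idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how any of the BI products named above are implemented; the sales records, the field names and the `roll_up` helper are all hypothetical. The same function totals sales at whatever level of the product hierarchy is asked for, and can combine the product dimension with geography.

```python
from collections import defaultdict

# Hypothetical sales records:
# (year, region, category, sub_category, product, sales)
sales = [
    (2005, "UK", "Clothing", "Shirts", "Oxford shirt", 120.0),
    (2005, "UK", "Clothing", "Shoes", "Loafer", 200.0),
    (2005, "France", "Clothing", "Shirts", "Oxford shirt", 80.0),
    (2005, "France", "Home", "Lighting", "Desk lamp", 50.0),
    (2005, "UK", "Home", "Lighting", "Desk lamp", 70.0),
]
FIELDS = ("year", "region", "category", "sub_category", "product")


def roll_up(records, dimensions, filters=None):
    """Sum sales grouped by `dimensions`, keeping only rows matching `filters`."""
    filters = filters or {}
    totals = defaultdict(float)
    for record in records:
        row = dict(zip(FIELDS, record))  # the sales figure stays in record[-1]
        if all(row[name] == value for name, value in filters.items()):
            key = tuple(row[d] for d in dimensions)
            totals[key] += record[-1]
    return dict(totals)


# Start at the top, then drill down the product hierarchy.
total_by_year = roll_up(sales, ["year"])
by_category = roll_up(sales, ["category"])
clothing_by_sub = roll_up(sales, ["sub_category"], {"category": "Clothing"})

# Drill through a second dimension (geography) at the same time.
clothing_by_region = roll_up(
    sales, ["region", "sub_category"], {"category": "Clothing"}
)
print(total_by_year, by_category, clothing_by_sub, clothing_by_region, sep="\n")
```

A real data cube would typically pre-aggregate these totals rather than rescan the records on every request, but the questions being asked of the data are the same.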

As the saying goes, a picture tells a thousand words and visualisation tools can be a valuable weapon in your analytical arsenal. Again some web analytical tools such as Visual Sciences and Site Intelligence have some powerful visualisation capabilities. Whilst many web analytics systems have improved the visual reporting of web data through developments of click overlays for example, for the analyst a visualisation tool might add another dimension.

Visualisation tools can range from add-ins or add-ons for Excel through to complex applications that are commonly integrated in with data mining tools. At the desktop level, Excel add-ons such as MM4XL extend the scope of the charting abilities of Excel and allow the analyst to present data in different ways. More sophisticated tools can produce three dimensional rotating images that allow the analyst to explore and look for patterns in the data. The human brain is still one of the most powerful tools available for spotting patterns and trends in data when presented in the right way!

The final set of tools that might be useful for analysing web and customer data are statistical analysis and data mining tools. What’s the difference between statistical analysis and data mining? The way that I tend to view it is that statistical analysis is predominantly about exploration and data mining is about discovery. With statistical analysis you are often looking to test an assumption or a hypothesis. For example, you may be looking to prove that one group of customers rates your product or service more highly than others. With data mining, you are looking for patterns or relationships in the data that you may not know about.
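As an illustration of the hypothesis-testing side, here is a minimal sketch of a permutation test in Python. It is one of several ways such a question could be tested, chosen here because it needs nothing beyond the standard library. The two sets of ratings are made up, and the function name and parameters are my own. The question is whether group A rates the product more highly than group B by more than chance would explain.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """One-sided permutation test of 'no difference in mean rating'.

    Returns the observed difference (mean of A minus mean of B) and the
    share of random relabelings whose difference is at least as large.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign ratings to the two groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations


# Hypothetical satisfaction ratings (1 to 10) from two customer groups.
ratings_a = [9, 8, 9, 7, 8, 9, 8, 9]
ratings_b = [6, 7, 5, 6, 7, 6, 5, 7]

observed, p_value = permutation_test(ratings_a, ratings_b)
print(f"difference in means: {observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value says that a difference this large would rarely arise if the group labels were irrelevant. Strictly speaking it supports the hypothesis rather than proves it.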

Statistical analysis and data mining cover a wide variety of approaches, methodologies and techniques that might be useful for the web analyst. They can broadly be classified as follows:

  • Statistical analysis
  • Classification techniques
  • Clustering and segmentation methodologies
  • Forecasting
  • Text analysis

Increasingly, many of these techniques are being used for making predictions, and so the phrase “predictive analytics” is often used as well to describe these various methodologies.

Some of this stuff may seem like a long way from the current day to day analysis of conversion funnels and the like. But as the market continues to mature and growth comes from optimisation and improvements in marketing efficiencies, some of these techniques will have a place on the analyst’s workbench. Over the next couple of weeks, I will take a look at some of these techniques in more detail and how they can be used in the context of analysing online visitor and customer behaviour.

Understanding key customer journeys

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

Over the past few weeks I have been taking a look at the variety of data sources available for evaluating e-business performance in addition to the data that comes from your site. These additional sources include audience panels, surveys and focus groups. I’ve also been making the point that purely focussing on web analytics data rarely gives you the full picture.

To talk about this on a practical level let’s take a look at how multiple data sources can be used together to look at a specific business issue such as optimising conversion rates on the site. The simple premise is that if you know who is coming to your site, why they are there and what they are trying to do then you can develop the site to optimise these key customer journeys.

To help digital property owners understand how visitors are interacting with their site we use something called a Customer Journey Framework. This framework is an approach to understanding which visitors are trying to use the site, how they are using it and whether they are being successful in their goals or not. There isn’t a single source of data that will give you the answer to these questions. You need to draw the answers from a suite of different places.

The Customer Journey Framework comprises three key components:

  • Understanding the different types of visitors (Audience segments)
  • Understanding why people visit the site (Intentions)
  • Understanding usage of the site and the consumption of different content (Content)

Different people come to your site for different reasons and there are bound to be different segments of visitors. The challenge is to work out what the most meaningful segments are for your business that you can use for your marketing and site development activities. This is something we’ll take a look at in the future.

Working out who is coming to your site is where you might use audience panel data, surveys or internal data from customer or registration databases. The reality is that you might need to use all three to build up a true profile of the different types of users that you might have. Audience panels can give you a demographic profile (if your site is large enough) but they may not help you to segment your audience in a meaningful way.

Surveys can help you understand if different types of visitors are coming to your site for different reasons. We call these “intention modes”. What is the visitor’s intention when they arrive on the site? What is their goal? To use an e-commerce example, a visitor may come to a site in one of these modes:

  • To browse for something and buy it if they find something they like
  • To do research for price comparison purposes
  • To buy a specific product that they have already researched
  • To browse around with no intention of buying anything

Visitors in each of these modes will have different goals and will also exhibit different behaviours on the site.

By linking intentions to visitor segments you may find that some modes are more pronounced in certain groups of visitors. For example, in some work for a high street retailer in the UK we found distinct differences in these modes when we looked along age and gender lines. In this particular case, older females were tending to arrive at the site with higher levels of purchase intent than younger females. The younger females were looking to be inspired by the site to make a purchase whereas the older females were more likely to already have in their mind what they wanted to buy.
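Linking intentions to segments usually starts with a simple cross-tabulation. The sketch below, in Python, uses invented survey responses (they are not the retailer’s data) and a hypothetical `intention_shares` helper to show the share of each intention mode within each visitor segment.

```python
from collections import Counter, defaultdict

# Hypothetical survey responses: (visitor segment, stated intention mode).
responses = [
    ("younger female", "browse, may buy"),
    ("younger female", "browse, may buy"),
    ("younger female", "browse only"),
    ("younger female", "buy specific product"),
    ("older female", "buy specific product"),
    ("older female", "buy specific product"),
    ("older female", "price comparison"),
    ("older female", "browse, may buy"),
]


def intention_shares(rows):
    """Return {segment: {mode: share of that segment's responses}}."""
    counts = defaultdict(Counter)
    for segment, mode in rows:
        counts[segment][mode] += 1
    return {
        segment: {mode: n / sum(modes.values()) for mode, n in modes.items()}
        for segment, modes in counts.items()
    }


shares = intention_shares(responses)
for segment, modes in shares.items():
    for mode, share in sorted(modes.items(), key=lambda item: -item[1]):
        print(f"{segment:15s} {mode:22s} {share:.0%}")
```

With real survey volumes you would also want to check that the differences between segments are larger than sampling noise before acting on them.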

The final link is then to layer these visitor segments and their modes onto the actual content of the site. This is where web analytics data is important, in linking the behaviours that you see on the site back to what you know about visitors and what they are trying to do. So, in our example above, are the younger females looking at different types of products than the older females, and so do those products need to be merchandised differently on the site to maximise conversion?

Linking behavioural data and profiling data can be tricky. It’s certainly easier if you can identify at least some of the site’s visitors through say a registration process or a transaction. You can match the profiling data captured in the process with the actual behaviour on the site and use that information to generalise for all traffic. It is also possible to link survey response and site behaviour data as well, though certainly here in Europe you need to be mindful of privacy concerns about identifying individuals.
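In its simplest form the matching is a join on a shared visitor identifier. The Python sketch below assumes a pseudonymous ID is available for registered visitors; the IDs, field names and the `link_sessions` helper are hypothetical. It also reports the match rate, since that tells you how far the linked sample can reasonably be generalised to all traffic.

```python
# Hypothetical registration profiles keyed by a pseudonymous visitor ID.
profiles = {
    "v1": {"age_band": "18-34", "gender": "F"},
    "v2": {"age_band": "35-54", "gender": "F"},
}

# Hypothetical behavioural records from the web analytics system.
sessions = [
    {"visitor_id": "v1", "pages_viewed": 14, "purchased": False},
    {"visitor_id": "v2", "pages_viewed": 5, "purchased": True},
    {"visitor_id": None, "pages_viewed": 3, "purchased": False},  # anonymous
    {"visitor_id": "v9", "pages_viewed": 7, "purchased": False},  # no profile
]


def link_sessions(session_rows, profile_lookup):
    """Attach profile fields to each session where the visitor is known."""
    linked, unlinked = [], []
    for session in session_rows:
        profile = profile_lookup.get(session["visitor_id"])
        if profile is None:
            unlinked.append(session)
        else:
            linked.append({**session, **profile})
    return linked, unlinked


linked, unlinked = link_sessions(sessions, profiles)
match_rate = len(linked) / len(sessions)
print(f"{len(linked)} of {len(sessions)} sessions linked ({match_rate:.0%})")
```

Keeping the join on a pseudonymous key rather than a name or email address is one way of staying mindful of the privacy concerns mentioned above.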

The framework we’ve looked at here is one example of bringing together data from different sources to get a holistic view of what is happening on the site. It’s also just that – a framework, which can be adapted to suit the circumstances of your own site and information sources.

A segmentation primer

This article, written by Neil Mason, was originally published on Clickz.com and is republished here with permission.

One of the things that you hear being talked about a lot more these days in the wacky world of web analytics is “segmentation”. But I sometimes wonder what people mean when they talk about segmentation. I think it’s one of those words that is used more often than it is necessarily understood. Understood in the marketing sense of the word anyway.

I’ll take one example. One of the largest and most successful web analytics systems vendors has a section in their report menu called “Segmentation”. What we actually find there are reports on the most popular pages and sections of the site. I’m not too sure what that has to do with segmentation. Other vendors talk about segmentation as well but mean different things. Sometimes they talk about the ability to filter along different dimensions or the ability to analyse the data by combining different variables. So, segmentation could mean reporting particular data, filtering data or analysing data. All of these things are good things, and potentially even useful things, but are they segmentation?

I dug out some of my marketing text books to see if there was a consensus view in them about what segmentation actually is. I found that what they tend to say is that segmentation is a means of identifying different groups of people in order to develop different strategies for each group. So, segmentation is a purpose rather than an outcome and I think that’s the difference between classification (which is what a lot of analysis tools do) and segmentation (which is what marketers or marketing analysts do).

The point of segmentation is that you do something as a result of having it. For example:

  • You target different groups of people with different messages in your acquisition campaigns
  • You present a different site experience dependent on your understanding of who that person is
  • You interact with different people differently dependent on where they are in a customer lifecycle

In one of the books that I looked at that was actually written 20 years ago, the authors described three conditions of a good segmentation*. They are:

Homogeneity – the degree to which people in the segment are similar in ways that are interesting to you

Parsimony – the degree to which the segmentation would make every person a unique target

Accessibility – the degree to which you can describe the segments in ways that help you deploy differentiated marketing strategies

That all sounds pretty theoretical (well, it was a text book), so what does this mean in practice?

My interpretation of this is that a good segmentation has to be robust, useful and actionable. There are many ways that you might segment, say, a site’s visitors or your customer base, from simple classification approaches through to complex statistical techniques, but they have to pass the sense check of being robust, useful and actionable.

You might simply classify according to some demographic or geographic variables. For example, classifying the customer base as male vs female is a form of segmentation, but it is only robust and useful if men and women exhibit differences that are potentially useful to you, and only actionable if you can realistically target them in different ways.

Alternatively, you might develop a segmentation based on some attitudinal variables. Many years ago I was involved in a project where we segmented the visitors across a number of different sites we had in Europe according to their attitudes to online shopping and their motivations for visiting the site. Whilst the results were certainly interesting and highlighted some notable differences in the visitor profile of the different sites, we had to question how useful it was to us. How were we going to action the insight? We couldn’t identify and classify people arriving on the site by their attitudes, nor could we easily use it in our retention marketing activities as we didn’t have people’s attitudes stored on our customer database.

So, I think that there is always a balancing act in satisfying those three conditions of homogeneity, parsimony and accessibility in a good segmentation. In our own work, we tend to use behavioural segmentation approaches as it makes it easier to act on the outcomes. This may often involve using statistical methods such as cluster analysis to segment customers into groups that are distinct from each other in a meaningful way like their browsing behaviour or their purchasing behaviour.
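For readers who want to see what a behavioural cluster analysis looks like in practice, here is a minimal sketch of k-means, one common clustering method, written in plain Python. It is not a description of our own methodology. The customer data (visits per month, average spend) is invented, and the sketch assumes the variables are already on comparable scales; in practice you would usually standardise them first so that one variable does not dominate the distance calculation.

```python
import random


def k_means(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical behavioural data: (visits per month, average spend).
customers = [
    (1, 20.0), (2, 25.0), (1, 15.0),      # occasional, low spend
    (12, 90.0), (10, 110.0), (11, 95.0),  # frequent, high spend
]

labels, centroids = k_means(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

The output is exactly the “this person belongs to this segment” assignment; deciding whether the segments are robust, useful and actionable is still a job for the analyst.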

However, we are also mindful of the ability of the client to act on the results. There is no point in developing a sophisticated methodology that identifies some really meaningful segments if there are neither the skills nor the tools available to realise the opportunity. For example, if your email tool is not easily integrated into your customer database then it’s going to be difficult to execute improved target marketing initiatives. It is best to start with something simple and develop the capabilities to act in line with the development of the insight itself.

As it’s getting to that time of the year, in my next article I will be taking a personal look back at 2005, reflecting on some of the key events from my perspective and trying to get a sense of where we may be heading in 2006.

* “Marketing Decision Making – A model-building approach” by Gary Lilien and Philip Kotler
