I’ve practically been ‘blog-dead’ for a year, so I’ll try to make this a good one. The reason for this post is that I’ve had a number of discussions about a retail analytics project I worked on earlier this year. Some things seemed obvious to me, but in discussions with other analysts (and in my internet searches for examples of analytics in retail) what I implemented appears to be novel.
As an Analytics Consultant at SAS I’ve been working on projects for many different clients, rarely in the same industry, and often with a different problem each project. One of the projects I worked on this year was for a major retail department store. They sell clothes, homeware, toys etc.
As part of a short 25-day proof-of-concept project I was given three years of point-of-sale data, a little customer summary data (age, gender, address) and product descriptions, and was instructed to, and I quote, “do some customer analytics”.
All background information, project scope and expected deliverables were unknown. Any background about the extent of analytics currently being performed, or even high level strategic direction, was withheld.
Of course my first thought was “great, a greenfield! I can
do so many cool analytics deliverables and showcase the best stuff”. Then my second thought 5 seconds later was
“oh crap I have to deliver a ton of stuff showcasing everything in 25 days…”.
So my first challenge was to decide upon a list of deliverables. Some deliverables are obvious: market basket analysis, for example, is probably the first thing most analysts will think of for the retail industry. Basket analysis will use the point-of-sale data, which can also be used to understand time since last purchase, frequency of purchase, and customer spend (RFM: recency, frequency, monetary). Also, most organisations utilise some form of third party geo-demographic data, but few use the data to its fullest.
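For anyone who hasn’t built an RFM summary from point-of-sale data before, here is a minimal Python sketch. The row layout (customer, date, amount), the sample values and the function name `rfm` are assumptions for illustration only, and it assumes one row per store visit rather than one row per item sold.

```python
from datetime import date

# Hypothetical point-of-sale rows: (customer_id, purchase_date, amount).
# Assumes one row per visit; line-item data would need grouping first.
transactions = [
    ("c1", date(2012, 1, 15), 40.0),
    ("c1", date(2012, 3, 2), 25.0),
    ("c2", date(2011, 11, 20), 90.0),
]

def rfm(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in rows:
        # Track the latest visit, the visit count and the total spend.
        last, count, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }

rfm_scores = rfm(transactions, as_of=date(2012, 3, 31))
```

Recency here is simply the number of days between the last purchase and a chosen reference date, so a smaller number means a more recent customer.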
So I decided to present the deliverables as three main sections:
· Market and Customer Profiling
o Using national census data
o Store level summary analysis
o Sales data summarised to a customer level
o Customer Segmentation
· Predictive Analysis
o Time series forecast of spend for each customer
o Next best product
o Churn propensity
· Market Basket Analysis And Product Analysis
o Price Elasticity
o Product Cannibalisation & Halo
o Price Sensitivity
That is too much to cover in one post, but a few of the topics that were discussed in conversations with my peers are covered below:
1) Customer Churn
- Loss of expected revenue, not number of customers churned.
A common peculiarity in many organisations is that churn is defined simply as how many customers are lost. Sometimes analysis may include or segment churn by a spend category (high value vs low value etc), but rarely is the churn model actually predicting how much money is lost.
Furthermore, the volume of behavioural/transactional data within a retail organisation is lower than in other types of industries. For example a typical telecom in Australia would collect 200 million records a day, which may be more than a typical leading retail organisation collects in a year (in terms of behavioural/transactional records).
For an industry like telecommunications or banking there is an almost contractual monthly pattern of behaviour that can be fairly easily analysed. For example the number of calls made and received, or the monthly pay check going into your bank account and the purchase transactions that debit your account. These happen very frequently and are often forced into cycles (week, month). Conversely, in retail you may have weeks or months between voluntary visits, and spend in any one visit may differ depending upon seasonality or customer. A customer churn definition in retail must recognise the flexible RFM nature of the customer.
In the limited time I had available I didn’t look into approaches to analyse customers differently depending upon their RFM behaviour. In the data I was analysing the majority of customers visited at least quarterly, so to keep things simple I aggregated customer spend to a quarterly summary. I then forecast spend individually for each customer for the subsequent three quarters using a structural time series modelling approach (an unobserved components model) with a hierarchy up to customer segment (in order to also provide forecast spend by customer segments).
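The quarterly summary step is easy to sketch in Python. This is only an illustration under my own assumptions: the row layout, the function names `quarter_index` and `quarterly_spend`, and the choice to fill quarters without a visit with zero spend are all hypothetical.

```python
from collections import defaultdict
from datetime import date

def quarter_index(day):
    """Map a date to a running quarter number (year * 4 + quarter 0-3)."""
    return day.year * 4 + (day.month - 1) // 3

def quarterly_spend(rows):
    """Sum spend per customer per quarter, filling quiet quarters with 0."""
    totals = defaultdict(lambda: defaultdict(float))
    for customer, day, amount in rows:
        totals[customer][quarter_index(day)] += amount
    # Use one common quarter range so every customer's series lines up.
    first = min(quarter_index(day) for _, day, _ in rows)
    last = max(quarter_index(day) for _, day, _ in rows)
    return {
        customer: [by_q.get(q, 0.0) for q in range(first, last + 1)]
        for customer, by_q in totals.items()
    }

# Hypothetical rows: (customer_id, purchase_date, amount)
rows = [
    ("c1", date(2011, 2, 10), 30.0),
    ("c1", date(2011, 3, 5), 20.0),
    ("c1", date(2011, 8, 1), 45.0),
    ("c2", date(2011, 5, 17), 60.0),
]
series = quarterly_spend(rows)
```

Each customer ends up with an equally spaced series, which is what a time series forecasting method expects as input.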
The forecast of spend could then be used as a measure of the potential loss of revenue. If the forecast was lower than the customer’s previous average behaviour, then it suggests decreasing spend and potential ‘churn’. The difference between the customer’s forecast and rolling average could easily be used as a loss of expected revenue (aka a churn score). Marketing activities can then focus upon reducing forecasted revenue loss through changes in the RFM behaviour of each customer.
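To make the churn score concrete, here is a minimal Python sketch. It is not the model used in the project: simple exponential smoothing stands in for the structural time series forecast, and the function names, the four-quarter rolling window, the smoothing weight and the sample spend figures are all assumptions for illustration.

```python
from statistics import mean

def ses_forecast(history, alpha=0.5):
    """Simple exponential smoothing (a stand-in forecaster, NOT a
    structural time series model). Returns a flat per-quarter forecast."""
    level = history[0]
    for spend in history[1:]:
        level = alpha * spend + (1 - alpha) * level
    return level

def expected_revenue_loss(history, horizon=3, window=4, alpha=0.5):
    """Churn score: shortfall of forecast spend against the customer's
    rolling average, summed over the forecast horizon (never negative)."""
    baseline = mean(history[-window:])
    forecast = ses_forecast(history, alpha)
    return max(0.0, baseline - forecast) * horizon

# Hypothetical quarterly spend per customer, oldest quarter first.
quarterly_spend = {
    "cust_a": [120.0, 110.0, 90.0, 60.0, 40.0],  # steadily declining
    "cust_b": [80.0, 85.0, 90.0, 95.0, 100.0],   # steadily growing
}
churn_scores = {
    customer: expected_revenue_loss(history)
    for customer, history in quarterly_spend.items()
}
```

With these made-up numbers the declining customer gets a score of 43.125 (dollars of expected revenue at risk over three quarters) and the growing customer gets 0. Ranking customers by this score puts the largest expected revenue losses first, rather than treating every lost customer as equal.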
2) Market Basket Analysis
- Representation of product and price used in basket analysis.
In a typical grocery retail environment a product can be on the shelves and sold to customers for many years. The product doesn’t change and simple market basket analysis can be very valuable, but what if the product will only exist for 3 months and never repeats? How are associations of new products predicted? What if products are always new?
In some retail environments a product may only be on the shelves for a few months and may never repeat. For example, in a Kids Clothes dept there may be a Marvel Avengers t-shirt with a specific stock keeping unit (SKU) code. Within a few months the stock is sold and the product is replaced with a new t-shirt, perhaps Ironman :)
In these situations where a product code is frequently changing, market basket analysis applied directly to the products will have limited long-term applicability and success, because any basket associations that existed last month or last year may not match any existing products. Solutions to this might involve either matching all new products to equivalent old ones, or applying the basket analysis to some form of representation of a product.
Without going into too many details (for fear of losing some intellectual property and giving away too many tricks...) I used a simple approach to group products based upon their price. Within each product/SKU grouping (for example, Kids T-Shirts) there will be many different product codes that change over time. Some products will be priced at the bottom of the scale, others priced the highest for that product category (ignoring discounts). Any new product can be matched to the old simply by being in the same price range.
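A rough Python sketch of the general idea, not the actual grouping I used: here the price range of a category is cut into three equal-width bands, and any SKU (old or new) is represented as “category|band”. The band count, the equal-width rule, the function names and the prices are all assumptions made up for illustration.

```python
def band_edges(prices, n_bands=3):
    """Equal-width cut points between the lowest and highest full price."""
    low, high = min(prices), max(prices)
    width = (high - low) / n_bands
    return [low + width * i for i in range(1, n_bands)]

def price_band(price, edges, labels=("low", "mid", "high")):
    """Return the label of the first band whose upper edge exceeds price."""
    for edge, label in zip(edges, labels):
        if price < edge:
            return label
    return labels[-1]

def item_key(category, price, edges):
    """Stable basket item that survives SKU turnover."""
    return f"{category}|{price_band(price, edges)}"

# Hypothetical full (undiscounted) prices seen historically in a category.
tshirt_prices = [5.0, 8.0, 12.0, 15.0, 20.0]
edges = band_edges(tshirt_prices)  # cut points at 10.0 and 15.0

# A brand-new SKU maps onto an existing item purely by its price.
new_sku_item = item_key("Kids T-Shirts", 18.0, edges)
```

Quantile-based bands (for example terciles of historical prices) would be an equally reasonable choice; the point is only that the basket item no longer depends on a short-lived product code.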
Ok, so now we have basket analysis that can be applied not upon products, but upon ‘low value product category A’ and ‘high value product category B’. This may seem like a subtle difference, but it is something I haven’t heard of or seen retail vendors do yet, even the leaders (Tesco, Walmart etc). Market basket associations are often reported as “Customers that bought Milk also bought Bread”. Honestly, my first thoughts in response when I hear this are “so the f##k what...”. It is one of those common examples of analysis that doesn’t shout out business value. Simply having an association doesn’t help the retail organisation better manage the internal politics of cross department promotions, nor does it help them even understand if sales of a low priced or promoted item associate with sales of other low or high priced items, for example.
In short, my point is that I built associations that differentiated by price and product, rather than just associating products.
Doing this I enabled basket associations that, for example, highlighted the increase in $ sales that occurs in Kids Footwear (full price) whenever there is a promotion (cheaper price) in Kids Clothes.
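To illustrate the kids footwear example, here is a small Python sketch that computes the usual support, confidence and lift for a rule between two price-aware items. The baskets, the “category|price state” item naming and the function name `rule_stats` are invented for illustration and are not the project data or tooling.

```python
# Hypothetical baskets where each item is "category|price state".
baskets = [
    {"Kids Clothes|promo", "Kids Footwear|full"},
    {"Kids Clothes|promo", "Kids Footwear|full"},
    {"Kids Clothes|promo"},
    {"Kids Clothes|full"},
    {"Kids Footwear|full", "Toys|full"},
    {"Kids Clothes|full", "Toys|full"},
]

def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    n = len(baskets)
    has_a = sum(antecedent in b for b in baskets)
    has_c = sum(consequent in b for b in baskets)
    has_both = sum(antecedent in b and consequent in b for b in baskets)
    if has_a == 0 or has_c == 0:
        return 0.0, 0.0, 0.0
    support = has_both / n
    confidence = has_both / has_a
    lift = confidence / (has_c / n)
    return support, confidence, lift

promo_rule = rule_stats(baskets, "Kids Clothes|promo", "Kids Footwear|full")
full_rule = rule_stats(baskets, "Kids Clothes|full", "Kids Footwear|full")
```

In this toy data the promoted-clothes rule has a lift above 1 while the full-price-clothes rule has a lift of 0, which is exactly the kind of distinction a plain product-to-product association would hide.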
Take one specific product (Children’s Wear -> Girls 7-10 -> Pleated Dress): its price varies from $6.01 to $22.40, with an average (mean) price of $15.40.
Performing market basket analysis upon products including the price is a subtle difference, but it enables analysis of associations that might only occur when one or more products are reduced in price. This can then help a retail organisation better design and optimise promotions.