Tuesday, March 3, 2009

How many models is enough?

I recently missed a presentation by a data mining software vendor (due to my recent paternity break), but I've been reviewing my colleague's notes and the vendor's presentation slides. I won't name the vendor; you can probably work it out.

A significant part of the vendor's solution is the ability to manage many data mining models (predictive, clustering, etc.); we're talking hundreds.

In my group we do not have many data mining models, maybe a dozen, that we run on a weekly or monthly basis. Each model is quite comprehensive and will score the entire customer base (or near to it) for a specific outcome (churn, up-sell, cross-sell, acquisition, inactivity, credit risk, etc). We can subsequently select sub-populations from the customer base for targeted communications based upon the score or outcome of any single model or combination of models, or any criteria taken from customer information.

I'm not entirely sure why you would want hundreds of models in a Telco (or similar) space. Any selection criteria applied to specific customers (say, by age, or gender, or state, or spend) before modeling will of course force a biased sample that feeds into the model and affects its inherent nature. Once this type of selective sampling is performed you can't easily track the corresponding model over time *if* the sampled sub-population ever changes (which is likely, because people do get older, move house, or change spend, etc). For this reason I can't understand why someone would want or have many models. It makes perfect sense in Retail (for example a model for each product, or association rules for product recommendations), but not many models that apply to sub-populations of your customer base.

Am I missing something here? If you are working with a few products or services and a large customer base why would you prefer many models over a few?

Comments please :)

4 comments:

Matthew said...

Tim,

For the kind of work your group does you won't have needed this yet. You have a couple of models that process huge numbers of customers looking for the ones who are very very likely to do something. You look for lift at the 1% level and forget the rest.

You know that thing a decision tree does when you've fed it a customer base that's not actually homogeneous? e.g. you've fed it both fixed and mobile customers and their behaviour is actually totally different. So the top of the tree is a split between fixed and mobile. And then you've got two trees masquerading as one tree, evaluated as if they're the same tree, pruned as if they're the same tree. Suboptimal.

So that's a case for splitting. And you obviously already split fixed and mobile -- but that other group downstairs doesn't. The crux of it is, we're not necessarily talking about splitting customers by age or gender, there are plenty of other splits that can identify subpopulations with distinctly different behaviour. And while you persist with those neural networks you may never know ;-)

How often do you assess the accuracy you get from the score everyone, take subsets approach? Like, if you selected only iPhone customers out of the churn run, how concordant with reality are the churn scores in that group?

I can think of other things that add to the model count. Segment attribution models, especially if you're doing product microsegments as well. Forecasting can inflate to hundreds of models easily, because it can make sense to forecast per product or per customer. Trigger/event-based modelling, if you decide to have separate models predicated on having already exhibited the trigger behaviour. One of the banks was running around with 50 trigger models at one stage.

-Matthew

Tim Manns said...

Hi Matthew,

Good points! You're right; we do split by fixed and mobile, also within mobile by prepaid and postpaid billing. Nice to take a step back and see an outside perspective :) Our sub-populations are defined by product, billing method or some business process.

We do sometimes use decision trees too :) and report on differences between groups (churn, usage, spend, etc). As you correctly mention, our target outcome is usually something low, like 1% out of millions of customers. If we split into sub-populations we will get sample size problems (not enough churners or responders to build an accurate model). We can report on these differences, but not build acceptable models.
-> Sample sizes! I think here lies one big difference, and the reason why some analysts can comfortably create sub-populations for modelling.
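To see the sample-size problem concretely, here's a back-of-envelope sketch. The base size and even split are hypothetical assumptions for illustration; only the ~1% target incidence comes from the discussion above.

```python
# Back-of-envelope sketch (hypothetical numbers): splitting a customer base
# into sub-populations starves each model of positive examples.

base_size = 5_000_000   # assumed total customer base (illustrative)
churn_rate = 0.01       # ~1% target incidence, as discussed in the post

for n_segments in (1, 10, 100, 1000):
    # assume an even split across segments for simplicity
    churners_per_segment = base_size * churn_rate / n_segments
    print(f"{n_segments:>5} segments -> ~{churners_per_segment:,.0f} churners each")
# at 1,000 segments you'd average only ~50 churners per model,
# far too few to train a reliable classifier
```

The absolute number of positive cases per model shrinks linearly with the number of segments, which is exactly why whole-base models plus post-hoc selection can be preferable.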

Regarding your question: for some models we can usually get a churn lift of 10-plus (i.e. in the top 1-5% we'd get ten times more churners than the random base). Our models work well against the whole base, although we usually only focus on the top 1%-5% of our target (churn, up-sell, fraud, risk, etc).
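For readers unfamiliar with the "lift of 10-plus" figure, here's a minimal sketch of how top-k% lift is computed from model scores. The simulated scores are purely illustrative; real scores would come from the churn model.

```python
# Sketch: computing lift at the top k% from (score, outcome) pairs.
# Toy simulated data only; separation of the score distributions is assumed.
import random

random.seed(42)
n = 100_000
base_rate = 0.01  # ~1% churners, as in the discussion

# simulate (score, is_churner): churners get higher scores on average
population = []
for _ in range(n):
    churner = random.random() < base_rate
    score = random.gauss(1.0 if churner else 0.0, 1.0)
    population.append((score, churner))

def lift_at(pop, pct):
    """Churn rate in the top pct of scores, divided by the overall rate."""
    ranked = sorted(pop, key=lambda x: x[0], reverse=True)
    top = ranked[: int(len(ranked) * pct)]
    top_rate = sum(c for _, c in top) / len(top)
    overall = sum(c for _, c in pop) / len(pop)
    return top_rate / overall

print(f"lift in top 1%: {lift_at(population, 0.01):.1f}x")
print(f"lift in top 5%: {lift_at(population, 0.05):.1f}x")
```

A lift of 10 in the top 1% means that slice contains ten times the churner concentration of the base, which is why campaigns target only that top slice.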

Shane Butler said...

I agree that model management on this scale is probably not required by many companies. However, I can see this developing into a big thing in the future, and I expect companies will pay extra attention to the model lifecycle and better monitor for things like changing customer behaviours and model decay over time, etc.
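One common way to monitor the model decay mentioned above is the Population Stability Index (PSI), which compares the score distribution at build time against a recent scoring run. This is a sketch only, and a technique I'm naming myself; the bin choices and thresholds are conventions, not anything from the post.

```python
# Population Stability Index (PSI) sketch for monitoring score drift.
# Bin proportions and the 0.25 rule of thumb are common conventions (assumed).
import math

def psi(expected, actual):
    """PSI between two same-length lists of bin proportions."""
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, 1e-6)  # guard against empty bins
        a = max(a, 1e-6)
        total += (a - e) * math.log(a / e)
    return total

# proportion of scored customers in each score decile
build_time = [0.10] * 10                    # uniform by construction at build
this_month = [0.13, 0.12, 0.11, 0.10, 0.10,
              0.10, 0.09, 0.09, 0.08, 0.08]  # hypothetical recent run

drift = psi(build_time, this_month)
print(f"PSI = {drift:.3f}")  # rule of thumb: > 0.25 suggests rebuilding
```

Tracking a number like this per model is one practical way to manage hundreds of models without inspecting each by hand.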

PS. Regarding business applications of producing large volumes of models, see this Visa paper: Detecting Changes in Large Data Sets of Payment Card Data: A Case Study.

Allan Engelhardt said...

Matthew is of course spot on in his comment. One European opco we worked with had thousands of models that were all updated weekly using primarily Teradata and KXEN with a custom-built scheduling platform. I wouldn't necessarily recommend that particular architecture, but it did work for them.

You have at least three classes of customers you want to model separately: prepay, post-pay, and small business. They have maybe 30 different price plans that are important and sufficiently different that they warrant different models. Handsets are classified by capabilities: colour screen, internet browser, games platform, email capability, and so on, leading to maybe another 20 groups.

So without even thinking about the problem too much we have 1,800 different customer groups we reasonably may want separate models for. And why not: a model is fast to create and easy to update.
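The arithmetic behind the 1,800 figure is just a product of the group counts above, and it explodes further once targets are crossed in. A quick sketch (the 100-target figure is taken from the products mentioned below):

```python
# Segment counts multiply (figures from the comment above)
customer_classes = 3    # prepay, post-pay, small business
price_plans = 30
handset_groups = 20

groups = customer_classes * price_plans * handset_groups
print(groups)           # 1800 customer groups before choosing a target

# crossed with, say, 100 target products/outcomes:
print(groups * 100)     # 180000 potential models
```

This is why "how many models" depends entirely on whether you count per segment, per target, or per segment-target combination.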

Then of course you have the targets. There are 29 price plans you might want to move the customer to (but probably only a few adjacent ones are really relevant). There may be 100 products, from the obvious to things like getting the customer to personalize his/her voice mail greeting (lots of revenue on that one!)

Once you’ve done all the combinations, it adds up. Why would you? Why wouldn't you? Why would there ever be any measurable outcome that is important to your business which you would not want to model?

Of course it depends on how you count the models, and it is in the vendor's interest to throw around big numbers.

I have been speaking a lot to Financial Services recently about how to integrate Marketing and Risk models and manage them as a whole. This is a big topic for them, and a big headache. How do you ensure that the models work together, and not against each other? In real time, with the customer on the phone or on the web site. Model and rules management, what I called Business Intelligence for Rules and Models in a presentation at a recent BI conference, is a big topic.

Love your blog!

Allan.

References:

[1] Some general marketing speak on the telco and KXEN at Vodafone Germany Uses KXEN

[2] My rules management presentation is available from Oracle BI Symposium under the PCA Group logo.