May 20, 2008
How To

Special Report: How to Use Predictive Modeling to Pick Your Best Prospects & Boost ROI Up to 172%

SUMMARY: What if you could better predict which of your past customers are your best prospects to purchase again? You can with predictive modeling. See how you can use predictive modeling -- the Holy Grail of direct marketing -- to wrestle with and segment mounds of customer data.

Our latest Special Report includes:
- Modeling basics
- A mini-Case Study with a 172% ROI
- Vendors guide and useful links

Analytics are great at reporting what already happened. What if they could help predict the future -- like which customers are most likely to buy from you again?

Say hello to predictive modeling -- analytics that can help you predict the future as forecast by mountains of your customer data. Modeling has been around for decades -- with a reputation for being powerful but complicated to apply. But software innovations are making it easier for marketers to put predictive analytics into play.

Modeling isn’t 100% exact, but it has helped marketers multiply their ROI. The process involves applying data-mining technology to your customer data to create a specialized model that gives every customer a probability score. The score estimates how likely a customer is to take a certain action.

Some examples:
o Predict which customers are most likely to respond to an offer
o Predict which customers are most likely to defect
o Predict if a customer is a big spender
o Predict if a customer will make repeat purchases

In this Special Report on predictive analytics, we explain what modeling is, how it can work for you, how it helped one marketer achieve a 172% higher ROI and where you can explore further to see if the tactic is right for you.

How to Create a Predictive Model
A predictive model determines the probability of a certain outcome based on a target -- what you want to predict. You use data-mining software to sift through your customer database.

Every category of customer information -- age or favorite color or buying frequency or how many times a customer visited your store in the past year -- is a variable collected as a predictor of future behavior. A predictor is your model’s central building block.

For example, you want to predict which customers will visit your store at least five times in the next 12 months. Here’s a simplified version of what you need to do:

-> Step #1. Prepare your data

Preparing data is the most difficult and complicated step in the process. We’ll talk about why and what you can do about it later.

“It’s estimated that 70% to 80% of the time devoted to an analytical project is devoted to data preparation. It’s just getting the data in the one place in the right form to actually start building models,” says Richard Hren, Director Product Marketing, SPSS.

-> Step #2. Set your target

Your target is the customers who will visit your store five times in the next year. For this example, the target is the same as one of the variables -- customers who visited the store five times in the past year.

-> Step #3. Determine the most important variables

Determine which variables are most relevant to your target. Some types of data mining software will dig through data and tell you. Other packages depend on your judgment to determine which variables matter most. Some software will do both: tell you what it likes and allow a statistician to tweak it.

-> Step #4. Run program to get a model

The software weighs the importance of each variable and creates a model -- think of it as an equation. You fill in each variable in the equation and then the model calculates and gives higher scores to customers with the greatest probability of visiting your store more times in the next year.

Usually, you don’t have to score one customer at a time. You can apply the model to an entire database automatically, scoring every customer so the higher-probability ones rise to the top.
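
To make the "equation" idea concrete, here is a minimal sketch in Python. It is not how any particular vendor's package works. It assumes a simple logistic regression fitted by gradient descent, and the customer records, the two variables (visits and purchases last year) and the function names are all invented for illustration.

```python
import math

# Hypothetical history: (visits last year, purchases last year) paired with
# whether the customer went on to visit 5+ times the following year (1) or not (0).
history = [
    ((1, 0), 0), ((2, 1), 0), ((3, 1), 0), ((4, 2), 0),
    ((5, 2), 1), ((6, 3), 1), ((8, 4), 1), ((9, 5), 1),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, features):
    """The 'equation': fill in a customer's variables, get a probability score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def fit(rows, steps=5000, rate=0.05):
    """Weigh each variable by plain gradient descent on the logistic loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            error = score(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(rows)
    return weights, bias

weights, bias = fit(history)

# Score a small "database" in one pass and rank it, highest probability first.
database = {"Ana": (9, 5), "Ben": (4, 2), "Cal": (1, 0)}
ranked = sorted(database, key=lambda name: score(weights, bias, database[name]), reverse=True)
print(ranked)
```

Commercial packages do far more (variable selection, validation, reporting), but the output has the same shape: one score per customer and a list you can sort.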

How Can You Use Modeling?
Modeling can apply to marketing in lots of ways. Eric Siegel, President, Prediction Impact Inc., works with marketers through his firm’s predictive analytics consulting and training programs. Here are some examples he has seen:

- Targeting retention efforts
“Retention, to be effective, is generally going to incur costs. It can be pretty expensive, such as, ‘We’re going to give you a month for free,’ or, ‘We’re going to give you 20% off your next three orders.’ Whatever it is, it costs money, so you can’t offer it to everybody,” says Siegel.

By creating a model that will determine the probability that a customer will renew a subscription, you “don’t waste the offer on somebody who’s going to stay anyway. You only use it on people who are at greater risk of defecting. When you do that, you can use your marketing retention campaign budget much more effectively. The bottom line works out much better. In general, you take the scores and you use them to order your list. So you put the scores that are highest at the top, the people most likely to defect are the ones that you’re going to tackle first.”
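
As a rough illustration of ordering a list by score, the Python sketch below ranks subscribers by a defection score and stops when the retention budget runs out. The subscriber records, the `defect_score` field, the offer cost and the budget are all hypothetical. The scores would come from whatever model you built.

```python
def pick_retention_targets(customers, offer_cost, budget):
    """Return the customers to receive the offer, highest defection risk first."""
    ranked = sorted(customers, key=lambda c: c["defect_score"], reverse=True)
    affordable = int(budget // offer_cost)  # how many offers the budget covers
    return ranked[:affordable]

# Hypothetical subscribers with model-assigned defection scores.
subscribers = [
    {"name": "Dee", "defect_score": 0.12},
    {"name": "Eli", "defect_score": 0.81},
    {"name": "Fay", "defect_score": 0.47},
    {"name": "Gus", "defect_score": 0.05},
]

targets = pick_retention_targets(subscribers, offer_cost=20.0, budget=45.0)
print([c["name"] for c in targets])  # ['Eli', 'Fay']
```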

- Selecting content
When customers are on your website, you can use their browsing history and their behavior to determine the content they see. Siegel has seen systems designed to deliver the one of 120 different promotions that a customer would be most likely to respond to. You can do very similar modeling with email.

- Using A/B tests
The common reaction to an A/B test where A outperforms B is to throw away B and always use A. But there are potentially segments that would only convert with B, and you don’t want to lose them, right?

Well, you can create a model to predict which customers are most likely to respond to A and which are most likely to respond to B. Knowing this would prevent losing conversions by only using one or the other. “Instead of A/B testing, it’s A/B selection. You’re dynamically selecting A or B individually for each customer according to that customer’s chance of responding,” Siegel says.
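
A minimal Python sketch of A/B selection follows. It assumes you already have one predicted response probability per customer for each variation. The visitor names and scores are invented, and `select_offer` is a hypothetical helper, not part of any vendor's product.

```python
def select_offer(response_scores):
    """Pick the variation with the highest predicted response probability."""
    return max(response_scores, key=response_scores.get)

# Hypothetical model output: predicted chance of responding to each variation.
visitors = {
    "visitor_1": {"A": 0.09, "B": 0.03},
    "visitor_2": {"A": 0.02, "B": 0.07},
}

choices = {visitor: select_offer(scores) for visitor, scores in visitors.items()}
print(choices)  # {'visitor_1': 'A', 'visitor_2': 'B'}
```

The same lookup works for more than two options, such as choosing one promotion out of many, because it simply takes the highest-scoring key.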

- Applying survey data
Khosrow Hassibi, Senior Technical Director and Data Mining Architect, KXEN, provided an example from a car manufacturer who surveyed about 35,000 customers to gauge their level of interest in buying a cell phone-accessory kit for their cars.

“About 1% of the 35,000 individuals responded positively. It was recorded in a database,” says Hassibi.

Information then was purchased from a third-party data provider to add to the positive responders’ information. That created about 250 different variables for every responder, including hobbies, magazine subscriptions, number of homes owned, etc.

The car maker ran the modeling software over that data and set the target as a willingness to buy the cell phone kit. A model that predicted which customers were the best prospects for the kit was created. The model also indicated which variables were most important for hitting the target.

Mini-Case Study: World Wildlife Fund
The World Wildlife Fund focuses on conservation in 19 areas of the world, which gives the nonprofit a massive file of members. They create 120 direct mail solicitation campaigns a year, which could be prohibitively expensive if they targeted their entire list for each mailing. A typical campaign will go to about 25% to 30% of their members to keep costs down.

“In the past, we had been using a more segment-based approach [to select whom to mail],” says John Schwass, Director, Strategic and Financial Analysis, WWF. “We would take the ROIs of various segments and then we would use that as a determination of whether to mail somebody or not.”

WWF did its first modeling-based direct mail campaign last fall. Because of the results, they now use predictive analytics to better target all their efforts, says Greg Smith, VP and CIO, WWF.

“The model-based approach really takes a look a lot more granularly at how people are behaving,” Schwass says. “If we sent them 100 pieces of mail, how many responses would we likely get? And then what amount of money [would we] likely get from them? Those are the kinds of giving variables.”

Here is the process they used for the first modeling campaign:

-> Step #1. Set a goal

Schwass and his team wanted to find the top 25% of their list most likely to respond to the direct mail solicitation.

-> Step #2. Organize customer data

Working with the IT department, the marketing team brought together databases, organized the data and formatted it. They also determined which variables they wanted the software to review. They did not depend on the software for that task.

“All the hard work is really getting the data together and getting the variables together and thinking through effectively creating variables in different ways and what positives and what negatives that will have on the ability of your model to predict accurately,” Schwass says.

The first information the team looked at was whether a member responded to previous solicitations and their giving patterns before 2006.

-> Step #3. Run the modeling software

After telling the software which were the top variables to consider, Schwass and his team had it scan all the customers and their variables in the database. The software then decided how important each variable was to the target and created a model.

-> Step #4. Use the model

“Once we have the model, we get new data, the most recent data we can, right before the mailing, and we score all the people we can. We just score everybody.” Schwass and his team arranged their mailing list by probability score.

-> Step #5. Send the mailing

Next, they mailed to the top 25% of the list -- about the same number of members they mailed to the previous year. There were six mailings in the series. “If we felt very strongly about the value that the model places on people, if for whatever reason we thought ‘wow, it’s not profitable to mail these people,’ then we might consider not mailing them. In this particular mailing, we didn’t have that issue.”
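
One common way to judge whether a name is "profitable to mail" is to multiply the chance of a response by the likely gift and subtract the cost of the piece. The Python sketch below assumes that approach. It is not WWF's documented method, and the member records, gift amounts and cost per piece are invented.

```python
def expected_net_value(response_probability, expected_gift, cost_per_piece):
    """Expected donation from one mail piece, minus what the piece costs."""
    return response_probability * expected_gift - cost_per_piece

# Hypothetical scored members: chance of responding and likely gift size.
members = [
    {"id": "m1", "p_response": 0.08, "expected_gift": 40.0},
    {"id": "m2", "p_response": 0.01, "expected_gift": 25.0},
    {"id": "m3", "p_response": 0.03, "expected_gift": 100.0},
]
COST_PER_PIECE = 0.50  # hypothetical printing and postage cost

profitable = [
    m["id"] for m in members
    if expected_net_value(m["p_response"], m["expected_gift"], COST_PER_PIECE) > 0
]
print(profitable)  # ['m1', 'm3']
```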

-> Step #6. Monitor results

After mailing to approximately the same number of members as the previous year, when they used RFM analysis (recency, frequency, monetary value), the model-based mailing showed:
o 172% higher ROI
o 25% more donations
o 28% higher average gift size
o 125% lower cost per piece mailed

“The result is that we don’t use RFM at all in this particular mailing program. So in all future mailings now we no longer use that approach. We just use the model-based approach,” says Schwass.

- Test the data
Schwass and his team took an additional step because they did not wholeheartedly trust the modeling process. They wanted to test it before spending thousands on postage and paper.

So, they ran a simple test: Once all the data was organized, but before the model was created, they removed a sample with members who had donated in the past and members who had not.

After they created the model and scored the list, they scored the sample. They checked if most of the members in the sample received scores that corresponded to their giving history. Seeing that everything appeared to be accurate, the team moved forward with the mailing.

“I think our comfort level comes at the base level from how a model performs on that hold-out set. It’s usually a pretty good indication if you’re over-predicting or under-predicting or just mis-predicting” the response, says Schwass.
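
A hold-out check like this can be sketched in a few lines of Python. The sketch assumes a random 20% hold-out and compares the average predicted probability with the actual response rate. The member records, the split fraction and the function names are hypothetical, not details from WWF's process.

```python
import random

def split_holdout(records, holdout_fraction=0.2, seed=7):
    """Shuffle a copy of the records and set a fraction aside before modeling."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]  # (modeling set, hold-out set)

def prediction_gap(scored_holdout):
    """Average predicted probability minus the actual response rate.

    Positive means the model is over-predicting, negative under-predicting.
    """
    n = len(scored_holdout)
    predicted = sum(score for score, _ in scored_holdout) / n
    actual = sum(donated for _, donated in scored_holdout) / n
    return predicted - actual

# Hypothetical member file: every fourth member has donated before.
members = [{"id": i, "donated": i % 4 == 0} for i in range(100)]
modeling_set, holdout_set = split_holdout(members)
print(len(modeling_set), len(holdout_set))  # 80 20

# Pretend these are the model's scores for four held-out members,
# each paired with what that member actually did (1 = donated).
scored = [(0.9, 1), (0.7, 1), (0.4, 0), (0.2, 0)]
print(round(prediction_gap(scored), 2))  # 0.05
```

A gap near zero on the hold-out set is only one sanity check, but it is cheap to run before any money goes to postage and paper.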

Analytics Expertise Required
Modeling software varies in ease of use, but every marketer and vendor we talked to said portions of the process require an expert.

Hiring a specialist to work on your team has a range of benefits. But it can be expensive. Specialized statisticians with PhDs and practical backgrounds can be paid $140,000 or more, depending on the region, says Matthew Schall, Senior Manager, Direct Marketing, Musician’s Friend, a musical equipment cataloger.

Here are some strategies where you might get tripped up without an expert on staff:

-> Strategy #1. Organizing the data

The most labor intensive part of the modeling process is gathering the right data and formatting it to be read by the software.

“Someone really has to have an eye for data and experience with data in order to see if the data makes sense. Most marketers don’t have enough experience with enough different kinds of data to really be able to catch it when there’s something screwy,” says Schall. “You have to really have a knowledge and understanding of data to be able to construct the data in such a way that it corresponds to the analysis, and that can be something that is not obvious or intuitive.”

-> Strategy #2. Opening the black box

Plenty of automated modeling software simplifies the process (as long as your data is already organized and formatted). But only an analyst can tell you why a model came out in a certain way.

“You can trust the software to do the analysis for you, and you can actually get a good boost. You get a lot of insights from those default settings. But if you want to get in and do something different, you just want to put a little emphasis on that variable … [or] just find more insights and delve deeper. That’s where companies need to balance the idea of hiring their own analyst or just running off the software,” says Mary Grace Crissey, Analytics Product Marketing Manager, SAS.

-> Strategy #3. Selecting variables and determining importance

Not all modeling software will automatically determine which variables are most relevant to a target. You may need a skilled statistician to identify or create the strongest variables.

“A statistical person is going to dive deep and analyze every single one of those fields to decide which ones should go in the model,” says Dwight Mouton, Marketing Optimization Product Manager, SAS.

-> Strategy #4. Tweaking models

Modeling software knows the data it scans and nothing else. It does not know your business. If you know that a variable has more relevance than a model is accounting for, you’ll need a statistician to tweak it.

“You may want to tweak it to say that I’m really going to put double or triple power, or weight, on this age variable because it’s something I know that wasn’t true in the past but seems to be going on now,” says Crissey.

-> Strategy #5. Getting beyond the basics

Moving beyond basic modeling can yield more insight, but it is extremely complicated.

“Let’s say, just as an example, there are people who buy frequently at low value. They buy pretty often, but they don’t spend very much each time they buy. Then there are people who buy pretty often, and they spend a lot more. Then there are people in the middle who buy pretty often, but spend a lower amount. And that’s not an unusual pattern. [Using models to discover those people] is not easy. It’s not something I’ve seen the typical MBA and typical marketer be able to do,” says Schall.

-> Strategy #6. Training others

The value of a good statistician can extend beyond organizing data, creating models and executing campaigns. Hiring an expert gives you someone to explain the process, train marketers, point out land mines and establish safeguards.

Also, “I like building models in house [because] that knowledge and learning stays in house, and that’s incredibly valuable. And that’s something you don’t get when you bring in an outside consultant to do everything,” says Schall.

Start With a Pilot Initiative
So, you think you’re ready to dive into predictive modeling. You see the benefits; you understand its potential value relative to costs. But you still want to go slow. Start with a simple model to find a simple answer that can be easily applied and monitored.

“Set your sights on a doable project that can be done in a short time frame. You want to demonstrate some quick wins,” says Hren.

- Set up a simple test
“A great place to start can be putting together your data of customers the way they looked a few months ago. And then for each of those customers, track what happened over the next three months,” Siegel says. “Then you can apply that model to today’s world and see if you can predict what’s going to happen to every customer over the next few months.”

That test can reveal several things:
o Accuracy of your predictions
o Important segments of your customers
o Key variables to determining customer behavior

And it will build your team’s comfort level with modeling.
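
Here is one way the data for such a pilot might be assembled, sketched in Python. It assumes a purchase log keyed by customer, a snapshot date "a few months ago" and a 90-day outcome window. The customer IDs, dates and field names are invented for illustration.

```python
from datetime import date, timedelta

def build_pilot_rows(purchases_by_customer, snapshot, window_days=90):
    """Describe each customer as of the snapshot, then record what happened next."""
    window_end = snapshot + timedelta(days=window_days)
    rows = []
    for customer, dates in purchases_by_customer.items():
        before = [d for d in dates if d <= snapshot]
        during = [d for d in dates if snapshot < d <= window_end]
        rows.append({
            "customer": customer,
            # Predictors: only what was known on the snapshot date.
            "purchases_before": len(before),
            "days_since_last": (snapshot - max(before)).days if before else None,
            # Target: what the customer did over the following window.
            "bought_in_window": int(bool(during)),
        })
    return rows

# Hypothetical purchase log.
purchases = {
    "c1": [date(2007, 11, 2), date(2008, 1, 15), date(2008, 3, 1)],
    "c2": [date(2007, 6, 20)],
}

rows = build_pilot_rows(purchases, snapshot=date(2008, 2, 1))
for row in rows:
    print(row)
```

Keeping the predictors strictly on the "before" side of the snapshot matters: if information from the outcome window leaks into them, the pilot will look more accurate than the model really is.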

- Try software for free first
“Pilot initiatives can be done relatively quickly and easily and can be done without any software costs since most of these vendors allow for evaluation licenses where you can just go with the free software,” says Siegel.

- Stay realistic
A model can predict that 60% of your list is more than 50% likely to respond to an offer. But don’t expect a 60% response rate.

“Don’t expect miracles. It’s not going to be 100% accurate, but it doesn’t have to be,” says Hren. “If you’re getting a .5% response rate, it’s unlikely that the model is going to get you a 42% response rate. That’s just not going to happen. The world does not work that way. You can make lots of money with very small increases in effectiveness [and response].”
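
A quick arithmetic check shows why the two numbers differ. In this Python sketch the ten scores are invented, and the calculation assumes the scores are reasonably well calibrated. The point is only that the share of a list scoring above 50% is not the response rate to expect, which is closer to the average score.

```python
# Hypothetical probability scores for a ten-person list (invented numbers).
scores = [0.9, 0.7, 0.6, 0.55, 0.55, 0.51, 0.3, 0.2, 0.1, 0.05]

# Share of the list the model calls "more than 50% likely to respond".
share_above_half = sum(s > 0.5 for s in scores) / len(scores)

# Expected response rate if you mailed everyone: the average of the scores
# (assuming the scores are well calibrated).
expected_response_rate = sum(scores) / len(scores)

print(f"{share_above_half:.0%} of the list scores above 50%")   # 60% of the list scores above 50%
print(f"expected response rate: {expected_response_rate:.1%}")  # expected response rate: 44.6%
```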

Getting Help
Plenty of information on modeling exists. We put together a list of resources where you can get help and learn more. Check our hotlinks section, too.

Note: Predictive analytics has a range of uses and supporting services. Not every resource focuses exclusively on businesses and marketing.

- Modeling software companies
There are many types of modeling and data mining software. Some packages are very sophisticated. They’ll help with everything from collecting the data to manipulating it and interpreting the models. Other packages are simpler. They can be easier to use but offer fewer types of analysis and less flexibility.

“The smaller ones tend to be more technically adept, but have less data preparation support infrastructure, and they also tend to be cheaper,” says Siegel.

Below is a look at two of the most popular predictive modeling vendors, a specialty vendor and a free system. But do not take this as an all-inclusive list. There are hundreds of types of modeling software. (Check the KDnuggets hotlink below for a directory.)

SAS
http://www.sas.com/
Description: SAS’s software packages range from hardcore statistical workbenches to tools that do most of the work behind the scenes.
Capabilities: SAS products have a wide range of features. They can help retrieve and prepare data, identify valuable variables, build multiple types of models, provide graphical reports and manage many other problems.
“SAS will let you build out everything and anything you could imagine,” says Musician’s Friend’s Schall.
Price: As with most of the software packages, SAS’s pricing varies depending on the configuration. For small or medium sized businesses, the SAS Enterprise Miner Desktop version starts at $38,000. SAS Enterprise Miner, a client/server version used by larger organizations, starts at $162,000.

SPSS
http://www.spss.com/
Description: SPSS’s software packages also offer a host of capabilities. Their products range from the most technical and flexible to the more automated and simple.
Capabilities: SPSS software can help retrieve and manipulate data, build various types of models, analyze models, view graphical reports and offer many other capabilities.
Price: SPSS Clementine 12.0 starts at as low as about $10,000 for a desktop installation with additional costs per number of licenses (or seats) and modules. Enterprise-wide solutions can reach $100,000-plus.

KXEN
http://www.kxen.com/
Description: KXEN’s software is more automated than SPSS’s and SAS’s. But it is not designed to be a major statistical workbench. Instead, it’s made to be an easy-to-use model-making machine. The tradeoff for speed and automation is less flexibility.
Capabilities: KXEN does not offer as much data-organizing architecture. But after the data is organized, it only takes a few clicks to set a target and generate a model. KXEN will automatically determine which variables are most important to the target and weigh them accordingly. KXEN also provides graphical analysis.
Price: KXEN offers three packages, all priced according to the number of users, servers and other factors. Prices range from the tens of thousands for “an individual user performing a single type of analysis to well over a million for large-scale use,” says Louis Olds, Director of Marketing and Communications, KXEN.

R
http://www.r-project.org/
Description: R is one of the most popular free and open-source statistical analytics packages. It can conduct a wide range of statistical analysis, but it does not have the user-friendly GUIs that many other software packages offer.
Capabilities: You can see a graph comparing the features of SPSS, SAS and R from the University of Tennessee Knoxville in the hotlinks below. R is capable of doing many of the same calculations as SPSS and SAS, but its software relies on open-source add-ons instead of one consistent tool.
Price: Free.

- Online portals
KDnuggets is a good resource for data-mining information. At their website, you can find a consulting directory, a seminar directory, job postings, information about software and more.

TDWI specializes in business intelligence and data warehousing education. Plenty of free information and research is available on its website. TDWI also offers education and certification services.

- Classes, seminars & conferences
Classes are a great way to decide if modeling is right for your business. We’ve listed a few seminars in our links section. Also, every vendor we talked to for this article offers training -- on how to use their software or on modeling in general.

“While it will be painful for somebody, I would recommend that every marketer who is going to be doing analytics go through every training class that’s available at the ART, the advanced research technique classes at the ART forum. And that’s just well worth the investment,” says Schall.

- Consultants & agencies
Using a consultant is a good way to test the water. You don’t have to fully commit by paying for software, training and an expert. You can see the results of one campaign and decide how you want to move forward.

Useful links related to this article

Past Sherpa articles –
How Gateway Created an In-House Model to Predict the PC Marketplace
http://www.marketingsherpa.com/article.php?ident=23201

The New Metrics Frontier: Predictive Modeling for Email Campaigns
http://www.marketingsherpa.com/article.php?ident=23479

University of Tennessee Knoxville: Comparison of SAS and SPSS Products with R:
http://oit.utk.edu/scc/RforSAS&SPSSproducts.pdf

The R Project for Statistical Computing:
http://www.r-project.org/

useR!: The R user conference:
http://www.statistik.uni-dortmund.de/useR-2008/

SAS:
http://www.sas.com/

SAS: Support and training:
http://support.sas.com/

SPSS:
http://www.spss.com/

SPSS Training:
http://www.spss.com/training/

KXEN:
http://kxen.com/

KXEN: Support, training and education:
http://www.kxen.com/index.php?option=com_content&task=view&id=68&Itemid=194

KDnuggets:
http://www.kdnuggets.com/

KDnuggets - courses on data mining, analytics and Web mining:
http://www.kdnuggets.com/courses/index.html

TDWI:
http://www.tdwi.org/

World Wildlife Fund:
http://www.worldwildlife.org/

Prediction Impact - predictive analytics training and services:
http://www.predictionimpact.com/

