Evaluating Scorecard Models

You have put a lot of hard work into building a scorecard model.  Just imagine: you have looked through possibly hundreds of predictor variables and selected those most important to your model.  You have discretized them, examining the weight of evidence of each to verify that they have been properly prepared.  Your discretization scripts have then been used to prepare the predictors for a logistic regression model, which in turn is used to create your scorecard.  Now that you have your model, how do you determine whether it performs as you expect?
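Before turning to evaluation, here is a minimal sketch, in Python, of the weight-of-evidence calculation mentioned above for a single discretized predictor.  The DataFrame, column names, and values are hypothetical illustrations, not part of any particular scorecard.

```python
import numpy as np
import pandas as pd

# Hypothetical example: one discretized predictor ("age_band") and a
# binary outcome where 1 = "bad" and 0 = "good".
df = pd.DataFrame({
    "age_band": ["18-25", "18-25", "18-25", "26-40", "26-40", "26-40",
                 "41-60", "41-60", "41-60"],
    "bad":      [1, 1, 0, 1, 0, 0, 0, 0, 1],
})

# Weight of evidence per bin: ln(% of all goods in the bin / % of all bads in the bin).
counts = df.groupby("age_band")["bad"].agg(bads="sum", total="count")
counts["goods"] = counts["total"] - counts["bads"]

pct_goods = counts["goods"] / counts["goods"].sum()
pct_bads = counts["bads"] / counts["bads"].sum()
counts["woe"] = np.log(pct_goods / pct_bads)

print(counts[["goods", "bads", "woe"]])
```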

There is a host of statistics and graphs that you can use to help you determine whether your model is performing at the level you expect.  The Kolmogorov-Smirnov (KS) statistic measures the maximum difference between the cumulative score distributions of the “goods” and the “bads,” and varies from a low of 0 to a high of 1.0. The Gini coefficient summarizes how unevenly the goods and bads are spread along the range of scores, that is, the discriminatory power of the model, and also ranges from a low of 0 to a high of 1.0. Divergence measures the separation between the mean scores of the goods and the bads relative to their spread, and ranges from a low of 0 to high positive values. The Hosmer-Lemeshow statistic compares the observed and predicted numbers of goods and bads within groups of scores, and it is evaluated like an ordinary Chi-square value. The Receiver Operating Characteristic (ROC) curve is created by plotting the true-positive rate (sensitivity) against the false-positive rate (1 − specificity); the area under the ROC curve varies from a low of 0 to a high of 1.0, the entire area between the axes, with 0.5 indicating no discriminating power.  Finally, a lift chart helps you visualize the effectiveness of a predictive model by showing the ratio between the results obtained with the model and the results obtained without it.
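To make these definitions a little more concrete, here is a minimal sketch, in Python, of how the KS statistic, Gini coefficient, and divergence might be computed from model scores and known good/bad outcomes.  The simulated arrays stand in for your own validation data, and the formulas used are the commonly cited ones rather than any particular vendor's implementation.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical validation data: higher scores should indicate "bad" accounts.
rng = np.random.default_rng(0)
bad = rng.integers(0, 2, size=1000)              # 1 = bad, 0 = good
scores = rng.normal(loc=bad * 1.5, scale=1.0)    # scores loosely separated by outcome

# ROC curve and the area under it (AUC); 0.5 means no discrimination.
fpr, tpr, _ = roc_curve(bad, scores)
auc = roc_auc_score(bad, scores)

# Kolmogorov-Smirnov: maximum separation between the cumulative
# distributions of goods and bads, i.e. max(TPR - FPR) along the ROC curve.
ks = np.max(tpr - fpr)

# Gini coefficient, related to the AUC by Gini = 2 * AUC - 1.
gini = 2 * auc - 1

# Divergence: squared difference of mean scores over the pooled variance.
good_scores, bad_scores = scores[bad == 0], scores[bad == 1]
divergence = (bad_scores.mean() - good_scores.mean()) ** 2 / (
    0.5 * (bad_scores.var() + good_scores.var())
)

print(f"KS={ks:.3f}  AUC={auc:.3f}  Gini={gini:.3f}  Divergence={divergence:.3f}")
```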

Needless to say, going through each of these would require more than one blog post, and you really need to compare and contrast the results of many of these statistics and graphs to see how well your model is performing.  To whet your appetite for comparing your models, I'm going to stick to one of these options: the lift chart.

A lift chart is shown above.  The X-axis is graduated in deciles, bins each containing 10% of the total cases scored by the model. The Y-axis is graduated in the lift index, a factor expressing how much better the model performs in each decile. The model line is plotted as the ratio of the results obtained using the model to the results obtained using no model.
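As a rough illustration of how the decile lift values behind such a chart might be tabulated, here is a minimal sketch in Python.  The simulated `score` and `response` columns are stand-ins for a real scored customer file.

```python
import numpy as np
import pandas as pd

# Hypothetical scored customer file: a model score and the observed response.
rng = np.random.default_rng(1)
response = rng.integers(0, 2, size=10_000)
score = rng.normal(loc=response, scale=1.5)

df = pd.DataFrame({"score": score, "response": response})

# Rank customers from highest to lowest score and cut them into
# 10 equal-sized bins (deciles); decile 1 holds the top-scoring 10%.
df["decile"] = pd.qcut(df["score"].rank(method="first", ascending=False),
                       q=10, labels=list(range(1, 11)))

overall_rate = df["response"].mean()

# Lift index per decile: response rate in the decile divided by the
# overall response rate (the "no model" baseline of 1.0).
lift = df.groupby("decile", observed=True)["response"].mean() / overall_rate
print(lift.round(2))
```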

In the lift chart, you can see that the lift values in the first few deciles are well above the baseline value of 1.0, indicating that the model has relatively high predictive power.  What does this mean?  For now, let us focus on the first decile, the top-scoring 10%.

If we contacted 10% of our customers chosen at random, with no model at all to decide whom to contact, we could expect to reach about 10% of all the responders in our customer base.  However, if we used our model to select the top-scoring 10% of our customers, we could expect to reach between 22% and 24% of the responders.  That is a lift of between 2.2 and 2.4, meaning our model performs 2.2 to 2.4 times better than no model at all.

Does it make sense to use the model to select more customers to contact?  If you contacted a random 80% of your customer base with no model, you would expect to reach 80% of the responders.  With the model selecting those customers, you could expect to reach about 96% of the responders, but that is only 1.2 times better than no model at all.
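The arithmetic behind both of these comparisons is simply the share of responders captured divided by the share of customers contacted.  The short sketch below reproduces it using the illustrative figures from the two paragraphs above.

```python
# Cumulative lift = share of responders captured / share of customers contacted.
# The capture rates below are the illustrative figures used in the text.
for contacted, captured in [(0.10, 0.22), (0.10, 0.24), (0.80, 0.96)]:
    print(f"contact {contacted:.0%}: capture {captured:.0%} -> lift {captured / contacted:.2f}")
```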

Is that lift of 1.2 worth the extra cost of contacting 70% more of your customer base?  That is for you, the domain expert, to decide.  With the tools made available to you through lift charts and a whole host of other statistics and graphs for evaluating your scorecard model, you gain the insight to make more informed decisions for your company, maximizing your profit and reducing your risk.
