>Thank you all for responding to this question I posted yesterday. However,
>I would like to repost it so I can get your responses to the following
>I want to test the response performance of using the (logistic) model
>versus that of using no model at all
>(namely, a random selection of prospects to mail to). I am in banking,
>direct mail campaigns.
>I have done all the things people have mentioned pertaining to
>probabilities, deciles, etc. as I wrote in my email yesterday. But here is
>the thing I cannot understand because my brain isn't working any more!!!
>Description of the REAL situation:
>I (well, actually the vendor did) used a model to score 1 million
>I deciled the probabilities of response into DECILE1 (low probability of
>response) to DECILE10 (high probability of response).
>We are now ready to send out our new direct mail campaign.
>We will choose prospects from the high deciles DECILE7 through DECILE10.
>We also want to put aside a control group to test what we said above.
>I agree with all of you that the control group, say 10% of the 1 million,
>should be randomly selected from the 1 million prospects in order to see
>the random response rates and the model response rates.
>However, here is what really happened, because this was done by the vendor.
>The vendor has the 1 million prospects. They scored them using the model
>and deciled the probabilities of response as said above. They campaigned to
>prospects from DECILE7 through DECILE10. They also selected 10% as a
>CONTROL GROUP from DECILE7 through DECILE10. They did this because they had
>no communication with me. They just forgot me in this whole process.
>Anyway. The results of the campaign came back. Before I give the results, I
>note that in a previous campaign of this kind, without using the model, the
>response rate was something like 0.13%. The average probability of response
>using the model for DECILE7 through DECILE10 is 0.26%, pretty good, I'd
>say, if the model works in real life as intended.
>The campaign results came back:
>The response rate of the mailed group was 0.25%. (Right on the money, like
>the model said! Vendor's validation, not mine. Confidentiality issues,
>privacy issues, etc. Don't ask.)
>The response rate of the control group the vendor put aside (remember,
>they randomly selected 10% from the prospects in DECILE7 through DECILE10)
>came in at about 0.32%. (Pretty close to the 0.25%, I'd say, right? You
>agree with me?)
>Here is where my brain doesn't function any more.
>Why did the mailed group and control group have pretty much identical
>response rates? I understand that had the vendor chosen 10% from the entire
>1 million records, the control response would have been closer to about
>The manager(s) here tell me that the model failed to meet expectations
>since the control group did as well as (and slightly better than) the mailed
>group. In my gut I know the model delivered twice the expectation (0.26% vs
>0.13%), but management doesn't buy it.
>They tell me the control response is as good as model. In other words, they
>tell me, these people in the control would have responded anyway without
>the model because that's the objective of a control group, to see who will
>respond on their own. That's where management is stuck and I cannot seem to
>say the right English statements to convince them otherwise. They keep
>saying that the 10% control (remember, from DECILE7 through DECILE10, my
>high probability of response prospects) would have responded on their own,
>which is what they did, without the use of the model.
>How can I explain to management what the response rate in the control
>group, 0.32%, vs. the response rate in the mailed group, 0.26%, really
>means? They are stuck with this sentence: "Those people in the
>control group who had a response rate of 0.32% would have responded on
>their own and that's exactly what they did. Your model response rate of
>0.26% is not better than random of 0.32%. Therefore your model does not
>I AM GOING CRAZY!!!!!!!!! PLEASE HEEEEEEEEEELP.
There are a host of potential problems here. I see that I am fairly late
onto this dogpile, but let me make a few points, some of which have been
If the vendor is doing the selections without any regard to your needs, you
also CANNOT assume that you are getting random samples from their database.
Point out to your boss that this may be a serious handicap. This is not a
'the analyst can handle it' problem. This is a BUSINESS CASE problem. This
is potentially a 'your boss needs to take it to his bosses' kind of problem.
Find out how the vendor is drawing these samples. I mean, EXACTLY how they
are drawing the samples, including the actual code used. There is a chance
the problem you are seeing is really due to the way they drew the samples.
Perhaps you need to hire someone to meet with the vendor to discuss
sampling issues and appropriate sampling methodologies. People can lose a
lot of money just because crummy, not-actually-random samples are being
mistakenly assumed to be okay.
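
For comparison when you look at the vendor's code, here is a minimal sketch
of what a plain simple random control draw could look like. It assumes the
control is taken from the FULL prospect list before any model-based
selection, which is what the original poster wanted and not what the vendor
did. The function name, the ID range, and the seed are hypothetical, chosen
only for illustration.

```python
import random

def split_control(prospect_ids, control_fraction=0.10, seed=12345):
    """Draw a simple random control group from the full prospect list.

    Returns (control, remainder). The seed is fixed only so that the draw
    can be reproduced and audited later; the value itself is arbitrary.
    """
    rng = random.Random(seed)
    ids = list(prospect_ids)
    n_control = round(len(ids) * control_fraction)
    # Sampling without replacement: every prospect has the same chance
    # of landing in the control group, regardless of model score.
    control = set(rng.sample(ids, n_control))
    remainder = [pid for pid in ids if pid not in control]
    return control, remainder

# Hypothetical IDs 0..999,999 standing in for the 1 million prospects.
control, remainder = split_control(range(1_000_000))
print(len(control), len(remainder))
```

If the vendor's procedure cannot be written down this simply (for example,
if it takes the first N records of a sorted file), that is worth raising.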
Your model is presumably picking the people most likely to respond to a
you may be picking the people most likely to buy, period. In which case, I
say that your model has succeeded. It's just that the marketing campaign
failed to improve the response rate. But your model may not be able to pick
information out of the available data.
I cannot tell whether your 0.26% is statistically different from your 0.32%.
not as easy as pretending that you have a binomial distribution with
having the same likelihood of responding. You have a complex model, where
the probability of responding is in fact a variable itself, and that
variable is driven by a host of factors, such as those in your model. You
might think of it as a mixture of many beta-binomial distributions, just for
your own
So 0.32% may NOT be statistically different from 0.26% here. It's impossible to
without access to stuff which your boss would probably put under an NDA
before letting it out of your office.
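
For reference only, this is roughly what the too-simple calculation would
look like: a pooled two-proportion z-test that pretends every prospect has
the same chance of responding. The group sizes and responder counts below
are HYPOTHETICAL (the post gives rates, not counts); they assume 400,000
prospects in DECILE7 through DECILE10 with 10% held out as control.

```python
import math

def two_proportion_z(resp_a, n_a, resp_b, n_b):
    """Naive pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = resp_a / n_a
    p_b = resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts chosen to match the quoted rates, not real data.
mailed_n, mailed_resp = 360_000, 900      # 0.25% response
control_n, control_resp = 40_000, 128     # 0.32% response

z, p_value = two_proportion_z(mailed_resp, mailed_n, control_resp, control_n)
print(f"z = {z:.2f}, naive two-sided p = {p_value:.4f}")
```

Because this treats response probability as one fixed number, it will tend
to understate the real variability when the probabilities differ from
prospect to prospect, so whatever it reports is a baseline and not a verdict.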
Any model like you have built is only based on available data. So you
expect to be able to extrapolate to something like 'what would happen under
an entirely new marketing campaign that these people have never seen
Depending on the data in the vendor's database, you may be looking at
that someone would buy, or likelihood that someone would respond to a
or likelihood that someone would respond to a more involved marketing
or a mixture of these. So you may be trying to extrapolate farther than the
want to go.
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330