Date: Wed, 23 Jul 2003 23:32:46 -0400
Reply-To: Arthur Tabachneck <atabachneck@ROGERS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <atabachneck@ROGERS.COM>
Subject: Re: Odd Results with Proc Summary Missing Value assignment
In-Reply-To: <20030723234404.73629.qmail@web21109.mail.yahoo.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dale,
No need to take up the list's space, as I can easily solve my problem by
converting missing values back to zeros. I was just concerned that others
might have faced the same error found in my own data.
To quickly answer your question of what I am computing, I am analyzing (and
modeling) loss cost (i.e., the amount everyone must pay to cover all
insurance losses). Loss cost is easy to model by modeling two related
factors which have known distributions, namely frequency (how often a claim
occurs), and severity (average claim cost). Loss cost, then, is simply the
product of the two factors.
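
As a quick arithmetic illustration of that relationship, here is a minimal sketch in Python (not SAS, and not code from my actual analysis). The function name and the figures are made up purely for illustration.

```python
def loss_cost(vehicles, claims, total_loss):
    """Return (frequency, severity, loss cost) for one group of vehicles."""
    frequency = claims / vehicles      # how often a claim occurs, per vehicle
    severity = total_loss / claims     # average cost per claim
    return frequency, severity, frequency * severity

# Hypothetical figures: 1,000 vehicles, 50 claims, 200,000 in total losses
frequency, severity, cost = loss_cost(1000, 50, 200000.0)
print(frequency, severity, cost)
```

Because the claim counts cancel, the product is the same as total loss divided by number of vehicles, which is why a group with no claims should end up with a loss cost of zero rather than a missing value.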
However, in aggregating such data to even broader levels (e.g., identifying
the loss cost for SUVs given raw data of make and model, number of
vehicles, number of claims, and total loss), Proc Summary returns correct
averages for frequency (as its weight is number of vehicles) but, for
severity, returns missing values for make/model vehicles which have no
claims. Since one way to obtain the desired loss cost is to merge
the results of the two Proc Summaries and then multiply the average
frequencies and severities, an incorrect answer is obtained if the missing
values aren't first converted to zeros.
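
The effect can be sketched outside SAS. The Python below is only a rough imitation of a weighted mean that comes back missing when no weight remains. It is not Proc Summary itself, the data are hypothetical, and weighting severity by number of claims is my assumption for the example.

```python
import math

def weighted_mean(values, weights):
    """Weighted mean that skips missing values and zero weights.
    Returns NaN (standing in for a SAS missing value) if nothing is left."""
    pairs = [(v, w) for v, w in zip(values, weights)
             if not math.isnan(v) and w > 0]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        return math.nan
    return sum(v * w for v, w in pairs) / total_weight

# Hypothetical make/model rows for one group that has no claims at all
vehicles = [120, 80]
claims = [0, 0]
losses = [0.0, 0.0]

frequency_by_model = [c / v for c, v in zip(claims, vehicles)]
severity_by_model = [l / c if c > 0 else math.nan
                     for l, c in zip(losses, claims)]

group_frequency = weighted_mean(frequency_by_model, vehicles)  # weight: vehicles
group_severity = weighted_mean(severity_by_model, claims)      # weight: claims (assumed)

# Multiplying the merged averages directly propagates the missing value
naive_loss_cost = group_frequency * group_severity

# Converting the missing severity to zero first gives the intended result
fixed_severity = 0.0 if math.isnan(group_severity) else group_severity
fixed_loss_cost = group_frequency * fixed_severity

print(naive_loss_cost, fixed_loss_cost)
```

In this sketch the naive product is missing even though the group's frequency is a perfectly valid zero, while the corrected product is zero.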
In short, an easy problem to solve, but a quite misleading analysis if one
doesn't know to correct the Proc Summary results before attempting to use
the desired resulting measure.
Art