| Date: | Fri, 20 Aug 2004 14:25:33 -0400 |
| Reply-To: | Peter Flom <flom@NDRI.ORG> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Peter Flom <flom@NDRI.ORG> |
| Subject: | Statistics question - zero inflated models, skewed distributions, assumptions etc |
| Content-Type: | text/plain; charset=US-ASCII |
Not really a SAS question, but I've gotten good advice here in the past.
I am trying to model a variable that has a huge proportion of 0's and
a long right tail. I've tried fitting all sorts of models, in
particular zero-inflated Poisson and zero-inflated negative binomial
models.
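For readers less familiar with the zero-inflated Poisson mentioned above, here is a minimal sketch of its probability mass function. It mixes a point mass at zero (weight `pi`) with an ordinary Poisson with rate `lam`. The function name and the parameter values are illustrative assumptions, not estimates from the data in this post.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson.

    pi  : probability of a structural (excess) zero
    lam : rate of the Poisson component
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from the point mass or from the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# Hypothetical parameters, chosen only to show a large share of zeros.
lam, pi = 2.0, 0.8
p_zero = zip_pmf(0, lam, pi)
total = sum(zip_pmf(k, lam, pi) for k in range(100))
```

The zero-inflated negative binomial has the same mixture form, with the Poisson term swapped for a negative binomial one, which adds a dispersion parameter for the long tail.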
For all the models I've tried, the residuals are grossly nonnormal, with
HUGE values for both skewness and kurtosis. The problem is that there
are a bunch of outliers. I also tried a two-part approach: one model for
zero vs. nonzero (coded 0 vs. 1), and then one for everyone with 1 or
more. I also tried winsorizing the DV, capping the highest value at 50.
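The two data-preparation steps just described (winsorizing at 50, and splitting the outcome into a zero/nonzero part and a positive-counts part) can be sketched as follows. This is only an illustration on made-up counts; the helper names and the toy data are hypothetical, and the actual model fitting is left out.

```python
def winsorize_upper(values, cap=50):
    """Replace every value above `cap` with `cap` (upper-tail winsorizing)."""
    return [min(v, cap) for v in values]

def two_part_split(values):
    """Prepare a count outcome for a two-part analysis.

    Returns a 0/1 indicator (any event vs. none) for the first model,
    and the positive counts only for the second model.
    """
    any_event = [1 if v > 0 else 0 for v in values]
    positives = [v for v in values if v > 0]
    return any_event, positives

# Hypothetical toy counts, not the data discussed in this post.
dv = [0, 0, 0, 0, 1, 0, 2, 0, 0, 137, 0, 3, 0, 60]
capped = winsorize_upper(dv, cap=50)
any_event, positives = two_part_split(capped)
```

The indicator would go into a binary model (for example, logistic regression), and the positive counts into a model for counts of 1 or more.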
N is fairly large (at least, compared to what I'm used to) - it's about
5000.
The DV has a mean of 0.69, sd = 4.02, skew = 30.7, kurtosis = 1294 (not
a typo, kurtosis over 1000). Similar values hold for residuals from
many of the models (OK, SOME models reduce kurtosis to 400 or so... but
you get the idea).
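As a point of reference for the summary statistics quoted above, here is a minimal sketch of how skewness and excess kurtosis follow from the central moments. It uses the simple population (divide-by-n) formulas, which is an assumption; packaged routines often apply small-sample corrections, so their output can differ slightly. The toy data are hypothetical.

```python
import math

def moment_summary(values):
    """Return (mean, sd, skewness, excess kurtosis) using divide-by-n moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return mean, math.sqrt(m2), skew, excess_kurtosis

# Hypothetical toy sample: mostly zeros with one larger value.
mean, sd, skew, kurt = moment_summary([0, 0, 0, 4])
```

Because the fourth moment raises deviations to the fourth power, a handful of extreme counts in a sample that is mostly zeros is enough to push kurtosis into the hundreds or thousands.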
Any ideas?
TIA
Peter
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)