Date: Mon, 27 Dec 2004 17:29:18 -0500
Reply-To: Jing Cheng <bo_flying@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jing Cheng <bo_flying@YAHOO.COM>
Subject: negative binomial diagnostic
Hi, I have a few questions and would appreciate suggestions and second
opinions.
I am modeling the number of crashes on each road segment as a function of
road properties: Y = crash count, X = road length, traffic volume, shoulder
length, and so on.
Q1: GEE or taking the average?
I have 4 years of data for each road segment, so I can either use Y = average
count over the 4 years, or treat the years as repeated measures and fit with
GEE. I thought GEE would be the better approach, but the data set will then
contain more zeros than it would after averaging, because a segment can have
0 crashes in a particular year. In particular, if I model a subset of the
crashes (fatal crashes), there are excessive zeros.
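To put rough numbers on the extra zeros in Q1, here is a minimal sketch. It
assumes, purely for illustration, that yearly counts on a segment are
independent Poisson with the same mean; real segments will be correlated
across years and overdispersed, which is the reason for considering GEE and
the negative binomial in the first place. The means are hypothetical, not
taken from my data.

```python
import math

def poisson_zero_prob(mu, years=1):
    """P(total count == 0) when each of `years` independent years is Poisson(mu)."""
    return math.exp(-mu * years)

# Hypothetical yearly mean crash counts per segment (illustration only).
for mu in (0.1, 0.5, 1.0):
    one_year = poisson_zero_prob(mu)
    four_years = poisson_zero_prob(mu, years=4)
    print(f"mu={mu}: P(0 in one year)={one_year:.3f}, "
          f"P(0 in all four years)={four_years:.3f}")
```

The 4-year average is zero only when all four yearly counts are zero, so the
averaged response always has a smaller share of zeros than the yearly one.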
Q2: curvature in residual plot
The residual plot shows randomly scattered points plus 4 distinct curves. Each
curve seems to correspond to one observed count value, i.e. one curve contains
all points with Y=0, the second all points with Y=1, and so on.
I guess this is due to the data being counts, but I am not sure whether it is
a problem.
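The curves in Q2 can be reproduced with a small sketch. It assumes Pearson
residuals from a negative binomial model with variance mu + k*mu^2; the
fitted means and the dispersion k below are hypothetical. For a fixed
observed count y the residual depends on the fitted mean alone, so all
points with the same y fall on one smooth decreasing curve.

```python
import math

def nb_pearson_residual(y, mu, k):
    """Pearson residual for a negative binomial with variance mu + k*mu**2."""
    return (y - mu) / math.sqrt(mu + k * mu * mu)

# Hypothetical fitted means and dispersion (illustration only).
fitted = [0.2, 0.5, 1.0, 2.0, 4.0]
k = 0.5

# One curve per observed count value: residual as a function of fitted mean.
curves = {y: [nb_pearson_residual(y, mu, k) for mu in fitted] for y in range(4)}
for y, residuals in curves.items():
    print(f"Y={y}:", [round(r, 2) for r in residuals])
```

Four observed count values (0, 1, 2, 3) give four curves, which matches the
pattern in the plot.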
Q3: checkpoints for diagnostics
What I did:
1) check whether deviance/df is close to 1
2) residual plot
3) plot the CDFs of the predicted and observed values and see how close they are
4) Pseudo R^2 = 1 - Model Deviance / Null Deviance
              = (Null Deviance - Model Deviance) / Null Deviance
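Checks 1) and 4) can be sketched as follows. The sketch assumes the usual
negative binomial deviance with variance mu + k*mu^2 and, as a
simplification, keeps the dispersion k fixed at the same value for the null
and fitted models (software would normally re-estimate it). The counts,
fitted means, k and the number of parameters are hypothetical.

```python
import math

def nb_deviance(y, mu, k):
    """Negative binomial deviance for counts y, fitted means mu, dispersion k."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = -(yi + 1.0 / k) * math.log((1.0 + k * yi) / (1.0 + k * mi))
        if yi > 0:  # y*log(y/mu) is taken as 0 when y == 0
            term += yi * math.log(yi / mi)
        total += 2.0 * term
    return total

def deviance_pseudo_r2(y, mu, k):
    """1 - model deviance / null deviance; null model fits the sample mean."""
    ybar = sum(y) / len(y)
    null_dev = nb_deviance(y, [ybar] * len(y), k)
    return 1.0 - nb_deviance(y, mu, k) / null_dev

# Hypothetical data (illustration only).
y = [0, 0, 1, 0, 2, 3, 0, 1]
mu = [0.3, 0.4, 0.8, 0.5, 1.6, 2.2, 0.4, 0.8]
k = 0.5
n_params = 2  # assumed: intercept plus one covariate

model_dev = nb_deviance(y, mu, k)
print("deviance/df:", model_dev / (len(y) - n_params))
print("pseudo R^2:", deviance_pseudo_r2(y, mu, k))
```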
It seems that most people do not show their diagnostics for Poisson and NB
models in their reports. Is this just because the diagnostics are not that
crucial, so people do not care?
My pseudo R^2 is low at the moment, which may be due to the random nature of
accident data. I am not sure what else I can do to improve it.
Thank you very much and happy new year to you all.
Jing Cheng
Purdue University