Date: Fri, 25 Jul 2008 10:52:42 -0500
From: Robin R High <rhigh@UNMC.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
Subject: Re: Long Book - Zip Model

Jeff,
If you look at the correlation matrix of the parameter estimates, you'll
see that phd is highly correlated with the intercept
(ABS(corr(Intercept, PHD)) > .8). When you remove phd from the model, all
the other parameter estimates are virtually the same, with or without the
'2nd' intercept.
First, remove phd from both the logit and Poisson portions and add the
n_P term:
                         Standard
Parameter    Estimate       Error
bp_0           -0.576    0.3300
bp_fem          0.108    0.2802
bp_mar         -0.354    0.3175
bp_kid5         0.219    0.1959
bp_ment        -0.134    0.04231
bll_0          -0.689    0.03515
bll_fem        -0.209    0.06341
bll_mar         0.105    0.07082
bll_kid5       -0.143    0.04735
bll_ment        0.018    0.002216
n_P             1.311    0.03515
Now run it without phd and without the n_P term:
                         Standard
Parameter    Estimate       Error
bp_0           -0.576    0.3300
bp_fem          0.109    0.2802
bp_mar         -0.354    0.3175
bp_kid5         0.219    0.1959
bp_ment        -0.134    0.04231
bll_0           0.621    0.07029
bll_fem        -0.209    0.06341
bll_mar         0.105    0.07082
bll_kid5       -0.143    0.04735
bll_ment        0.018    0.002216
The moral is to always be aware of possible correlations among the
parameter estimates when running NLMIXED or, in this case, of a variable in
the X data that does not "vary" much (in relation to the intercept and
other terms).
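
One way to inspect those correlations is the CORR option on the PROC
NLMIXED statement, which prints the correlation matrix of the parameter
estimates. The following is only a sketch of the reduced model (no phd, no
n_P), not the code that produced the tables. It assumes that the bp_
parameters belong to the logit part and the bll_ parameters to the Poisson
part, and it borrows the data set name (data), the variable names, and the
zero starting values from Jeff's posted code.

```sas
/* Sketch only: the parameter-to-part mapping, data set name, and      */
/* starting values are assumptions taken from the thread.              */
/* CORR prints the correlation matrix of the parameter estimates;      */
/* COV prints the covariance matrix.                                   */
proc nlmixed data=data corr cov;
  parms bp_0=0 bp_fem=0 bp_mar=0 bp_kid5=0 bp_ment=0
        bll_0=0 bll_fem=0 bll_mar=0 bll_kid5=0 bll_ment=0;

  /* Logit part: probability of an excess zero */
  eta0 = bp_0 + bp_fem*fem + bp_mar*mar + bp_kid5*kid5 + bp_ment*ment;
  p0   = 1 / (1 + exp(-eta0));

  /* Poisson part: log of the mean */
  etap = bll_0 + bll_fem*fem + bll_mar*mar + bll_kid5*kid5 + bll_ment*ment;
  mu   = exp(etap);

  /* ZIP log-likelihood; lgamma(art + 1) equals log(fact(art)) */
  if art = 0 then ll = log(p0 + (1 - p0)*exp(-mu));
  else ll = log(1 - p0) - mu + art*etap - lgamma(art + 1);

  model art ~ general(ll);
run;
```

Large off-diagonal entries involving an intercept in the CORR output would
point to the kind of problem described above.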
Robin High
UNMC
"Jeff Allard" <jeffrey.m.allard@gmail.com>
07/25/2008 10:06 AM
To
"Robin R High" <rhigh@unmc.edu>
cc
SAS-L@listserv.uga.edu
Subject
Re: Long Book - Zip Model
Hi Robin-
Thanks (I think :-) ). Anyone have insight into why this works?
And Robin - what did you mean in regards to computing the log-likelihood
directly?
Thanks for the help!
Jeff
2008/7/25 Robin R High <rhigh@unmc.edu>:
Jeff,
I can't explain why the following modification "works", yet the results
match the output: if you add "2" intercept terms to the Poisson part (that
is, add n_P to the intercept), the remaining coefficients reproduce the
results found in Long's table for the Poisson example:
etap = (b0 +n_P) + b1 * fem + b2 * mar + b3 * kid5 + b4 * phd + b5 * ment;
You can then produce the estimate of the Poisson intercept with:
ESTIMATE 'B0' b0 + n_P;
The parameter and additional estimates (in my linear predictor coding) are
then:
                         Standard
Parameter    Estimate       Error
bp_0           -0.577    0.5094
bp_fem          0.110    0.2801
bp_mar         -0.354    0.3176
bp_kid5         0.217    0.1965
bp_phd          0.001    0.1453
bp_ment        -0.134    0.04525
bll_0          -0.680    0.06065
bll_fem        -0.209    0.06340
bll_mar         0.104    0.07111
bll_kid5       -0.143    0.04743
bll_phd        -0.006    0.03101
bll_ment        0.018    0.002295
n_P             1.320    0.06065
Additional Estimates
                         Standard
Label        Estimate       Error
INT: B0        0.6408    0.1213
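
A quick arithmetic check on the output above, assuming bll_0 is the b0 of
the etap statement: b0 and n_P enter the linear predictor only through
their sum, so the likelihood cannot separate them and only the sum is
estimable. The reported numbers are consistent with that reading:

```latex
\[
\widehat{b_0} + \widehat{n_P} = -0.680 + 1.320 = 0.640 \approx 0.6408
\quad \text{(the ESTIMATE row)}
\]
\[
\mathrm{SE}(\widehat{b_0}) + \mathrm{SE}(\widehat{n_P})
  = 0.06065 + 0.06065 = 0.1213
\]
```

The standard error of the sum equals the sum of the two standard errors,
which is what one would expect if the two estimates were perfectly
correlated.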
A similar trick also gets the coefficients for the neg bin example in the
next columns, though they aren't as close.
And I would compute the log-likelihood directly, whenever possible, as
Kevin suggested.
Robin High
UNMC
Jeff <jeffrey.m.allard@GMAIL.COM>
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
07/25/2008 09:30 AM
Please respond to
Jeff <jeffrey.m.allard@GMAIL.COM>
To
SAS-L@LISTSERV.UGA.EDU
cc
Subject
Re: Long Book - Zip Model
On Jul 25, 9:54 am, citam.s...@GMAIL.COM (Kevin Viel) wrote:
> On Fri, 25 Jul 2008 05:01:14 -0700, Jeff <jeffrey.m.all...@GMAIL.COM> wrote:
> >I am trying to match the output in J. Scott Long's book Regression
> >Models for Categorical and Limited Dependent Variables, page 246.
> >Specifically, I am trying to fit a ZIP model using NLMIXED. I get
> >very different coefficient estimates. Can anyone help with why? Am I
> >specifying the ZIP incorrectly?
>
> >Here is the code I am running:
>
> >/*zip*/
> >proc nlmixed data = data;
> >parms a0 = 0 a1 = 0 a2 = 0 a3 = 0 a4 = 0 a5 = 0
> >b0 = 0 b1 = 0 b2 = 0 b3 = 0 b4 = 0 b5 = 0 ;
> >eta0 = a0 + a1 * fem + a2 * mar + a3 * kid5 + a4 * phd + a5 * ment;
>
> >exp_eta0 = exp(eta0);
> >p0 = exp_eta0 / (1 + exp_eta0);
>
> >etap =b0 + b1 * fem + b2 * mar + b3 * kid5 + b4 * phd + b5 * ment;
> >exp_etap = exp(etap);
>
> >if art = 0 then ll = log(p0 + (1 - p0) * exp(-exp_etap));
> >else ll = log((1-p0)*(exp(-exp_etap)*exp_etap**art)/fact(art));
>
> >model art ~ general(ll);
>
> >predict exp_etap out = zip_out1 ;
> >predict p0 out = zip_out2 ;
> >run;
>
> <snipped>
>
> >Any insight is appreciated - I can't figure out what I'm doing wrong.
> >thanks!
>
> A link to the data would be helpful, if it exists. Your code looks
> correct, so my next thought would be how the covariates are coded. Are
> they yes/no (0/1)? I assume that the issue is not a simple matter of
> replicating the referent groups. The next thought would be to be sure
> that you are consistent with the meaning of p0. That is, does Long model
> it as the probability (odds) of the datum coming from the point mass at
> zero (logit part) or from the Poisson distribution? I would think that
> this would result in reciprocals for the estimates of the a's, but
> *suspect* it would alter the estimates of the b's. Hmmm, interesting
> thing to test.
>
> Sorry I could not be of more help, but Long does not seem to have the
> data on his website and I do not have a copy of the book.
>
> Kevin