LISTSERV at the University of Georgia
Date:         Thu, 9 May 2002 13:43:39 -0700
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: Lognormal
Comments: To: "Elmaache, Hamani" <Hamani.Elmaache@CCRA-ADRC.GC.CA>
In-Reply-To:  <B92446989C5AD4119239006008C407A7011E93B0@SH0CX691>
Content-Type: text/plain; charset=us-ascii

Hamani,

It is the original variable var0 which may or may not have a zero-inflated Poisson (ZIP) distribution. That is, apart from the excess of zero values, the values of your original variable could conform to a Poisson distribution if those values were integers. Note that when you construct var_infla by assigning integer values based on some condition, var_infla will not have a Poisson distribution even though it is integer-valued. You might have a mixture of zero values and a lognormal, except that you have some negative values among your observations. Again, two questions must be raised: "How do the data arise?" and "What do you wish to do with the data?"

Just for the record, supposing that you did indeed have a ZIP model, the parameter estimates could be obtained with PROC NLMIXED. The code below generates 1000 observations from a ZIP distribution, tabulates the observations to show the high frequency of zero values, and fits a ZIP model with NLMIXED. When fitting the ZIP model, I use CONTRAST statements to test whether the parameter estimates are consistent with the true values. A similar ZILN (zero-inflated lognormal) model could, I am sure, also be constructed and estimated. You may want to consider estimating such a model, though the presence of negative values is a problem that must be accounted for.

/* Generate 1000 observations from ZIP dist with P_0=.9, lambda=20 */
data test;
  p_0 = .9;
  lambda = 20;
  seed = 1234579;
  do i=1 to 1000;
    call ranbin(seed,1,p_0,x);
    if x=1 then y=0;
    else call ranpoi(seed,lambda,y);
    output;
  end;
  keep y;
run;

/* Tabulate the data demonstrating high zero count probability */
proc freq data=test;
  tables y;
run;

/* Fit the ZIP model to estimate parameters p_0 and lambda */
/* Test whether the confidence interval for the parameters */
/* p_0 and lambda contain the true values.                 */
proc nlmixed data=test;
  parms p_0=.1 lambda=1;
  if y=0 then prob = p_0 + (1-p_0)*exp(-lambda);
  else prob = (1-p_0)*(lambda**y)*exp(-lambda)/fact(y);
  loglike = log(prob);
  model y ~ general(loglike);
  contrast "Test p_0=.9" p_0-.9;
  contrast "Test lambda=20" lambda-20;
run;
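For reference, the likelihood that the NLMIXED step above maximizes can be written out as follows. This only restates the prob and loglike statements in the code, with y_i denoting the i-th observation (notation introduced here for illustration). The mean and variance are the standard ZIP moments, evaluated at the true values used in the simulation.

```latex
\[
P(Y = y) =
\begin{cases}
  p_0 + (1 - p_0)\, e^{-\lambda}, & y = 0, \\[4pt]
  (1 - p_0)\, \dfrac{\lambda^{y} e^{-\lambda}}{y!}, & y = 1, 2, \dots
\end{cases}
\qquad
\ell(p_0, \lambda) = \sum_{i=1}^{n} \log P(Y = y_i)
\]
\[
E(Y) = (1 - p_0)\lambda, \qquad
\mathrm{Var}(Y) = (1 - p_0)\lambda\,(1 + p_0 \lambda)
\]
% With p_0 = 0.9 and lambda = 20: E(Y) = 2 and Var(Y) = 38.
```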

I think you are on the right track in considering a ZIP model, but be sure to work with the correct response. One other point to mention is that a ZIP response cannot take negative values. How do these negative values arise? If you believe that the ZIP model is appropriate for the situation, then the observations with negative values must be dropped from the data set before fitting the ZIP.
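If a zero-inflated lognormal turns out to be the better description, a rough sketch along the same lines might look like the code below. Everything here is an assumption made for illustration: the data set name ziln, the simulation values p_0=.85, mu=5, sigma=1, the starting values, and the sigma2 parameterization. The where= data set option shows one way to exclude negative responses before fitting. It has not been run against real data.

```sas
/* Sketch only: simulate from a zero-inflated lognormal (ZILN).   */
/* p_0, mu, sigma and the data set name ziln are illustrative.    */
data ziln;
  p_0 = .85;
  mu = 5;
  sigma = 1;
  seed = 1234579;
  do i=1 to 1000;
    call ranbin(seed,1,p_0,x);
    if x=1 then y=0;
    else do;
      call rannor(seed,z);       /* z ~ N(0,1)                    */
      y = exp(mu + sigma*z);     /* lognormal draw                */
    end;
    output;
  end;
  keep y;
run;

/* Fit the ZILN. The where= option drops any negative responses.  */
/* Zero contributes log(p_0); positive y contributes log(1-p_0)   */
/* plus the lognormal log density.                                */
proc nlmixed data=ziln(where=(y >= 0));
  parms p_0=.5 mu=1 sigma2=1;
  bounds 0 < p_0 < 1, sigma2 > 0;
  if y=0 then loglike = log(p_0);
  else loglike = log(1-p_0) - log(y)
               - 0.5*log(2*3.14159265358979*sigma2)
               - (log(y)-mu)**2/(2*sigma2);
  model y ~ general(loglike);
run;
```

With real data, the where= filter silently discards the negative observations, so it is worth understanding how they arise before relying on it.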

Dale

--- "Elmaache, Hamani" <Hamani.Elmaache@CCRA-ADRC.GC.CA> wrote:
>
> Hi Steve.
> What I want is to carry out a Multivariate REGRESSION where var0 is
> dependent variable. My variable is somewhat continuous.
> Given the huge point mass at zero (around 85% of zero) I thought to
> make the Zero-Inflation Poisson Regression, by categorizing my
> Variable var0 using, for example the code,
>
> data mydata;
>   set mydata;
>   if var0 <=0 then var_infla=0;
>   else if 0< var0<=200 then var_infla=1;
>   else if 100< var0<=1000 then var_infla=2;
>   else var_infla=3;
> run;
>
> I get following distribution:
>
> 0=No difference or negative 85.70 %
> 1=Small difference 8.26 %
> 2=medium difference 4.31%
> 3=large difference 1.73%
>
> But how to carry out the zero-inflated Poisson (ZIP)?
> The var_infla should have a zero-inflated Poisson distribution
>
> Pro(var_infla=y)= P_o +(1-P_o)exp(-Lamda) if y=0
>                 =(1-P_o)*(Lamda**y)*exp(-Lamda)/y! if y >0 (ie: 1,2, 3)
>
> where P_o should be proportion of zero.
> with mean and variance:
>
> E(var_infla)=(1-P_o)*Lamda=mu
> Var(var_infla)=mu + P_o/(1-P_o)mu**2.
>
> I don't know how to set up this model in SAS.
> Any comments/ ideas appreciated.
> Thanks.

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
