LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2005, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 24 May 2005 17:13:42 -0700
Reply-To:   cassell.david@EPAMAIL.EPA.GOV
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:   Re: proc mixed model with ar(1) processes inter and intra blocks
In-Reply-To:   <20050524234352.21859.qmail@web32202.mail.mud.yahoo.com>
Content-type:   text/plain; charset=US-ASCII

Dale McLerran <stringplayer_2@yahoo.com> replied: > I still must disagree. If you were to allow a burn-in of, say, > 50 iterations before you start outputting u or e values, then > your approach would generate data which follow an AR(1) process. > Note that your code produces > > u0 ~ N(0,1), > > /* First iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2) > output u1 > assign u0=u1 > > /* Second iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2+rho**4) > output u1 > assign u0=u1 > > /* Third iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2+rho**4+rho**6) > output u1 > assign u0=u1 > > ... > > /* K-th iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2+rho**4+rho**6+...+rho**(2K)) > output u1 > assign u0=u1 > > > The difference between variance of u1 at iteration m and variance > of u1 at iteration m+1 is > > V(u1{m+1}) - V(u1{m}) = rho**(2*(m+1)) > > > For m sufficiently large, the difference in variances does become > negligible. However, for the first two iterations (m=1), the > difference in variance is rho**4. That is not small unless > rho is near zero. The following code shows the variance function > by iteration for rho=0.7. > > data var; > rho=0.7; > V=1; > do iter=1 to 50; > V = 1 + (rho**2)*V; > output; > end; > run; > > proc print data=var; > by rho; > id rho; > var iter V; > format V 10.8; > run; > > > The variance approaches a stable value by about the 5th > iteration. However, for iterations 1 through 3, the variance > is quite far from the stable value. > > Now, to demonstrate the validity of your AR(1) generation process > you use the code > > data test; > seed=1234579; u0 = rannor(seed); > do iter=1 to 100000; > /* u{t} = rho*u{t-1} + err */ > u1 = 0.7*u0 + rannor(seed); > output; > u0=u1; > end; > run; > title "Covariance matrix: errors constructed as u{t}=rho*u{t-1} + > err"; > > proc corr data=test cov; > var u:; > run; > > > This is a very long process, and certainly the variance is stable > for most of the 100000 iterations. If you were using a long > sequence like this, then I would have little concern with assuming > that ALL the data follow the same AR(1) process. However, you have > relatively short processes (20 or 40 iterations). The first few > terms of a short process could have strong effect on your parameter > estimates. > > Note that the simulation which I previously posted shows the > variance structure generated by your data generation process > through five iterations. That is, the generation model > > u0 = rannor(seed); > u1 = 0.7*u0 + rannor(seed); > u2 = 0.7*u1 + rannor(seed); > u3 = 0.7*u2 + rannor(seed); > u4 = 0.7*u3 + rannor(seed); > u5 = 0.7*u4 + rannor(seed); > > is exactly the same as > > u0~N(0,1) > > /* First iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2) > output u1 > assign u0=u1 > > /* Second iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2+rho**4) > output u1 > assign u0=u1 > > ... > > /* Fifth iteration */ > u1=rho*u0 + err > err~N(0,1) > u1~N(0,1+rho**2+rho**4+rho**6+rho**8+rho**10) > output u1 > assign u0=u1 > > > It most certainly does demonstrate that for the first few > iterations, your process is NOT AR(1). But the alternative > process in which we have u{i}=rho*u{i-1} + sqrt(1-rho*rho)*err > DOES FOLLOW THE AR(1) PROCESS FROM ITERATION 1. > > IF YOU WANT YOUR DATA GENERATION MODEL TO BE AR(1) FOR ALL > OBSERVATIONS, THEN YOU NEED TO MODIFY YOUR CODE. As I stated > previously, one way to modify your code so that all the data > follow an AR(1) process is to allow a burn-in period of some M > iterations before you start accepting data as following the > AR(1) process. Better yet is to use > > u0 = err{0}; > do iter=1 to n; > u1 = rho*u0 + (1-rho*rho)*err{iter}; > output; > u0=u1; > end; > > where err{j}~N(0,V). This follows the AR(1) process from the > very first iteration.

I'm going to side with Dale on this one. Mostly. :-)

I agree with Dale that if you want the generated stream to be correct from the first iteration, you need to do this:

u0 = rannor(seed); do i=1 to k; u1 = rho*u0 + sqrt(1-rho*rho)*rannor(seed); output; u0 = u1; end;

This is the formulation that I learned in intro time series courses, back when dinosaurs still ruled the earth. Shiling is writing the time series in a traditional manner, using the duality of AR processes as infinite-order MA processes. But that's not particularly efficient for process generation. It *is* really useful once you start using the backshift operator and you want to do a little theory. I think that Dale's formulation will make the generation process a lot easier once Shiling moves on up to more complex time series models, with higher-order ARIMA(p,d,q) models.

But the idea of 'burn-in' is not new. It's a fundamental part of some statistical tools. The Gibbs sampler comes to mind, for one. So Shiling could make this work. I would recommend a burn-in period of at least M=10 for this case, although the computer cost is so small that there's really no point in not using a burn-in period of, say, M=1000. If Shiling builds a much more complex time series model, he may want to use an M of 100 or so, just to address any long-term hysteresis of his models.

So here's my version of Shiling's code:

u0 = rannor(seed); M = 1000; /* just pick something big enough */ do i=1 to k+M; u1 = rho*u0 + rannor(seed); if i > M then output; u0 = u1; end;

With this sort of minor mod, Shiling's code will do just what he wants.

Just sticking my nose in, David -- David Cassell, CSC Cassell.David@epa.gov Senior computing specialist mathematical statistician


Back to: Top of message | Previous page | Main SAS-L page