Date: Thu, 6 Jun 1996 16:30:56 GMT
Reply-To: Netnews Server <NETNEWS@AMERICAN.EDU>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Netnews Server <NETNEWS@AMERICAN.EDU>
Organization: Fred Hutchinson Cancer Research Center
Subject: Re: need HELP! with what should be a simple programming problem
In <199606060255.CAA13239@mag-net.co.uk>, John Whittington
<johnw@MAG-NET.CO.UK> writes:
>Sara, there are many ways to solve your problem (merely getting rid of the
>default library WORK. would help a lot) but the simplest is probably to use
>Proc APPEND to concatenate your datasets together into your dataset ALL, as
>you go along. The following should work for any number of trials, if you
>increase the figure in the %DO statement from 32 to whatever you want:
>
>proc datasets ; delete ALL ; run ;
>%MACRO LOOP;
> %DO T=1 %TO 32;
> DATA;
> DO I=1 TO 20;
> X1= 13 + 2*I;
> E= 0+ NORMAL(0)*1;
> Y=20 +5*X1 +E;
> OUTPUT;
> END;
> PROC REG OUTEST=S;
> MODEL Y=X1/ NOPRINT;
> OUTPUT;
> proc append out=ALL new=s ;
> %END;
>%MEND LOOP;
>
>%LOOP;
>PROC PRINT data=ALL ; run ;
>PROC MEANS;
>run ;
>
..
>2...Use of an explicit seed for your random function - e.g. NORMAL(12345678)
>would enable you to re-run the program and get the same answers each time;
>changing the seed would give you a different series of trials.
>
Agreed!!! However, you need a trick to avoid generating the same "random"
data from one trial to the next. With a fixed seed, every new data step
restarts the generator from that seed, so each trial would get an identical
stream. Use the CALL RANNOR routine instead: it returns an updated seed that
can be handed to the next data step. This is necessary across data steps,
though not within a single data step. The following code illustrates:
/* John Whittington's revised code (with specified seed). This  */
/* would generate the same random number stream for all trials. */
proc datasets ; delete ALL ; run ;
%MACRO LOOP;
  %DO T=1 %TO 32;
    DATA;
      DO I=1 TO 20;
        X1= 13 + 2*I;
        E= 0+ NORMAL(12345678)*1;
        Y=20 +5*X1 +E;
        OUTPUT;
      END;
    PROC REG OUTEST=S;
      MODEL Y=X1/ NOPRINT;
      OUTPUT;
    proc append out=ALL new=s ;
  %END;
%MEND LOOP;
%LOOP;
PROC PRINT data=ALL ; run ;
PROC MEANS;
run ;

/* CALL RANNOR and CALL SYMPUT used to control seeds so that the  */
/* random number generator is updated from one trial to the next. */
%let seed=12345678;
proc datasets ; delete ALL ; run ;
%MACRO LOOP;
  %DO T=1 %TO 32;
    DATA;
      seed = &seed;     /* Initialize seed with current macro variable. */
      DO I=1 TO 20;
        X1= 13 + 2*I;
        call rannor(seed,error);
        E= 0+ error*1;  /* This line is probably unnecessary. */
        Y=20 +5*X1 +E;
        OUTPUT;
      END;
      call symput("seed",seed);  /* Update macro variable for next trial. */
    run;
    PROC REG OUTEST=S;
      MODEL Y=X1/ NOPRINT;
      OUTPUT;
    proc append out=ALL new=s ;
  %END;
%MEND LOOP;
%LOOP;
PROC PRINT data=ALL ; run ;
PROC MEANS;
run ;

Of course, you could do as Andrew Carey has suggested: have just one
data step in which you generate all the data required for all trials,
and then use BY variable processing to obtain separate estimates for
each trial. In that case you do not need to maintain control over the
random number seed; you only need to initialize the random number
generator with a specified seed. That is satisfactory (and even
preferable) for some problems. It is best under the following conditions:
1) The product of the number of trials by the number of observations
per trial does not result in a dataset that is too large to store
and manipulate on your system.
2) You are strictly using estimation procedures which allow BY group
processing.
If either of these conditions is not met, then you must use the methods
shown above to generate your data separately for each trial.
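
For what it is worth, the single data step approach might look roughly
like the sketch below. This is my own illustration, not Andrew Carey's
actual code: the dataset names SIM and EST and the BY variable TRIAL are
made up for the example, and the trial count and model are simply carried
over from the program above.

```sas
/* Sketch only: SIM, EST and TRIAL are illustrative names that do  */
/* not appear in the original program.                             */
data sim;
  do trial = 1 to 32;
    do i = 1 to 20;
      x1 = 13 + 2*i;
      e  = normal(12345678);  /* seed initializes the stream once  */
      y  = 20 + 5*x1 + e;
      output;
    end;
  end;
run;

/* One regression per trial; OUTEST= collects one row of estimates */
/* for each BY group. SIM is already in TRIAL order as generated.  */
proc reg data=sim outest=est noprint;
  by trial;
  model y = x1;
run;
quit;

proc print data=est; run;
proc means data=est; run;
```

Because all 640 observations come from one data step, the random number
stream simply continues from one trial to the next, so no CALL SYMPUT
bookkeeping is needed.
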
Dale McLerran
Fred Hutchinson Cancer Research Center
1124 Columbia Street
Seattle, WA 98104