LISTSERV at the University of Georgia
Date:   Sun, 30 Aug 2009 21:19:19 -0700
Reply-To:   Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:   Re: Fastest Steps for Simulating: Anderson-Darling Goodness of Fit test for Non-typical distn
In-Reply-To:   <6eca73440908302014p298b9358l52ef2e829617d858@mail.gmail.com>
Content-Type:   text/plain; charset=iso-8859-1

One million times? Why? I really think that is overkill. I would try to cover more parameter combinations if it were me.

But you should be able to use a single data step to generate A-D statistics for all of your parameter combinations. The code below should be pretty efficient.

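%macro AD(N=);
  /* Draw a sample of size &N. */
  do i=1 to &N;
    /* The next line needs completion with the appropriate G */
    _x&N{i} = G(ranuni(6923479), S);
  end;

  /* Sort so that _x&N{i} is the i-th order statistic. */
  call sortn(of _x&N{*});
  mu = mean(of x1-x&N);
  var = var(of x1-x&N);
  sd = sqrt(var);

  /* Accumulate the A-D sum in _sum. S holds the distribution */
  /* parameter of the outer loop and must not be overwritten. */
  _sum = 0;
  do i=1 to &N;
    _sum + ((2*i - 1)/&N) *
           (log(cdf('normal',_x&N{i},mu,sd)) +
            log(1 - cdf('normal',_x&N{&N+1-i},mu,sd)));
  end;
  AD = -&N - _sum;
  output AD_&N;
%mend;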
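For reference, the second loop in the macro accumulates the sum in the usual Anderson-Darling statistic. The notation here is mine, not part of the code: F is the normal CDF with the estimated mean mu and standard deviation sd, and x_(i) is the i-th smallest value in the sample of size N.

```latex
A^2 = -N - \frac{1}{N} \sum_{i=1}^{N} (2i - 1)
      \left[ \ln F\!\left(x_{(i)}\right)
           + \ln\!\left(1 - F\!\left(x_{(N+1-i)}\right)\right) \right]
```

This is why the sample has to be sorted before the loop runs: the i-th term pairs the i-th smallest value with the i-th largest.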

/* Generate 10000 samples of each size (N=50, 100, 200, 300) for */
/* each value of S and compute the AD statistic for each sample. */
data AD_50 AD_100 AD_200 AD_300;
  /* The four arrays overlay the same variables x1-x300. */
  array _x50  {50}  x1-x50;
  array _x100 {100} x1-x100;
  array _x200 {200} x1-x200;
  array _x300 {300} x1-x300;
  /* The next line needs correct specification. A comma-separated */
  /* list of the three s values, as in  do S = <s1>, <s2>, <s3>;  */
  /* is one valid form.                                           */
  do S={S1 S2 S3};
    do rep=1 to 10000;
      %AD(N=50)
      %AD(N=100)
      %AD(N=200)
      %AD(N=300)
    end;
  end;
  keep S AD;
run;

/* Determine probability of observed data */
/* using simulated data AD distribution.  */
proc sort data=AD_50;
  by S AD;
run;

proc sort data=AD_100;
  by S AD;
run;

proc sort data=AD_200;
  by S AD;
run;

proc sort data=AD_300;
  by S AD;
run;

The above is untested code and should be tested with a small number of replicates before using it for a final simulation. Also, there will obviously need to be some final step where you determine the quantiles of the AD statistics.
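
As a sketch of that final step (equally untested), PROC UNIVARIATE with a BY statement can write the percentiles of AD for each value of S to a data set. The output data set name AD_50_q, the variable prefix AD_p, and the percentile points 90, 95 and 99 are assumptions chosen for illustration; substitute whatever critical levels you need, and repeat for AD_100, AD_200 and AD_300.

```sas
/* Sketch only: AD_50_q, the AD_p prefix, and the 90/95/99 percentile */
/* points are illustrative assumptions, not from the original post.   */
proc univariate data=AD_50 noprint;
  by S;            /* AD_50 must already be sorted by S */
  var AD;
  output out=AD_50_q pctlpts=90 95 99 pctlpre=AD_p;
run;
```

With these settings the output data set has one row per value of S and the variables AD_p90, AD_p95 and AD_p99.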

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------

--- On Sun, 8/30/09, OR Stats <stats112@GMAIL.COM> wrote:

> From: OR Stats <stats112@GMAIL.COM>
> Subject: Fastest Steps for Simulating: Anderson-Darling Goodness of Fit test for Non-typical distn
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Sunday, August 30, 2009, 8:14 PM
>
> This is good. I am ready now to run a large scale simulation. What that
> means is that I want to compute the goodness of fit statistic for (M x
> S) groups and n times each group.
>
> Group defined by (m,s); S = s1 s2 s3 and M = 50 100 200 300. Basically,
> M is my different sample sizes for which I am testing their fit to
> function G(random#,s) (i.e., inverse distribution). I would like to run
> each group 1 million times. For each s group, by generating random
> numbers just by 300 x 1million times, I'll have enough simulated data
> y(s) to use for the largest and smaller sample sizes.
>
> My final column space would look like
> i ranuni y_s1=G(ranuni,s1) y_s2=G(ranuni,s2) y_s3=G(ranuni,s3)
> 1
> .
> .
> .
> m
> All rows in the above table would be used to calculate function f_s1,
> f_s2, f_s3 (i.e., AD). This last step is repeated 1 Million times.
>
> Can we do this in one to two DATA STEPS? Which syntax would be fastest
> since we have to generate 300 Million random numbers, from which we would
> split the sample by 1 Million disjoint sets that we would then compute a
> statistic 1 Million times using 50, 100, 200, and 300 rows of data at
> each iteration for three different values of s (s1, s2, s3)?
>
> Thank Q!

