Date: Fri, 8 May 1998 00:08:36 -0700
Reply-To: Andrew James Llwellyn Cary <ajlcary@CARYCONSULTING.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Andrew James Llwellyn Cary <ajlcary@CARYCONSULTING.COM>
Organization: Cary Consulting Services
Subject: Re: Simulation Problem

Use PROC PLAN to generate your random samples without replacement, and then
use PROC UNIVARIATE to compute the sample medians.

The code below generates 5000 samples of 35 drawn from 200 legislators,
then calculates a median for each sample and outputs it to a dataset.
On my P166 w/48MB RAM this took just under 10 seconds to complete.
Specify a seed to allow reproducibility.
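
%LET SEED = 12345;

/* Draw 5000 samples, each 35 distinct member IDs out of 1..200 */
PROC PLAN SEED=&SEED;
  FACTORS
    sample  = 5000 ordered
    memb_id = 35 of 200 random
    / noprint;
  output out=samples;
run;

/* Median per sample. memb_id is only a stand-in here; once the legislator
   data is merged in, use your analysis variable instead. */
proc univariate data=samples noprint;
  by sample;
  var memb_id;
  output out=sampmedn median=median;
run;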
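
As a quick sanity check on the PROC PLAN output, something like the
following sketch could confirm that every sample holds the intended 35
members. It is untested, and the dataset name SAMPSIZE is made up for
illustration; COUNT is the variable PROC FREQ writes to its OUT= dataset.

```sas
/* Count rows per sample in the PROC PLAN output */
proc freq data=samples noprint;
  tables sample / out=sampsize;   /* COUNT = rows (members) in each sample */
run;

/* Min and max of COUNT should both equal the intended sample size (35) */
proc means data=sampsize min max;
  var count;
run;
```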

The samples dataset can be merged with your list of legislators to provide
the draws for each sample. Assuming your legislators are described in a SAS
dataset named "legislat" that carries a MEMB_ID variable numbered 1 to 200,
this can be done handily in either a data step or PROC SQL. The data step
and PROC SQL code below is untested.
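
PROC SORT DATA=SAMPLES;
  BY MEMB_ID;
RUN;
PROC SORT DATA=LEGISLAT;
  BY MEMB_ID;
RUN;

/* Many-to-one merge: keep only rows that belong to a drawn sample */
DATA SAMPLES2;
  MERGE SAMPLES(IN=INSAMP) LEGISLAT;
  BY MEMB_ID;
  IF INSAMP;
RUN;
PROC SORT DATA=SAMPLES2;
  BY SAMPLE;
RUN;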

or the equivalent PROC SQL statements:

PROC SQL;
  CREATE TABLE SAMPLES2 AS
    SELECT L.*, S.SAMPLE
    FROM SAMPLES AS S, LEGISLAT AS L
    WHERE L.MEMB_ID = S.MEMB_ID
    ORDER BY SAMPLE, MEMB_ID;
QUIT;
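
Once SAMPLES2 exists and is sorted by SAMPLE, the medians of the real
analysis variable can be computed the same way as above. A minimal sketch,
also untested: it assumes LEGISLAT contains the variable VAR1 (the name used
in the original post), and the names SIMMEDN and SIM_MED are made up for
illustration.

```sas
/* One median of VAR1 per simulated committee.
   Assumes SAMPLES2 is sorted by SAMPLE and contains VAR1. */
proc univariate data=samples2 noprint;
  by sample;
  var var1;
  output out=simmedn median=sim_med;
run;
```

SIMMEDN then holds one row per simulated committee, which is the
distribution to compare the actual committee median against.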
-----
Andrew JL Cary,
Chief Curmudgeon
Cary Consulting Services, Newark CA 94560
http://www.caryconsulting.com
Garry Young wrote in message <355135DE.61A@showme.missouri.edu>...
>I'm working on a difference between medians problem where I want to
>compare the median of an actual legislative committee with a series of
>simulated committees drawn randomly from the larger legislature. So
>the basic problem is to create a series of randomly drawn committees.
>As shown below I did this by assigning each individual in the
>legislature a random number, sort, then take obs=35 (the number of
>members on the committee). Unfortunately I've come across two problems:
>
>(1) The program is slow;
>
>(2) The program always uses up all memory -- thus locking up the machine
>-- by about the 700th iteration.
>
>Regarding (1) each iteration takes about 2 seconds. I need to do 5000
>or so iterations and about twenty different simulations so this is a
>problem. Of course, until it can be solved, problem (2) makes problem
>(1) a moot point.
>
>I'm doing this on a Pentium 166 with 32 meg. I tried a Proc Dataset
>Delete;. This helped some but not alot. I also tried running it in
>batch mode with nolog. Problem (2) struck at about the same time.
>
>Any suggestions would be appreciated. Thanks.
>
>The basics of the code I'm using follows:
>
>data one;
>infile statement
>input statement
>
>%Macro Monte;
> %Let I = 1;
> %Do %While (&I<1000);
>
>data M_two;
>set one;
>x = ranuni(0);
>proc sort; by x;
>run;
>
>data M_three;
>set M_two (obs = 35);
>proc univariate data = M_three noprint; var var1;
> output out = D_med median=D_Med;
>run;
>
>
>data M_four;
>proc append base = sasdata.Ag90a data = D_Med;
>run;
>
>%Let I = (&I + 1);
> %End ;
>
>
>%Mend Monte;
>
>
>data two;
>set one;
>x = ranuni(0);
>proc sort; by x;
>run;
>
>data three;
>set two (obs = 35);
>proc univariate data = three noprint; var var1;
> output out = D_med1 median=D_Med;
>run;
>
>libname statement;
>
>DATA sasdata.Ag90a;
>set D_Med1 ;
>run;
>
>data run;
>%Monte;
>
>data finish;
>set sasdata.Ag90a;
>/* proc sort; by median; */
>proc print;
>run;
>
>
>
>--
>------------------------------------------------------------------
>Garry Young Phone: (573) 882-0056
>Assistant Professor FAX: (573) 884-5131
>Dept. of Political Science Email: polsgy@showme.missouri.edu
>113 Professional Bldg. Web: http://www.missouri.edu/~polsgy
>University of Missouri
>Columbia, MO 65211
>------------------------------------------------------------------