On Apr 24, 8:29 am, ben.pow...@CLA.CO.UK wrote:
> Something like:-
>
> data a;
> do j=1 to 4;
> do i=1 to 30;
> id=1109+i;
> x=ranuni(1234);
> output;
> end;
> end;
> run;
>
> proc sort data = a;by j x;run;
>
> data a2;
> set a;
> by j;
> if first.j then count=0;
> count+1;
> if count<5;
> run;
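Although the traditional approach always works, PROC SURVEYSELECT is
better both for stratified sampling and for speed. In the log below,
drawing 1000 of 200000 observations takes about 0.3 seconds real time
with the sort-based DATA step route, 0.56 seconds with PROC SQL, and
0.03 seconds with PROC SURVEYSELECT.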
1774 data t1;
1775 do i = 1 to 200000;
1776 output;
1777 end;
1778 run;
NOTE: The data set WORK.T1 has 200000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
1779
1780 %let n=1000;
1781
1782 data t2/view=t2;
1783 set t1;
1784 ran=ranuni(1234567);
1785 run;
NOTE: DATA STEP view saved on file WORK.T2.
NOTE: A stored DATA STEP view cannot run under a different operating
system.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1786
1787 proc sort data=t2 out=t3;by ran;run;
NOTE: There were 200000 observations read from the data set WORK.T2.
NOTE: View WORK.T2.VIEW used (Total process time):
real time 0.21 seconds
cpu time 0.40 seconds
NOTE: There were 200000 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 200000 observations and 2 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.31 seconds
cpu time 0.59 seconds
1788
1789 data t3;
1790 set t3(obs=&n);
1791 drop ran;
1792 run;
NOTE: There were 1000 observations read from the data set WORK.T3.
NOTE: The data set WORK.T3 has 1000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1793
1794 proc sql OUTOBS=&n;
1795 create table t4(drop=ran) as
1796 select *, ranuni(1234567) as ran
1797 from t1
1798 order by ran
1799 ;
WARNING: Statement terminated early due to OUTOBS=1000 option.
NOTE: Table WORK.T4 created, with 1000 rows and 1 columns.
1800 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.56 seconds
cpu time 0.40 seconds
1801
1802 proc surveyselect data=t1 out=t5 seed=1234567 n=&n;
1803 run;
NOTE: The data set WORK.T5 has 1000 observations and 1 variables.
NOTE: PROCEDURE SURVEYSELECT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
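
For the stratified case in the quoted post, something along these lines
should do the same job as the sort-and-count DATA step. This is only a
sketch: it assumes the data set a from the quoted code, 4 draws per
value of j (matching count<5), and the names a_sorted and a3 are made up
for illustration. I have not timed this one.

```sas
/* Input must be sorted by the STRATA variable. The quoted data set a */
/* is already in j order, so this sort is only a safeguard.           */
proc sort data=a out=a_sorted;
  by j;
run;

/* Simple random sample without replacement, 4 obs from each stratum. */
/* N=4 applies to every level of j; SEED= makes the draw repeatable.  */
proc surveyselect data=a_sorted out=a3 method=srs n=4 seed=1234;
  strata j;
run;
```

With a STRATA statement a single N= value is the per-stratum sample
size, so a3 should end up with 16 observations for the 4 values of j.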