LISTSERV at the University of Georgia
Date:         Fri, 21 Feb 1997 09:24:16 GMT
Reply-To:     schick@hrz.uni-marburg.de
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Arnold Schick <schick@HRZ.UNI-MARBURG.DE>
Organization: HRZ Uni Marburg
Subject:      Re: random selecting obs. from dataset
Content-Type: text/plain; charset=us-ascii

Hello,

Kunling Lu asked:

>Does anybody have a macro that selects random number of observations from a SAS
>dataset? I need to pick up 50 obs. from a SAS dataset of about 4,000. What
>random generating function do I go? Thank you for help.

A lot of SAS code for randomly selecting observations from a SAS data set has already been developed and posted to SAS-L.

The SAS code that I have written is available on the Web at:

http://staff-www.uni-marburg.de/~schick/sasmacros/

The SAS macro appended below can also be found on that site.

It returns a >random< selection, which means that the exact number of observations selected is left to chance. Often the selection matches the requested pct parameter exactly. Occasionally (by chance) more than one run is needed to get the desired selection.
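
To see why the count can vary, consider the simplest form of this kind of selection, where each observation is kept independently with probability pct/100. This is only a sketch of the general idea and not the formula used in the macro; the dataset name BERN, the macro variable KEPT and the fixed seed are assumptions made for illustration.

```sas
/* Sketch only: keep each observation independently with probability pct/100. */
/* Dataset BERN, macro variable KEPT and seed 12345 are illustrative.         */
%let pct = 1.25;

data one;                          /* test data: 4000 observations */
  do h = 1 to 4000;
    p = h + h;
    output;
  end;
run;

data bern;
  set one;
  if ranuni(12345) < &pct / 100;   /* subsetting IF: keep about pct percent */
run;

proc sql noprint;
  select count(*) into :kept from bern;   /* the count depends on the seed */
quit;

%put NOTE: kept &kept of 4000 observations (the expected value is 50).;
```

Changing the seed gives a different count each time, scattered around 50, which is why a plain random selection seldom returns exactly the requested number.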

An example call is given at the end of the macro.

Regards,

Arnold Schick

-------------------------------------------------------please-cut-here---
/* This macro selects PCT percent of the observations in dataset IN and
   stores the chosen observations in OUT. No observation is selected twice.

   IN, OUT   are the macro call parameters for the dataset names.
   PCT       is the macro call parameter for the numeric percentage.
   _RESERVE  is an internal dataset that this macro creates.

   Note: no default macro parameters are in use. Selecting a small
         number of observations gives inexact results.

   Written: January 16, 1996
   Author:  Arnold Schick, University of Marburg/Germany */

options nosource;
%macro choice(in,out,pct);
  options nonotes nomprint nosymbolgen nostimer nosource;

  data &out     (drop=stored any_more res_fact)
       _reserve (drop=stored any_more res_fact);
    set &in nobs=N end=last;
    which = _N_;
    if round(log(0.66-1/(0.01*&pct-1))*ranuni(0),1) then do;
      stored+1;
      output &out;
    end;
    else do;
      if &pct > 90 then output _reserve;
      else if &pct < 84 then res_fact = 0.0065;
      else res_fact = 0.0020;
      if round(log(0.66-1/(res_fact*&pct-1))*ranuni(0),1) then output _reserve;
    end;
    any_more = stored - round(&pct/100*N,1);
    if last then call symput('diff',any_more);
  run;

  %if &diff > 0 %then %do;
    data &out (drop=i);
      set &out nobs=N;
      if i < &diff then
        if round(log(0.66-1/(&diff/N-1))*ranuni(0),1) then do;
          i+1;
          delete;
        end;
    run;
  %end;
  %else %if &diff ^= 0 %then %do;
    data _reserve (drop=i);
      set _reserve nobs=N;
      if i < N+&diff then
        if round(log(0.66-1/((N+&diff)/N-1))*ranuni(0),1) then do;
          i+1;
          delete;
        end;
    run;

    data &out;
      update &out _reserve;
      by which;
    run;
  %end;

  proc datasets nolist;
    delete _reserve;
  quit;

  options notes;

  data &out;
    set &out;
  run;

  options stimer source;
%mend choice;
options source;

*Example;

data one;
  do h=1 to 4000;
    p=h+h;
    output;
  end;
run;

%choice(one,two,1.25); *selects ~50 (1.25%) OBS from the data;

proc print data=two; run; *prints dataset TWO;

%choice(one,two,1.25);
%choice(one,two,1.25);
%choice(one,two,1.25);
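
If exactly 50 observations are needed, as in the original question, a sequential selection in a single data step is one common alternative. The sketch below is not part of the macro above; the dataset name EXACT50 and the variables K and REMAINING are assumptions chosen for this illustration. Each observation is kept with probability (number still needed)/(number not yet read), so the step always ends with exactly K observations when the input has at least that many.

```sas
/* Sketch only: draw exactly 50 observations without replacement.        */
/* EXACT50, K and REMAINING are illustrative names, not from the macro.  */
data one;                            /* test data: 4000 observations */
  do h = 1 to 4000;
    p = h + h;
    output;
  end;
run;

data exact50 (drop=k remaining);
  retain k remaining;
  if _n_ = 1 then do;
    k = 50;                          /* observations still needed         */
    remaining = total;               /* observations not yet read         */
  end;
  set one nobs=total;                /* NOBS= is known before SET executes */
  if ranuni(0) < k / remaining then do;
    output;
    k = k - 1;
  end;
  remaining = remaining - 1;
  if k = 0 then stop;                /* sample complete, stop reading     */
run;

proc print data=exact50; run;
```

When K equals REMAINING the probability is 1, so the last needed observations are always taken. Where SAS/STAT is installed, PROC SURVEYSELECT with METHOD=SRS and N=50 should give an equivalent simple random sample.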

