LISTSERV at the University of Georgia
Date:   Fri, 2 Jan 2004 20:09:55 GMT
Reply-To:   Arthur Tabachneck <art297@NETSCAPE.NET>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Arthur Tabachneck <art297@NETSCAPE.NET>
Subject:   Re: Automation

Ougya,

From your example, I'm not certain I understand exactly what you are trying to accomplish. If it is simply to obtain a stratified sample, why not just use proc surveyselect's strata statement?
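As a minimal sketch of that simpler case, assuming a hypothetical input dataset named big with a category variable, the following draws the same number of observations from every category. The dataset names big and sample, and the sample size of 10, are placeholders chosen for illustration only.

```sas
/* Sketch only: "big" and "sample" are hypothetical dataset names. */
/* The strata statement expects the input sorted by the strata variable. */
proc sort data=big;
  by category;
run;

/* With strata, a single sampsize= value is applied to each stratum, */
/* so this requests 10 observations from every category.             */
proc surveyselect data=big method=srs sampsize=10 out=sample;
  strata category;
run;
```

This only covers equal-sized samples per category; it does not yet use the rates in "perc".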

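If the "perc" variable does have to be considered, you could still use strata, but with a controlling file specified in the procedure's sampsize= option. That file holds one row per category, with the number of observations to draw in a variable named _nsize_. For example: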

data bb;
  input category $ counts perc;
  perc=perc*10;
  cards;
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 2 0.1
b 5 0.4
c 3 0.2
d 2 0.3
a 4 0.1
b 4 0.4
c 3 0.2
d 2 0.3
a 4 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
a 1 0.1
b 4 0.4
c 2 0.2
d 3 0.3
;
run;

proc sort data=bb;
  by category;
run;

proc summary data=bb;
  var perc;
  by category;
  output out=bb2 mean(perc)=_nsize_;
run;

proc surveyselect data=bb method=srs sampsize=bb2 out=bb1;
  strata category notsorted;
run;

Art

----------
"ougya" <jieguo01@yahoo.com> wrote in message
news:fa55834f.0401021030.ef87ceb@posting.google.com...
> Hi, everyone,
>
> I have a question here and appreciate your help in advance.
>
> Question/purpose:
> I have a dataset aa, which has 3 variables and the structure looks
> like
>
> category counts perc
> a 1 0.1
> b 4 0.4
> c 2 0.2
> d 3 0.3
>
> Now, I need to build a dataset bb (which has 100 observations)
> retrieved from a big dataset (which has a variable 'category'). The
> requirement is that the selected categories in bb must have the same
> rate as 'perc' in aa.
>
> That is, if I use
> proc freq data=bb;
> tables category;
>
> it would give me
> category percent
> a 0.1
> b 0.4
> c 0.2
> d 0.3
>
> Solution
>
> The silly step that I can have is
>
> proc surveyselect data=aa method=srs samsize=10 out=bb1;
> where category='a';
> proc surveyselect data=aa method=srs samsize=40 out=bb2;
> where category='b';
> proc surveyselect data=aa method=srs samsize=20 out=bb3;
> where category='c';
> proc surveyselect data=aa method=srs samsize=30 out=bb4;
> where category='d';
>
> data bb;
> set bb1 bb2 bb3 bb4;
> run;
>
>
> I am not happy with it because if the 'category' has 100 values, I
> would have to repeat 100 times of surveyselect.
> I wonder whether some experts can have a nice & concise way to
> automatically retrieve 'perc' information for each 'category' from
> data aa and use it to retrive observations from the big dataset and
> finally build dataset bb.
>
> Thanks very much and happy new year!
>
> Jay

