LISTSERV at the University of Georgia
Date:         Wed, 21 Aug 2002 14:27:37 -0700
Reply-To:     Cassell.David@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject:      Re: surveyselect
Content-type: text/plain; charset=us-ascii

Martine Ferguson <ferguson_m@BLS.GOV> wrote:
> I wish to use the PROC SURVEYSELECT procedure to program a PPS Cluster
> Sample without Replacement (ie each cluster is selected without replacement
> proportional to the size of that cluster).
> The problem I have is that I can only select up to a certain number of
> clusters and when I have reached my target number of units, I wish to stop
> the procedure. For example, suppose people are split up into counties (what
> I will call clusters) and I wish to sample a total of 100 people. I want to
> keep selecting counties PPS without replacement until I have reached my
> target sample size of 100. Now, I do not know ahead of time how many
> counties to sample because it all depends on which county is selected
> first, the randomness of sampling, and will vary with each sample I select
> (in other words I cannot use the sampsize= option because I do not know in
> advance what that number will be). Therefore, I cannot say, select, say, 5
> counties because there may not be 100 people in these 5 counties or there
> may be over 100 people in these 5 counties. I do not know how to do this
> using SURVEYSELECT. Does anybody know of any way by which I can use PROC
> SURVEYSELECT and tell the procedure to stop running once I have sampled 100
> people?

Oversample, preserving the order in which the clusters appear. Have PROC SURVEYSELECT pull out enough clusters that you can be sure you will get enough people. Then work through the selected clusters in order until you have reached your limit (100 people, in your example). Stop there. You now have your desired cluster sample: K clusters, with K unknown in advance, selected PPS without replacement.
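A minimal sketch of that idea, under assumptions that are not in the original question: the data set name COUNTIES, the size variable PEOPLE (one row per county), the oversample size of 20, and the seed are all made up for illustration. It also assumes the row order of the OUT= data set is the order you want to walk through, which is worth checking for the selection method you use. Note that METHOD=PPS will complain if any single county is too large relative to the total for the requested SAMPSIZE=, so the amount of oversampling may need adjusting.

```sas
/* Hypothetical input: COUNTIES, one row per county,
   PEOPLE = number of people in the county (the size measure). */
proc surveyselect data=counties method=pps sampsize=20
                  seed=20020821 out=oversample;
   size people;
run;

/* Walk through the selected counties in order and keep them
   until the running total of people reaches the target of 100.
   The county that crosses the target is kept. */
data cluster_sample;
   set oversample;
   if running_total >= 100 then stop;  /* target already met: stop reading */
   running_total + people;             /* sum statement: retained, starts at 0 */
run;
```

If the last retained county should be dropped instead of kept when it pushes the total past 100, the cutoff test would need to look at the total after adding PEOPLE; which rule is right depends on the design you intend.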

> In addition, I want to run this procedure for 1000 iterations using a macro
> (ie collect 1000 different samples) and PROC SURVEYSELECT takes days to run
> on my computer. Do you know of a way to shorten the processing time?
> I appreciate any insight anyone may have on this and I thank you in advance
> for any help you can provide me with.

Don't do it with a macro. SURVEYSELECT has to start up and then re-read the whole data set in each iteration. Instead, try using the REP= option in the PROC SURVEYSELECT statement to get your 1000 replicates.
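As a rough sketch of the single-call approach (again with made-up names: COUNTIES, PEOPLE, an oversample of 20 counties per replicate, and an arbitrary seed), one PROC SURVEYSELECT step can draw all 1000 replicates, and one DATA step can then apply the 100-person cutoff within each replicate. The OUT= data set carries a Replicate variable identifying each sample.

```sas
/* One procedure call draws all 1000 replicates;
   the input data set is not re-read by a macro loop. */
proc surveyselect data=counties method=pps sampsize=20
                  rep=1000 seed=20020821 out=oversample;
   size people;
run;

/* Apply the cutoff separately within each replicate. */
data cluster_samples;
   set oversample;
   by replicate;
   if first.replicate then running_total = 0;  /* reset for each new sample */
   if running_total < 100;                     /* drop counties once target is met */
   running_total + people;                     /* retained running total */
run;
```

Any later analysis can then use BY REPLICATE to treat the 1000 samples separately.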

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

