LISTSERV at the University of Georgia
Date:         Tue, 20 Dec 2005 22:15:57 -0800
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: PPS sampling
In-Reply-To:  <200512201542.jBKF0qVL012059@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed

mehdi_soleymani@SOFTHOME.NET wrote:
>I am going to sample from a population with a positive variable as the
>sampling weight i.e, I want to sample by PPS method. but as you know the
>algorithms in sas for pps are restricted ,that is, the observation weight
>should not be grater than 1/sampsize. But I want to sample from the
>population and for some of stratum the sample weight may exceeds the
>1/sampsize. how can I do this. I don't want to use method such as pps_seq
>or... which are drawing with replacement or minimum replacement.
>it is a very usual case in application!!!
>the speed is not in consideration.

Okay, I'm concerned. I cannot tell whether the problem is one of difficulty in expressing yourself, or one of trouble with the sampling concepts. For lack of any idea which one is right, I'm going to assume that you have expressed your meaning exactly. So I'll speak to the problems with what you have written, even if these are not what you meant.

[1] Sampling with strata is not the same as sampling with no strata. You have to re-structure your frame of reference accordingly. If you want to sample PPS within each stratum, then you have to treat each stratum separately when you think about features like this. So you only need to think about the sub-population size, the sample size, and the relative weights within each stratum. Separately.

[2] When you do your PPS sampling, you use a SIZE variable. This is not a weight, or a relative weight. It is a multiplier. In fact, it turns out that your SIZE variable will be a constant times your inclusion probability, and will be a constant times one over the sampling weight. So your multiplier is very different from your sampling weight, and as your multiplier gets larger, your sampling weight goes down.
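To make the arithmetic in [2] concrete, here is a minimal sketch in plain Python (not SAS). The size values and the sample size are made up for illustration, and the formula used, inclusion probability = n * size / (sum of sizes), is the usual one for PPS without replacement within a single stratum.

```python
# Hypothetical size measures for the units in ONE stratum (illustrative only).
sizes = [10.0, 20.0, 30.0, 40.0]
n = 2  # hypothetical sample size for this stratum

total = sum(sizes)

# Inclusion probability is a constant (n / total) times the size measure.
incl_prob = [n * s / total for s in sizes]

# The sampling weight is the reciprocal of the inclusion probability,
# so it is a constant (total / n) times one over the size measure.
samp_weight = [1.0 / p for p in incl_prob]

for s, p, w in zip(sizes, incl_prob, samp_weight):
    print(f"size={s:5.1f}  incl_prob={p:.3f}  weight={w:.3f}")
```

In this sketch, size times sampling weight is the same constant (total / n) for every unit, which is why a larger size measure means a smaller sampling weight.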

[3] Your statement "for some of stratum the sample weight may exceeds the 1/sampsize" seems to indicate a mistake. It's not the sample weight that matters here. It is the *relative* weight. That's your sampling weight divided by the sum of sampling weights in your stratum of interest. If your weight is greater than the sum of all the weights in the stratum divided by the sample size for the stratum, AND you want to sample without replacement, then you have a problem. Is this going to be a problem once you split this out by strata and re-consider things?
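The check described in [3] can be sketched as follows, again in plain Python rather than SAS. It treats the value in question as the PPS size measure and flags every unit whose value exceeds the stratum total divided by the stratum sample size. The function name, the frame layout, and the data are all hypothetical.

```python
from collections import defaultdict

def oversized_units(frame, sample_sizes):
    """Flag units whose relative size within their stratum exceeds 1/n.

    frame        -- list of (stratum, unit_id, size) tuples (hypothetical layout)
    sample_sizes -- dict mapping stratum -> sample size n for that stratum
    """
    # Total size per stratum; each stratum is treated separately.
    totals = defaultdict(float)
    for stratum, _, size in frame:
        totals[stratum] += size

    flagged = []
    for stratum, unit_id, size in frame:
        n = sample_sizes[stratum]
        # size / total > 1 / n  is the same as  size * n > total
        if size * n > totals[stratum]:
            flagged.append((stratum, unit_id))
    return flagged

# Made-up frame: stratum A has one dominant unit, stratum B does not.
frame = [("A", 1, 5.0), ("A", 2, 5.0), ("A", 3, 90.0),
         ("B", 1, 10.0), ("B", 2, 10.0), ("B", 3, 10.0)]
print(oversized_units(frame, {"A": 2, "B": 2}))
```

An empty result means that ordinary without-replacement PPS is feasible in every stratum for the chosen sample sizes.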

[4] If you still have the above problem, then think in terms of the task. Do you want to pick all such records with 'large' relative weights with 100% certainty? Or do you want to pick them with just a high degree of probability? In the first case, you have what we call 'certainty sampling'. Either way, you need to look at the CERTSIZE and MAXSIZE (and maybe even MINSIZE) options in PROC SURVEYSELECT.
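For the certainty-sampling case in [4], one common textbook approach is to set aside any unit whose inclusion probability would reach 1, reduce the sample size accordingly, and repeat on what is left. The sketch below illustrates that idea in plain Python for a single stratum. It is an assumption-laden illustration, not a description of what PROC SURVEYSELECT does internally for CERTSIZE, and the function name and data are hypothetical.

```python
def split_certainty_units(sizes, n):
    """Iteratively separate certainty units within one stratum.

    sizes -- dict mapping unit_id -> size measure (hypothetical layout)
    n     -- sample size for the stratum

    Returns (certainty_ids, remaining_sizes, remaining_n).
    """
    remaining = dict(sizes)
    certainty = []
    while n > 0 and remaining:
        total = sum(remaining.values())
        # Units whose inclusion probability n * size / total would be >= 1.
        big = [u for u, s in remaining.items() if n * s >= total]
        if not big:
            break
        for u in big:
            certainty.append(u)
            del remaining[u]
        # Each certainty unit uses up one slot in the sample.
        n -= len(big)
    return certainty, remaining, n

# Made-up stratum: "c" is a certainty unit at once, "d" becomes one
# after "c" is removed and the sample size drops from 3 to 2.
sizes = {"a": 5.0, "b": 5.0, "c": 90.0, "d": 40.0, "e": 10.0}
print(split_certainty_units(sizes, 3))
```

The remaining units can then be sampled PPS without replacement using the reduced sample size, and the certainty units enter the sample with a sampling weight of 1.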

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
