Date: Tue, 20 Dec 2005 22:15:57 -0800
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: PPS sampling
Content-Type: text/plain; format=flowed
>I am going to sample from a population with a positive variable as the
>sampling weight i.e, I want to sample by PPS method. but as you know the
>algorithms in sas for pps are restricted ,that is, the observation weight
>should not be grater than 1/sampsize. But I want to sample from the
>population and for some of stratum the sample weight may exceeds the
>1/sampsize. how can I do this. I don't want to use method such as pps_seq
>or... which are drawing with replacement or minimum replacement.
>it is a very usual case in application!!!
>the speed is not in consideration.
Okay, I'm concerned. I cannot tell whether the problem is one of difficulty
expressing yourself, or one of trouble with the sampling concepts. For lack of
any idea which one is right, I'm going to assume that you have expressed your
meaning exactly. So I'll speak to the problems with what you have written,
even if these are not what you meant.
 Sampling with strata is not the same as sampling with no strata. You have to
re-structure your frame of reference accordingly. If you want to sample PPS in
each stratum, then you have to treat each stratum separately when you think
about features like this. So you only need to think about the sample size
and the relative weights within each stratum. Separately.
 When you do your PPS sampling, you use a SIZE variable. This is not a sampling
weight or a relative weight. It is a multiplier. In fact, it turns out that your
multiplier will be a constant times your inclusion probability, and will be a
constant over the sampling weight. So your multiplier is very different from your
sampling
weight, and as your multiplier gets larger, your sampling weight goes down.
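
To make that relationship concrete, here is a minimal Python sketch (not SAS,
and not a description of what any SAS procedure does internally). It assumes
the usual textbook PPS definition within one stratum: inclusion probability =
n * size / (stratum total of size), and sampling weight = 1 / inclusion
probability. The function name and the toy sizes are made up for illustration.

```python
def pps_probs_and_weights(sizes, n):
    """Return (inclusion_probs, sampling_weights) for one stratum.

    Assumes the textbook PPS definition: pi_i = n * size_i / sum(sizes).
    """
    total = sum(sizes)
    probs = [n * s / total for s in sizes]
    weights = [1.0 / p for p in probs]
    return probs, weights


sizes = [10, 20, 30, 40]  # hypothetical SIZE values in one stratum
probs, weights = pps_probs_and_weights(sizes, n=2)

for s, p, w in zip(sizes, probs, weights):
    # size / prob and size * weight both equal the constant total / n = 50,
    # so a larger size means a larger probability and a smaller weight.
    print(s, round(p, 3), round(w, 3), round(s * w, 3))
```
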
 Your statement "for some of stratum the sample weight may exceeds the
1/sampsize" seems to indicate a mistake. It's not the sample weight that matters
here. It is the *relative* weight. That's your sampling weight divided by the sum
of sampling weights. In your stratum of interest. If your weight is greater than
the sum of all the weights in the stratum divided by the sample size for the
stratum, AND you want to sample without replacement, then you have a problem.
Is this still going to be a problem once you split this out by strata?
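
One way to check this stratum by stratum is sketched below in Python. It flags
every record whose size exceeds the stratum total divided by the stratum sample
size, which is the condition under which the implied inclusion probability
would go above 1. The frame layout, the function name, and the numbers are all
hypothetical.

```python
from collections import defaultdict


def oversized_units(frame, sample_sizes):
    """Flag units whose size exceeds (stratum total) / (stratum sample size).

    frame: list of (unit_id, stratum, size) tuples.
    sample_sizes: dict mapping stratum -> sample size for that stratum.
    """
    totals = defaultdict(float)
    for _, stratum, size in frame:
        totals[stratum] += size

    flagged = []
    for unit_id, stratum, size in frame:
        threshold = totals[stratum] / sample_sizes[stratum]
        if size > threshold:  # implied inclusion probability would exceed 1
            flagged.append(unit_id)
    return flagged


# Hypothetical frame: unit "a3" dominates stratum A.
frame = [
    ("a1", "A", 5), ("a2", "A", 10), ("a3", "A", 85),
    ("b1", "B", 20), ("b2", "B", 30), ("b3", "B", 50),
]
print(oversized_units(frame, {"A": 2, "B": 1}))  # prints ['a3']
```
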
 If you still have the above problem, then think in terms of the task. Do you
want to pick all such records with 'large' relative weights with 100% certainty?
Or do you want to pick them with just a high degree of probability? In the first
case, you have what we call 'certainty sampling'. Either way, you need to look
at the CERTSIZE and MAXSIZE (and maybe even MINSIZE) options in PROC SURVEYSELECT.
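
As an illustration of the certainty-sampling idea only, here is a small Python
sketch of one common approach: take every unit whose size is at least the
remaining total divided by the remaining sample size, remove it from the frame,
reduce the sample size, and repeat until no such unit is left. This is an
assumption about the general technique, not a claim about how the CERTSIZE
option is implemented; the function name and data are hypothetical.

```python
def split_certainty_units(sizes, n):
    """Separate certainty units from the rest of one stratum.

    sizes: dict mapping unit_id -> size measure (assumed positive).
    n: sample size for the stratum.
    Returns (certainty_ids, remaining_sizes, remaining_n).
    """
    remaining = dict(sizes)
    certainty = []
    while n > 0 and remaining:
        threshold = sum(remaining.values()) / n
        big = [u for u, s in remaining.items() if s >= threshold]
        if not big:
            break
        # Take these units with probability 1 and shrink the problem.
        for u in big:
            certainty.append(u)
            del remaining[u]
        n -= len(big)
    return certainty, remaining, n


sizes = {"u1": 85, "u2": 10, "u3": 5, "u4": 40, "u5": 60}
certain, rest, n_left = split_certainty_units(sizes, n=3)
print(certain)  # prints ['u1', 'u5']
print(rest)     # prints {'u2': 10, 'u3': 5, 'u4': 40}
print(n_left)   # prints 1
```

The remaining units can then be sampled PPS without replacement with the
reduced sample size, since none of them has an implied probability above 1.
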
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330