| Date: | Tue, 22 Jan 2002 13:40:16 -0800 |
| Reply-To: | Cassell.David@EPAMAIL.EPA.GOV |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV> |
| Subject: | Re: weighted data and SAS (long) |
|
| Content-type: | text/plain; charset=us-ascii |
|---|
manon girard <mansof@VIDEOTRON.CA> wrote [in part]:
> I encountered some problems with weighted data. I usually work with
> populational data which contains weights and "design effects" (or variance
> inflation due to sampling design - like stratification). In fact, each
> sampled individual in the dataset is given a weight such that this guy
> represents X persons in the total targeted population.
So this is not just 'weighted' data, but actually a probabilistic sample
from a defined target population. [By the way, I don't like the term
'variance inflation', since we don't really inflate the variance, but
use the sample design information to get a *correct* variance. And the
usual simple random sample variance estimator is not always a good
estimate of the true variance. So you shouldn't be using PROC GLM or
PROC ANOVA anyway.]
> Usually I take sampling weights to readjust the sample size. By this I mean
> to get n the same number of my sample size but individual may represents
> less than 1 person (my weights would contain a fraction of individual).
Like I said, you should not be doing your analysis this way. There are
risks of embarrassing errors from this kind of approach to sample survey
data.
> Recently, I tried to make multiple comparisons with SAS-GLM (Anova) and all
> the homogeneity of variance tests (underlying assumption of multiple
> comparisons in ANOVA) (e.g. : HOVTEST=BF) could not be used with weights.
> What can I do ? Bootstrapping ?
No, I recommend the survey analysis procs in SAS version 8. Try using
PROC SURVEYREG and PROC SURVEYMEANS instead. They are both designed to
handle survey data of the complexity you describe. They won't handle
everything and they aren't designed to handle all possible survey designs,
but they are a reasonable starting point.
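
As a rough sketch of what that might look like for a stratified, weighted
sample: the dataset name MYSURVEY and the variables STRATUM, SAMPWT, Y and
GROUP below are hypothetical placeholders, not names from the original
question, so substitute whatever your design file actually provides. If
the design also has clusters (PSUs), a CLUSTER statement would be needed
as well.

```sas
/* Hypothetical names throughout: dataset MYSURVEY, stratum identifier */
/* STRATUM, sampling weight SAMPWT, outcome Y, grouping variable GROUP. */

/* Design-based mean, standard error and confidence limits for Y */
proc surveymeans data=mysurvey mean stderr clm;
   strata stratum;   /* stratification variable from the sample design */
   weight sampwt;    /* sampling weights from the design               */
   var y;
run;

/* Design-based regression of Y on GROUP. With a CLASS effect this   */
/* stands in for the one-way ANOVA comparison attempted in PROC GLM. */
proc surveyreg data=mysurvey;
   strata stratum;
   weight sampwt;
   class group;
   model y = group;
run;
```

Both steps take the stratification and the weights directly from the
design, so the standard errors reflect the design rather than a simple
random sample assumption.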
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician
|