Date: Wed, 11 May 2005 10:54:33 -0700
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Need help in statistical modelling
In-Reply-To: <1115811474.930889.193830@g49g2000cwa.googlegroups.com>
Content-type: text/plain; charset=US-ASCII
kumsaa@hotmail.com wrote:
> I have a project aimed to assess the determinants of allergen level in
> individuals. I have about 22 centres in 10 countries with about 200
> subjects in each. From every subject blood allergen level was taken
> once. Detailed questionnaire about household, pet keeping and personal
> history is available. I want to determine the influence of pet keeping
> on specific allergen to cat and house dust mite. My outcomes are
> continuous (not normally distributed) and my predictors are a mix of
> continuous, binary and categorical variables. There is a possibility of
> correlation within centres. One important issue as well is the fact
> that my outcome is highly skewed, in spite of transformation.
> Additionally, the study centres differ in many ways, such as climate,
> life style, income and so on. I would be grateful if one could give me
> some tips as to what statistical model to use in SAS 8.
> To summarize
>
> Centres 22 from 10 different countries
> Observations 4000 ~200 /centre
> Outcome continuous, right skewed
> Explanatory variables binary, continuous and categorical
>
>
> Individual variables: gender, presence of cat in household, smoking,
> season, age of mattress, storey of building;
>
> Centre level variables: prevalence of cat ownership in community
I see that you have already received some excellent advice. So let
me throw some additional complexities your way.
You say that you have 22 centers from 10 different countries. How
were these centers selected? Are these just all the participating
centers?
If your centers represent a probability sample of centers with a known
sample design structure, then you have survey data, and you should be
tackling this problem using PROC SURVEYREG (or maybe even PROC
SURVEYLOGISTIC instead, depending on your data).
The fact that your outcome variable is skewed is not really relevant,
as others have carefully pointed out. It becomes even less relevant
in PROC SURVEYREG, since this is a design-based analytical tool,
instead of a model-based one.
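For what it's worth, here is a minimal sketch of what such a call might
look like. It assumes the centres really are the sampled clusters and
that a sampling weight is available. The data set name and every
variable name below are made up for illustration, so substitute your
own. If the design was also stratified (by country, say), a STRATA
statement would be needed as well.

```sas
/* Sketch only: data set and variable names are hypothetical. */
proc surveyreg data=allergen;
   cluster centre;          /* centres treated as the sampled clusters */
   weight sampwt;           /* sampling weight, if the design supplies one */
   class gender season;     /* categorical predictors */
   model allergen_level = cat_home gender smoker season
                          mattress_age storey / solution;
run;
```

The SOLUTION option asks for the parameter estimates to be printed when
a CLASS statement is used.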
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician