LISTSERV at the University of Georgia
Date:         Wed, 11 May 2005 10:54:33 -0700
Reply-To:     cassell.david@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:      Re: Need help in statistical modelling
In-Reply-To:  <1115811474.930889.193830@g49g2000cwa.googlegroups.com>
Content-type: text/plain; charset=US-ASCII

kumsaa@hotmail.com wrote:
> I have a project aimed to assess the determinants of allergen level in
> individuals. I have about 22 centres in 10 countries with about 200
> subjects in each. From every subject blood allergen level was taken
> once. Detailed questionnaire about household, pet keeping and personal
> history is available. I want to determine the influence of pet keeping
> on specific allergen to cat and house dust mite. My outcomes are
> continuous (not normally distributed) and my predictors are a mix of
> continuous, binary and categorical variables. There is a possibility of
> correlation within centres. One important issue as well is the fact
> that my outcome is highly skewed, in spite of transformation.
> Additionally, the study centres differ in many ways, such as climate,
> life style, income and so on. I would be grateful if one could give me
> some tips as to what statistical model to use in SAS 8.
> To summarize
>
> Centres                 22 from 10 different countries
> Observations            4000 ~200 /centre
> Outcome                 continuous, right skewed
> Explanatory variables   binary, continuous and categorical
>
>
> Individual variables: gender, presence of cat in household, smoking,
> season age of mattress, storey of building;
>
> Centre level variables: prevalence of cat ownership in community

I see that you have already received some excellent advice. So let me throw some additional complexities your way.

You say that you have 22 centers from 10 different countries. How were these centers selected? Are these just all the participating centers?

If your centers represent a probability sample of centers with a known sample design structure, then you have survey data, and you should be tackling this problem using PROC SURVEYREG (or maybe even PROC SURVEYLOGISTIC instead, depending on your data).
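To make that concrete, here is a minimal sketch of what a PROC SURVEYREG call might look like. Everything named in it is an assumption for illustration only: the data set ALLERGEN, the variables (LOG_ALLERGEN, CAT_IN_HOME, and so on), the sampling weight SAMPWT, and above all the design itself, which I have guessed to be centres sampled as clusters. Your real design information has to come from whoever selected the centres.

```sas
/* Sketch only: data set, variable names, and design are hypothetical. */
proc surveyreg data=allergen;
   cluster centre;          /* centres treated as the sampled clusters (PSUs) */
   weight  sampwt;          /* assumed sampling weight from the design        */
   class   gender cat_in_home smoker season;
   model   log_allergen = gender cat_in_home smoker season
                          mattress_age storey / solution;
run;
```

If the centres were sampled within countries, a STRATA statement naming the stratification variable would go in as well, but any stratum with only one sampled centre will need special handling for variance estimation.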

The fact that your outcome variable is skewed is not really relevant, as others have carefully pointed out. It becomes even less relevant in PROC SURVEYREG, since this is a design-based analytical tool, instead of a model-based one.

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

