Date: Mon, 8 Jan 2001 10:52:49 -0800
Reply-To: Cassell.David@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject: Re: Help: PROC FREQ with 1000 variables..
Content-type: text/plain; charset="us-ascii"
Victor,
Ya Huang has already submitted a much nicer solution than I had in
mind, but I have a question. What are you doing that you need to look
at 1000 variables and pick out the "best" p-values? This is bad from
a statistical point of view, and I strongly recommend that you re-think
your approach. Do you have a statistician with whom you can consult
on this? Bear this in mind: if you have 1000 variables, none of which
has any real relationship with Y, then testing each one at the .05 level
will give you, on average, 1000 x .05 = 50 variables with p-values
smaller than .05. You would end up assuming that you have 50 significant
relationships when in fact you have NONE. This is a common error in
naive analyses, and you really must guard against it.
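
The arithmetic above can be sketched in a few lines. This is only an
illustration, written in Python rather than SAS, and it rests on two
assumptions that are not part of the original question: the 1000 tests
are independent, and each p-value is uniform on (0, 1) when there is no
real relationship. The function names are made up for the example.

```python
import random


def count_false_positives(n_tests=1000, alpha=0.05, seed=2001):
    """Simulate n_tests p-values under the null and count those below alpha.

    Assumption: with no real relationship and a continuous test statistic,
    each p-value is uniform on (0, 1).
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    return sum(1 for p in p_values if p < alpha)


def expected_false_positives(n_tests=1000, alpha=0.05):
    """Expected number of 'significant' results when every null is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests=1000, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print("expected false positives:", expected_false_positives())
    print("simulated false positives:", count_false_positives())
    print("P(at least one false positive):", prob_at_least_one())
```

The simulated count will vary with the seed, but it should land near 50,
and the chance of seeing at least one spurious "significant" variable
among 1000 is essentially 1.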
David
--
David Cassell, OAO Corp. Cassell.David@epa.gov
Senior computing specialist
mathematical statistician