Karen Masse <KMasse@PREVISIONMARKETING.COM> wrote [in part]:
> I am looking to take a two part reporting process (SAS processing then
> Excel formatting) and put it into one process. Step 1 performs a proc
> means by a class variable and determines whether the class mean for
> each variable is statistically significant compared to the population
> mean. Thus, the final dataset contains the following variables:
>
> _Label_
> _Freq_
> X01 - Xn...
> P01 - Pn...
You have already received two expert replies on the question you
asked. But I have a comment on the question you DIDN'T ask.
You are apparently making multiple comparisons here. This is a
well-known way to get into trouble if you are not being careful
about how you define your "p-value". Are you defining an experiment-
wise error rate, or are you just testing a bunch of differences?
After all, just due to variability, someone has to come out at the
maximum, and someone has to come out at the minimum. Are you dealing
with these outcomes in a careful, statistical manner, or are you
just assuming that the minimum and maximum are distributed in the
same way as everything else?
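To make the concern concrete, here is a rough sketch (in Python
rather than SAS, purely for illustration) of how fast the chance of
at least one false positive grows with the number of tests, and of
two standard p-value adjustments, Bonferroni and Holm. This is not
your code and not necessarily the right fix for your design: the
function names and the p-values are made up, and the error-rate
formula assumes the tests are independent, which yours may not be.

```python
# Hypothetical illustration: names and numbers are not from the original post.

def familywise_error_rate(alpha, m):
    """Chance of at least one false positive among m tests at level alpha.

    Assumes the m tests are independent (an assumption, not a given).
    """
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment; results are returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


if __name__ == "__main__":
    raw = [0.001, 0.012, 0.030, 0.040, 0.200]  # made-up raw p-values
    print(familywise_error_rate(0.05, 20))     # 20 tests at the 0.05 level
    print(bonferroni(raw))
    print(holm(raw))
```

With 20 independent tests at the 0.05 level, the chance of at least
one spurious "significant" result is roughly 64 percent, which is
why the unadjusted P01 - Pn columns can mislead.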
We on the list cannot tell what is actually being done in the
code. But if you would like to discuss the potential statistical
issues involved, you can write back to the list and clarify how
your differences are being computed, and how your p-values are
being calculated.
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician