| Date: | Mon, 13 Jan 2003 16:30:58 -0500 |
| Reply-To: | Peter Flom <flom@NDRI.ORG> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Peter Flom <flom@NDRI.ORG> |
| Subject: | Re: Interpretation of Small Effect with Large N |
| Content-Type: | text/plain; charset=US-ASCII |
Bill,
I'm not sure what's going on here.
First, how did you estimate power? Especially in repeated measures, it
gets very tricky, but every power calculation involves four quantities:
sample size, effect size, power, and alpha level. You have to supply
three of them, and the software figures out the fourth. Usually, the
software either computes power from N, ES, and alpha, or else computes
the needed N from power, ES, and alpha. As you state it below, it's not
clear what you did: it seems you give exact values for alpha and power,
which seems off.
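To make the "supply three, solve for the fourth" idea concrete, here is a minimal Python sketch. It is an illustration only: it assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size, which is far simpler than a repeated measures design, and the function names `power_from_n` and `n_from_power` are made up for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_from_n(n_per_group, effect_size, alpha=0.05):
    """Approximate power given N, effect size, and alpha.

    Assumes a two-sided, two-sample z-test with equal group sizes
    (a simplification, not a repeated measures model).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_from_power(power, effect_size, alpha=0.05):
    """Approximate N per group given power, effect size, and alpha.

    Uses the usual closed form, which ignores the far rejection tail.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


# Same four quantities, solved in two directions.
print(power_from_n(n_per_group=64, effect_size=0.5))
print(n_from_power(power=0.80, effect_size=0.5))
```

In both calls alpha is an input chosen in advance; the observed p value never enters the calculation.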
Second, I am assuming that the 'group' effect is some treatment, and
that one of the groups is a control. Then, the results below seem to
imply (and I stress that these conclusions are tentative, based only on
the below) that the test is highly reliable both in the sense that
test-retest reliability is high, and in the sense that people's scores
don't vary much (in the absence of treatment) over the time period in
the study. That is, the values you entered into the power analysis for
autocorrelation are high.
HTH
Peter
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)
>>> Thompson Bill T Contr USAFSAM/FEC <Bill.Thompson@BROOKS.AF.MIL> 01/13/03 04:10PM >>>
Peter,
This is exactly the point I have been discussing with the investigators
here regarding their project.
They have 4 groups of 20 subjects each measured at 4 time intervals. The
dependent variable is simply visual acuity and their results for the
main effect of time are statistically significant at p=.013, etasq=.20
and power=.943. The result for time*group interaction is significant at
p=.001, etasq=.10 and power=.90.
Would this mean that the significant p values reflect the fact that the
design was sensitive enough to detect reliable differences while the
contribution of the independent variables to the overall variability in
the outcome variable(s) was weak?
-----Original Message-----
From: Peter Flom [mailto:flom@NDRI.ORG]
Sent: Monday, January 13, 2003 2:38 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Interpretation of Small Effect with Large N
Getting back to basics:
Presume a totally fair coin. Then, as N grows, the probability of
getting EXACTLY N/2 heads shrinks. But the probability of getting
APPROXIMATELY N/2 heads grows. The chance of getting a statistically
significant result with a totally fair coin is .05 (or whatever value is
chosen) regardless of N; but the difference between .5 and the
proportion of heads which will give a significant result shrinks as N
grows.
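The three claims about the fair coin can be checked numerically. The following Python sketch is only an illustration: "approximately N/2" is taken to mean within 1% of N on either side (an arbitrary choice), the significance cutoff uses the normal approximation to the binomial, and all function names are invented for this example.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def fair_coin_pmf(n, k):
    """P(exactly k heads in n fair flips), computed via logs to avoid overflow."""
    log_p = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) + n * log(0.5)
    return exp(log_p)


def prob_exactly_half(n):
    """P(exactly n/2 heads); n is assumed to be even."""
    return fair_coin_pmf(n, n // 2)


def prob_near_half(n, tolerance_pct=1):
    """P(head count within tolerance_pct percent of n around n/2)."""
    half_width = n * tolerance_pct // 100  # integer arithmetic, no rounding surprises
    centre = n // 2
    return sum(fair_coin_pmf(n, k)
               for k in range(centre - half_width, centre + half_width + 1))


def critical_gap(n, alpha=0.05):
    """Smallest |proportion - 0.5| that a two-sided z-test calls significant."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_crit * 0.5 / sqrt(n)


# Columns: N, P(exactly N/2), P(within 1% of N/2), significant gap from .5
for n in (100, 10_000, 1_000_000):
    print(n, prob_exactly_half(n), prob_near_half(n), critical_gap(n))
```

Reading down the printed columns, the second shrinks, the third grows, and the fourth shrinks roughly as 1/sqrt(N), while alpha stays fixed at .05 throughout.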
This is the basic reason why the reification of significance testing is
a bad idea, at least in most cases. The p value tells us how likely
results at least this extreme are, given that the null hypothesis is
true (e.g., the coin is fair). But what we are usually interested in is
whether it's likely that the null hypothesis is true, given these
results. The two are not equivalent. We are also usually interested in
effect size, not just statistical significance. If I test a diet (say)
on 100,000 people, then what is interesting to people who might follow
the diet is NOT whether the average weight loss is significant (it is
very likely to be significant) but how large it is.
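A small Python sketch of the diet example, with made-up numbers: the mean loss of 0.1 kg and the standard deviation of 5 kg are hypothetical values chosen purely for illustration, and the test is a simple one-sample z-test with the standard deviation treated as known.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(mean_change, sd, n):
    """Return (z, two-sided p value, standardized effect size d).

    Assumes a one-sample z-test of H0: mean change = 0 with known sd.
    """
    z = mean_change / (sd / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z))
    d = mean_change / sd  # effect size does not depend on n
    return z, p_value, d


# Hypothetical figures: 0.1 kg average loss, 5 kg standard deviation.
for n in (100, 100_000):
    z, p_value, d = one_sample_z(mean_change=0.1, sd=5.0, n=n)
    print(f"n={n}: z={z:.2f}, p={p_value:.3g}, d={d:.2f}")
```

The effect size d is identical in both runs; only the p value changes with N, which is why the p value alone says nothing about how much weight anyone loses.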
Peter
>>> Bill Anderson <wnilesanderson@COX.NET> 01/13/03 03:10PM >>>
Actually, if a 'fair' coin is flipped 1,000,000 times, the probability
of rejecting the null hypothesis is still 0.05 (or whatever alpha is
used.)
We know that there are physical differences between the head and tail of
a coin, and it is quite believable that no coin is perfectly fair. So if
we flip a LOT of times, we figure to reject the null hypothesis of
fairness. This is not due to an error in statistics; rather it is a
reflection of the lack of fairness in the coin.
Probably the simplest way to handle this is using the concept of
statistical equivalence. Decide in advance what amount of difference
really matters, and use the null hypothesis that the difference is this
big or bigger. Then larger sample sizes will get you to the truth: if
the difference does not matter, then large sample sizes will reject the
null hypothesis, and you will correctly conclude equivalence. It may or
may not happen that at the same time you have a statistically
significant difference, but the latter situation is simply unimportant.
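One common way to carry out an equivalence test of this kind is the "two one-sided tests" (TOST) procedure. The Python sketch below applies it to the coin example. It is an illustration under stated assumptions, not a method taken from this thread: it uses the normal approximation to the binomial, the equivalence margin of 0.01 is an arbitrary choice, the flip counts are invented, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def difference_p_value(heads, n):
    """Ordinary two-sided z-test of H0: p = 0.5."""
    z = (heads / n - 0.5) / (0.5 / sqrt(n))
    return 2 * _STD_NORMAL.cdf(-abs(z))


def equivalence_p_value(heads, n, margin):
    """TOST p value for H0: |p - 0.5| >= margin (normal approximation)."""
    p_hat = heads / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    if se == 0:
        raise ValueError("need at least one head and one tail")
    # One-sided test against the lower bound, H0: p <= 0.5 - margin.
    p_lower = _STD_NORMAL.cdf(-(p_hat - (0.5 - margin)) / se)
    # One-sided test against the upper bound, H0: p >= 0.5 + margin.
    p_upper = _STD_NORMAL.cdf((p_hat - (0.5 + margin)) / se)
    # Equivalence is claimed only if BOTH one-sided tests reject.
    return max(p_lower, p_upper)


# Invented data: a very slightly biased coin, flipped many times vs. few times.
for heads, n in ((501_500, 1_000_000), (52, 100)):
    print(n,
          "difference p =", difference_p_value(heads, n),
          "equivalence p =", equivalence_p_value(heads, n, margin=0.01))
```

With the million-flip data both p values are small: the coin is "significantly" unfair and also equivalent to fair within the chosen margin. With 100 flips neither test rejects, so the small sample supports neither conclusion.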
There is a lot of journal literature on the subject of equivalence,
although it is still slow to get into elementary textbooks.
Bill Anderson
----- Original Message -----
From: "Bross, Dean S" <dean.bross@HQ.MED.VA.GOV>
Sent: Monday, January 13, 2003 8:14 AM
Subject: Re: Interpretation of Small Effect with Large N
> Some people sum up this finding as proving what seems to be
> one of the untaught laws of nature:
>
> All null hypotheses are false.
>
> I consider this to be just like one of the laws of thermodynamics.
>
> It is not an error in statistical methods.
>
> -----Original Message-----
> From: Tim Berryhill [mailto:tim@AARTWOLF.COM]
> Sent: Saturday, January 11, 2003 11:34 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Interpretation of Small Effect with Large N
>
>
> Would someone mind expanding on this? I usually use SAS for COBOL-style
> business data processing, but back when I worked in research I noticed that if
> the sample size was large then the differences were ALWAYS statistically
> significant. On the flip side, I know that if one counts heads and tails
> for 1,000,000 flips of a balanced coin, the odds of getting exactly 500,000
> heads are quite low.
>
> Is there a mistake in choice of statistics which crops up with large sample
> sizes? Is it a matter of violated assumptions which only shows up when you
> have large N?
> Just curious (in case I try to cure cancer),
> Tim Berryhill
>
> "Paul Thompson" <paul@wubios.wustl.edu> wrote in message
> news:3E19CB17.5070404@wubios.wustl.edu...
> > Just guessing here, but I bet you have beaucoup participants, n'est pas?
> >
> > Many many?
> >
> > Thompson Bill T Contr USAFSAM/FEC wrote:
> > > Can someone please explain to me or point in the right direction for helping
> > > me understand how to interpret the results of a repeated measures analysis
> > > where you have a small effect (.20) with strong power (.943).
> > >
> > > Thanks in advance,
> > >
> > > Bill
> >
>