Date: Tue, 7 Jun 2005 09:35:06 -0400
Reply-To: Paige Miller <paige.miller@KODAK.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paige Miller <paige.miller@KODAK.COM>
Organization: Eastman Kodak Company
Subject: Re: ANOVA Questions
Elaine Pierce wrote:
> 1. My understanding was that for a fixed effect, multi-way ANOVA (as in
> ANCOVA) if you find a significant interaction (lack of parallelism) the
> analysis cannot proceed.
That statement is far too strong. The analysis most certainly can
proceed. When an interaction is significant, the analyst must be
cautious about how to interpret the main effects; however, the ANOVA
is still correct and usable.
> Instead, you must stratify your data by levels
> of your interacting variable and analyze each level separately
> (although the reduction in power is a bummer).
Depending on the purpose of your analysis, you MAY want to stratify
your data (again, "must" is far too strong a word here).
> It looks like you can
> use the "slice" option in proc glm to do this for you. I was recently
> told not to do it that way, however: you just keep the significant
> interaction term in the model and analyze all your data together, just
> as you would do for mult linear regression. Now somebody's wrong here,
> and I don't want to look like a fool in my master's thesis. Any advice
> for me on this?
I'd be interested in hearing the reasons why SLICE was not
recommended. It seems to me that it answers certain questions
quite clearly, and if you have those particular questions about your
analysis, why not use it? I can think of two possible reasons: first,
there are arguments against "post hoc" testing of your data;
second, there may be a newer method that works better. But
I don't know whether those are the reasons you were told not to use it.
Certainly, if your goal is prediction rather than understanding the
model, then you don't need SLICE at all. It won't help you.
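For what it's worth, here is a minimal sketch of keeping the
interaction in the model and asking for sliced tests in the same
PROC GLM step. The data set (demo), the factor names (a, b), the
response (y) and the data values are all made up for illustration;
substitute your own.

```sas
/* Made-up data; demo, a, b and y are placeholder names */
data demo;
   input a $ b $ y @@;
   datalines;
lo x 10  lo x 12  lo x 11
lo z 14  lo z 15  lo z 13
hi x 11  hi x 10  hi x 12
hi z 22  hi z 20  hi z 21
;

proc glm data=demo;
   class a b;
   model y = a b a*b;        /* interaction stays in the model */
   lsmeans a*b / slice=b;    /* test the effect of a within each level of b */
run;
quit;
```

The sliced tests use the error mean square from the full model, so
you keep the error degrees of freedom that you would give up by
running a separate analysis on each subset.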
> 2. It appears that for MLR, the residuals are defined as the observed
> values minus the value predicted by the regression equation, but for
> ANOVA, the residuals are the observed value minus the cell mean, am I
> right on this?
When the model includes the interaction, the predicted values for
ANOVA are the cell means, so the ANOVA residual is exactly equal to
the regression residual. Good thing it works out that way, because
ANOVA really is the same as regression: both are least squares fits
to the data.
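If you want to convince yourself, a small check along these lines
should do it. This is only a sketch: the data are made up, and the
names (demo, grp, d1, d2, and the output variables) are placeholders.
It fits the same one-factor model once as an ANOVA and once as a
regression on dummy variables, then prints the two sets of predicted
values and residuals side by side.

```sas
/* Made-up one-factor data; d1 and d2 are dummy variables for groups A and B */
data demo;
   input grp $ y @@;
   d1 = (grp = 'A');
   d2 = (grp = 'B');
   datalines;
A 10  A 12  A 11  B 15  B 17  B 16  C 20  C 19  C 24
;

/* ANOVA form: predicted values are the group means */
proc glm data=demo;
   class grp;
   model y = grp;
   output out=anova_out p=pred_anova r=res_anova;
run;
quit;

/* Regression form: the same least squares fit written with dummies */
proc reg data=anova_out;
   model y = d1 d2;
   output out=both p=pred_reg r=res_reg;
run;
quit;

proc print data=both;
   var grp y pred_anova pred_reg res_anova res_reg;
run;
```

The two predicted columns, and the two residual columns, should
agree apart from rounding.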
> I'm trying to remove my outliers (since I don't have
> access to the original data to look for errors, I will analyze with and
> without the outliers and note if there's a difference).
What does it mean to not have access to original data but still be
able to analyze with and without outliers? I can't understand how
this can be.
> All I have been
> shown is the keyword for the studentized and jacknife residuals
> (student. and rstudent.) - I don't know how to ask SAS for the
> standardized residual.
>
> In MLR, the studentized residual is the standardized residual divided
> by the square root of (1-leverage) where leverage is the vector of each
> subject's difference from the mean of each indept variable. That I
> think I understand.
>
> But if in ANOVA the standardized residual is already the difference
> from the mean, then it seems redundant to divide it by a term
> containing the leverage which is also based on distance from the mean.
> So in ANOVA, is the studentized resid the same as the standardized
> resid?
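The two quantities measure different things, so dividing by a term
that contains the leverage is not redundant:
Residual ... distance measure in Y direction in ANOVA/regression
Leverage ... distance measure in X direction in ANOVA/regression
So the studentized and standardized residuals are not the same thing,
in ANOVA or in regression.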
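To see what leverage looks like in an ANOVA, you can ask PROC GLM to
output it with the H= keyword. The sketch below uses made-up data and
placeholder names (demo, a, b, y, lev) with deliberately unequal cell
sizes. For a model with both factors and their interaction, the
leverage depends only on how many observations share the cell (it is
1 divided by the cell size), not on the Y values.

```sas
/* Made-up data with unequal cell sizes (2, 3, 4 and 3 observations) */
data demo;
   input a $ b $ y @@;
   datalines;
lo x 10  lo x 12
lo z 14  lo z 15  lo z 13
hi x 11  hi x 10  hi x 12  hi x 9
hi z 22  hi z 20  hi z 21
;

proc glm data=demo;
   class a b;
   model y = a b a*b;
   output out=lev_out h=lev;     /* h= requests the leverage */
run;
quit;

/* Within each cell, min and max of lev coincide and equal 1/N */
proc means data=lev_out n min max;
   class a b;
   var lev;
run;
```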
> This is a wordy way of asking whether I can legitimately use the
> studentized resids the same way as the standardized resids (i.e.
> outliers = stud. resid > |3| ). If not, could you show me how to
> request the standardized residual?
The studentized and standardized residuals each have their own
purpose, although they overlap quite a bit. Both give
you a way to evaluate the "outlierness" of a data point. As such, I
have no problem using either to look for outliers; in fact, I
prefer the studentized residuals for this task.
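As for requesting them, a sketch along the following lines should
work. The data and all names (demo, a, b, y, diag, resid, stud, jack,
lev, std_resid, flag) are made up for illustration. The OUTPUT
statement in PROC GLM has keywords for the raw residual (R=), the
studentized residual (STUDENT=), the deleted or "jackknife" version
(RSTUDENT=) and the leverage (H=). I am not aware of a keyword for the
residual divided by the root mean square error alone, so here it is
recovered from STUDENT and H; the cutoff of 3 is the one from your
question.

```sas
/* Made-up data; every name here is a placeholder */
data demo;
   input a $ b $ y @@;
   datalines;
lo x 10  lo x 12  lo x 11  lo x 10
lo z 14  lo z 15  lo z 13  lo z 14
hi x 11  hi x 10  hi x 12  hi x 30
hi z 22  hi z 20  hi z 21  hi z 23
;

proc glm data=demo;
   class a b;
   model y = a b a*b;
   output out=diag r=resid student=stud rstudent=jack h=lev;
run;
quit;

data diag;
   set diag;
   /* stud = resid / (s*sqrt(1 - lev)), so resid/s = stud*sqrt(1 - lev) */
   std_resid = stud * sqrt(1 - lev);
   /* 1 if the absolute studentized residual exceeds 3, else 0 */
   flag = (abs(stud) > 3);
run;

proc print data=diag;
   var a b y resid std_resid stud jack lev flag;
run;
```

Note that the comparison is on the absolute value of the residual,
that is, |stud| > 3.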
--
Paige Miller
Eastman Kodak Company
paige dot miller at kodak dot com
http://www.kodak.com