```
Date: Tue, 25 Oct 2005 20:24:13 -0400
Reply-To: Peter Flom
Sender: "SAS(r) Discussion"
From: Peter Flom
Subject: Re: Skewed variables & surveys
Comments: To: not_used@COMCAST.NET
Content-Type: text/plain; charset=US-ASCII

>>> NOT_USED 10/25/05 7:11 PM >>>

<<<
I don't understand why some of the answers to this question imply that
OLS is invalid if the dependent variable is skewed or not continuous.
OLS is based on the distribution of errors for the correct linear model,
so everything is relative to the independent variables. Can't say much
until we know about these variables.
>>>

If the DV is not continuous (or close to it), OLS is the wrong model.
A 7-point variable is not enough.

First, the residuals cannot be continuous, and therefore cannot be normal.
Second, the predicted values can be outside the range of the DV
(less than 1 or more than 7), which makes no sense.

There are other reasons too, but that's enough.

<<<
Also-- just because most of the survey answers are 5 6 or 7 does not make
the variable skewed-- could still be symmetric around 6 or even 5.5
>>>

Actually, no, it couldn't.

First of all, in your original post you said 75% give scores of 6 or 7.

If all 75% were 6,
then the median is really not determinable, but it's at least 6.
The mean can't possibly be 6, because there are none above 6.

hmmmmm

What if all 75% were 7?
Then the median is 7 and the mean can't be 7.

hmmmmm hmmmmm

What if 37% were 6 and 38% were 7?
Then the median is 6.5, and the mean can't be 6.5.

If the median isn't equal to the mean, the distribution is skewed.

And if Y is skewed, so will the residuals be. (I recall seeing a proof in
Faraway's book on linear models, but I don't recall the details.) I tested
it out, though, with a DV as you describe and a nearly perfect linear model
(one IV that was the DV plus some random noise), and the residuals are,
sure enough, skewed.

OLS is simply NOT THE RIGHT APPROACH.

The right approach is, as David and I (and I think others) have told you,
to do ordinal or multinomial logistic regression, preferably using
SURVEYLOGISTIC if you have the survey design information, or using
LOGISTIC if you do not.

Peter

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
http://cduhr.ndri.org
www.peterflom.com
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)
```
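The point about predicted values leaving the 1 to 7 range can be checked with a small sketch. This is only an illustration under stated assumptions: the ten data points are invented toy values (not the poster's survey data), the language is Python rather than SAS, and the fit uses the textbook closed-form formulas for simple OLS. It also shows why the residuals cannot be continuous: at any given x, the residual can take no more distinct values than y does.

```python
# Toy data, invented for illustration only: a 1-7 rating that reaches its ceiling.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3, 4, 5, 6, 6, 7, 7, 7, 7, 7]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form simple OLS: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fitted values and residuals at the observed x values.
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Collect fitted values that fall outside the valid 1..7 response range.
out_of_range = [(xi, round(fi, 2)) for xi, fi in zip(x, fitted) if fi < 1 or fi > 7]

print("slope, intercept:", round(slope, 3), round(intercept, 3))
print("fitted values outside 1..7:", out_of_range)
```

Even inside the observed range of x, the straight line overshoots the top of the scale once the responses pile up at 7. A cumulative logit model avoids this by construction, because it predicts probabilities for the seven categories rather than a value on the real line.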

