|Date: ||Mon, 28 Nov 2005 16:40:37 -0500|
|Reply-To: ||Talbot Michael Katz <topkatz@MSN.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||Talbot Michael Katz <topkatz@MSN.COM>|
|Subject: ||Re: Data Cleaning Books (may be OT)|
|Content-Type: ||text/plain; charset=ISO-8859-1|
Okay, I'll bite.
Who has done any multivariate outlier detection with SAS? What did you
use? Michael Friendly's OUTLIER macro
(http://ftp.sas.com/samples/A56143)? PROC ROBUSTREG? CALL MCD in PROC
IML? CALL MVE in PROC IML? (I haven't tried any of these yet.) Anything
else? Is there another package you like for multivariate outlier detection?
-- TMK --
"The Macro Klutz"
On Mon, 28 Nov 2005 12:19:58 -0800, David L Cassell
>>Depending on how you get your data entered in the first place, one
>>thing to do is to tell whoever enters it to make notes of oddities, so
>>they aren't suspected of being mistakes.
>>One of our interviewers, who has a bit of a sense of humor,
>>recorded a woman's height as 42 inches. In the margin she wrote
>>"Yes, 42 inches. She's a midget OK?"
>Yep. That's why univariate outlier detection methods do bad things a lot of the
>time. You can often address that with multivariate methods. After all, that
>height probably goes with a pretty low weight, and a pretty tiny shoe size.
>Multivariate methods *ought* to show these are rational values as a
>set of data. I say "ought", since I can't control what other people do
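To make the point just quoted concrete, here is a minimal sketch, in Python rather than SAS, with made-up height/weight numbers (none of this comes from anyone's actual data). It computes the classical, non-robust Mahalanobis distance for two hypothetical records that share a 42-inch height: one whose weight follows the sample's height/weight trend and one with an average weight. The names `SAMPLE`, `mean_and_cov` and `mahalanobis_2d` and every value are assumptions for illustration only.

```python
import math

# Hypothetical (height in inches, weight in pounds) records; invented for illustration.
SAMPLE = [(60, 128), (62, 130), (63, 142), (64, 139), (65, 150),
          (66, 148), (67, 158), (68, 152), (70, 168), (72, 171)]

def mean_and_cov(rows):
    """Return the two means and the 2x2 sample covariance entries (n - 1 divisor)."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(point, rows):
    """Classical (non-robust) Mahalanobis distance of point from the rows' centroid."""
    (mx, my), (sxx, sxy, syy) = mean_and_cov(rows)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(S) * [dx dy]^T, with the 2x2 inverse written out
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

consistent = (42, 58)     # short and light: follows the sample's height/weight trend
inconsistent = (42, 150)  # short but average weight: breaks the trend

d_consistent = mahalanobis_2d(consistent, SAMPLE)
d_inconsistent = mahalanobis_2d(inconsistent, SAMPLE)
print(f"consistent:   {d_consistent:.1f}")
print(f"inconsistent: {d_inconsistent:.1f}")
```

On these made-up numbers the consistent record still sits far from the centroid, because a 42-inch height is extreme by itself, but the inconsistent record comes out several times farther away. So the joint view separates "unusual but coherent" from "doesn't hang together". The classical mean and covariance can themselves be distorted by outliers, which is where robust estimators such as the MCD and MVE mentioned above come in.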
>>The case of outliers getting thrown into the wrong pile reminds me of
>>the story of a friend who was one of the first to take the LSAT for entry to
>>law school, and was, apparently, the first to get 800 (as the LSAT used
>>and never heard from Harvard Law because they figured it must be an error.
>I thought the problem was that whole 'robbing banks and making getaways in an
>old Duesenberg' thing that you two used to do. :-) :-)
>David L. Cassell
>3115 NW Norwood Pl.
>Corvallis OR 97330