Date: Mon, 28 Nov 2005 16:40:37 -0500
Reply-To: Talbot Michael Katz <topkatz@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Talbot Michael Katz <topkatz@MSN.COM>
Subject: Re: Data Cleaning Books (may be OT)

Okay, I'll bite.
Who has done any multivariate outlier detection with SAS? What did you
use? Michael Friendly's OUTLIER macro
(http://ftp.sas.com/samples/A56143)? PROC ROBUSTREG? CALL MCD in PROC
IML? CALL MVE in PROC IML? (I haven't tried any of these yet.) Anything
else? Is there another package you like for multivariate outlier
detection?
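
For anyone who wants to see the idea in the quoted message below on toy numbers, here is a minimal sketch in Python (not SAS) of the classical squared Mahalanobis distance for two variables. The function name mahalanobis_sq, the sample list, and every height/weight value are made up for illustration; nothing here comes from real data or from any of the SAS tools named above.

```python
from statistics import mean

def mahalanobis_sq(points, query):
    """Squared Mahalanobis distance of `query` from the two-variable cloud `points`."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = query[0] - mx, query[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, with the 2x2 inverse written out by hand.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical (height in inches, weight in pounds) records, invented for this sketch.
sample = [(58, 115), (60, 118), (62, 131), (64, 133), (65, 144), (66, 140),
          (68, 155), (69, 152), (70, 164), (72, 165), (74, 179)]

consistent = mahalanobis_sq(sample, (42, 50))     # short and light
inconsistent = mahalanobis_sq(sample, (42, 150))  # short with a typical adult weight

print(consistent < inconsistent)
```

On this toy data the 42-inch record with a matching low weight lands far closer to the cloud than the 42-inch record with an ordinary adult weight, although it is still a long way out along the height axis. The mean and covariance used here are themselves pulled around by outliers, which is the usual argument for robust estimates such as MCD or MVE.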
-- TMK --
"The Macro Klutz"
On Mon, 28 Nov 2005 12:19:58 -0800, David L Cassell
<davidlcassell@MSN.COM> wrote:
>flom@NDRI.ORG replied:
>>Depending on how you get your data entered in the first place, one
>>thing to do is to tell whoever enters it to make notes of oddities, so that
>>they aren't suspected of being mistakes.
>>
>>One of our interviewers, who has a bit of a sense of humor,
>>recorded a woman's height as 42 inches. In the margin she wrote
>>"Yes, 42 inches. She's a midget OK?"
>
>Yep. That's why univariate outlier detection methods do bad things a lot of
>the time. You can often address that with multivariate methods. After all,
>that 42" height probably goes with a pretty low weight, and a pretty tiny
>shoe size, and...
>Multivariate methods *ought* to show these are rational values as a
>multi-valued set of data. I say "ought", since I can't control what other
>people do with numbers.
>
>>The case of outliers getting thrown into the wrong pile reminds me of
>>the story of a friend who was one of the first to take the LSAT for entry
>>into law school, and was, apparently, the first to get 800 (as the LSAT
>>used to be scored) and never heard from Harvard Law because they figured
>>it must be an error.....
>
>I thought the problem was that whole 'robbing banks and making getaways in
>an old Duesenberg' thing that you two used to do. :-) :-)
>
>David
>--
>David L. Cassell
>mathematical statistician
>Design Pathways
>3115 NW Norwood Pl.
>Corvallis OR 97330