LISTSERV at the University of Georgia
Date:         Fri, 18 Jun 1999 09:16:23 +0100
Reply-To:     tra <tra@PROTEUS.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
Comments:     To: "McReynolds, Dave, VHACIN" <Dave.McReynolds@MED.VA.GOV>
From:         tra <tra@PROTEUS.CO.UK>
Organization: Proteus Molecular Design Ltd
Subject:      Re: Mahalanobis Distance
Comments: To: SAS-L@LISTSERV.VT.EDU
Content-Type: text/plain; charset=us-ascii

Dave,

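One way to do it is to use PROC PRINCOMP to do a principal component analysis. If you use the STD (STANDARD) option, the principal component scores are standardized to unit variance, and the squared Mahalanobis distance of each observation from the mean is then just the sum of squares of its principal component scores (keeping all of the components).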
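A short sketch of why this works, in case it is useful. The notation here is an assumption for illustration and is not taken from the SAS documentation: x is one observation, \bar{x} the sample mean, S the sample covariance matrix (assumed nonsingular), with eigenvectors in the columns of V and eigenvalues \lambda_1, ..., \lambda_p on the diagonal of \Lambda. All p components must be kept for the last step to hold.

```latex
\begin{align*}
S &= V \Lambda V^{\top}, \qquad V^{\top} V = I
  && \text{(eigendecomposition of the covariance matrix)} \\
z &= V^{\top}(x - \bar{x})
  && \text{(raw principal component scores)} \\
s_j &= \frac{z_j}{\sqrt{\lambda_j}}, \quad j = 1, \dots, p
  && \text{(scores standardized to unit variance)} \\
\sum_{j=1}^{p} s_j^{2}
  &= z^{\top} \Lambda^{-1} z
   = (x - \bar{x})^{\top} V \Lambda^{-1} V^{\top} (x - \bar{x}) \\
  &= (x - \bar{x})^{\top} S^{-1} (x - \bar{x}) = D^{2}
  && \text{(squared Mahalanobis distance)}
\end{align*}
```

So the sum of squares of the standardized scores is the squared distance D^2, which is the quantity usually compared with a chi-square distribution.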

I have illustrated the method using data from the sample library.

Hope this helps

Tim Auton

/****************************************************************/
/*          S A S   S A M P L E   L I B R A R Y                 */
/*                                                              */
/*    NAME: PRINCOEX                                            */
/*   TITLE: principal components,                               */
/* PRODUCT: SAS                                                 */
/*  SYSTEM: ALL                                                 */
/*    KEYS: principal components,                               */
/*   PROCS: princomp                                            */
/*    DATA:                                                     */
/*                                                              */
/*     REF:                                                     */
/*    MISC:                                                     */
/*                                                              */
/****************************************************************/

/*EXAMPLE 1*/
DATA TEMPERAT;
   TITLE2 'MEAN TEMPERATURE IN JANUARY AND JULY FOR SELECTED CITIES';
   INPUT CITY $1-15 JANUARY JULY;
   CARDS;
MOBILE          51.2 81.6
PHOENIX         51.2 91.2
LITTLE ROCK     39.5 81.4
SACRAMENTO      45.1 75.2
DENVER          29.9 73.0
HARTFORD        24.8 72.7
WILMINGTON      32.0 75.8
WASHINGTON DC   35.6 78.7
JACKSONVILLE    54.6 81.0
MIAMI           67.2 82.3
ATLANTA         42.4 78.0
BOISE           29.0 74.5
CHICAGO         22.9 71.9
PEORIA          23.8 75.1
INDIANAPOLIS    27.9 75.0
DES MOINES      19.4 75.1
WICHITA         31.3 80.7
LOUISVILLE      33.3 76.9
NEW ORLEANS     52.9 81.9
PORTLAND, MAINE 21.5 68.0
BALTIMORE       33.4 76.6
BOSTON          29.2 73.3
DETROIT         25.5 73.3
SAULT STE MARIE 14.2 63.8
DULUTH           8.5 65.6
MINNEAPOLIS     12.2 71.9
JACKSON         47.1 81.7
KANSAS CITY     27.8 78.8
ST LOUIS        31.3 78.6
GREAT FALLS     20.5 69.3
OMAHA           22.6 77.2
RENO            31.9 69.3
CONCORD         20.6 69.7
ATLANTIC CITY   32.7 75.1
ALBUQUERQUE     35.2 78.7
ALBANY          21.5 72.0
BUFFALO         23.7 70.1
NEW YORK        32.2 76.6
CHARLOTTE       42.1 78.5
RALEIGH         40.5 77.5
BISMARCK         8.2 70.8
CINCINNATI      31.1 75.6
CLEVELAND       26.9 71.4
COLUMBUS        28.4 73.6
OKLAHOMA CITY   36.8 81.5
PORTLAND, OREG  38.1 67.1
PHILADELPHIA    32.3 76.8
PITTSBURGH      28.1 71.9
PROVIDENCE      28.4 72.1
COLUMBIA        45.4 81.2
SIOUX FALLS     14.2 73.3
MEMPHIS         40.5 79.6
NASHVILLE       38.3 79.6
DALLAS          44.8 84.8
EL PASO         43.6 82.3
HOUSTON         52.1 83.3
SALT LAKE CITY  28.0 76.7
BURLINGTON      16.8 69.8
NORFOLK         40.5 78.3
RICHMOND        37.5 77.9
SPOKANE         25.4 69.7
CHARLESTON, WV  34.5 75.0
MILWAUKEE       19.4 69.9
CHEYENNE        26.6 69.1
;

PROC PLOT;
   PLOT JULY*JANUARY=CITY/VPOS=26;
run;

/* use the STD option to standardize the pc scores */
PROC PRINCOMP COV STD OUT=PRIN;
   VAR JULY JANUARY;
run;

%let ndim = 2; /* number of variables = number of pc scores */

data prin;
   set prin;
   mahdist = uss(of prin1-prin&ndim); /* calculate (squared) Mahalanobis distance from mean */
run;

/* want to compare distribution of mahdist with a chi-square */
/* use rank to estimate the cumulative distribution function -
   use nplus1 option to ensure all values strictly between 0 and 1 */
proc rank data=prin nplus1 out=prin;
   var mahdist;
   ranks _rank_;
run;

data prin;
   set prin;
   chisq = cinv(_rank_, &ndim); /* chi-square score */
run;

proc plot data=prin;
   plot mahdist*chisq=city chisq*chisq='.'/overlay;
run;

/* print 'outliers' */
proc print data=prin;
   where mahdist > 6;
run;
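A note on what the last steps compute, written out as a sketch. The symbols are assumptions for illustration: r_i is the rank of the i-th value of mahdist, n the number of observations, p = &ndim, and F is the chi-square distribution function with p degrees of freedom. The code does not say why 6 was chosen as the cutoff; the arithmetic below only shows what tail probability it would correspond to if the squared distances were roughly chi-square with 2 degrees of freedom, as they would be for bivariate normal data.

```latex
\begin{align*}
u_i &= \frac{r_i}{n + 1}
  && \text{(fractional rank from PROC RANK with NPLUS1, so } 0 < u_i < 1\text{)} \\
q_i &= F_{\chi^2_p}^{-1}(u_i)
  && \text{(chi-square quantile, i.e. } \texttt{cinv(\_rank\_, \&ndim)}\text{)} \\
\Pr\!\left(\chi^2_2 > x\right) &= e^{-x/2}
  && \text{(survival function for } p = 2\text{)} \\
\Pr\!\left(\chi^2_2 > 6\right) &= e^{-3} \approx 0.0498
\end{align*}
```

If the plot of mahdist against chisq follows the chisq*chisq reference line, the chi-square approximation is reasonable, and under that approximation about 5% of observations would be expected to exceed 6 by chance alone.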

--
T R Auton PhD MSc C.Math
Head of Biomedical Statistics
Proteus Molecular Design Ltd
Beechfield House
Lyme Green Business Park
Macclesfield
Cheshire SK11 0JL
UK
email: tra@proteus.co.uk

"McReynolds, Dave, VHACIN" wrote:

> SAS-Lers-
>
> How can I compute Mahalanobis Distance to detect multivariate outliers.
>
> I only have code for doing this on BMDPAM and have no access to it.
>
> Thanks in advance,
> Dave

