LISTSERV at the University of Georgia
Date:         Fri, 11 Jun 2004 15:02:58 -0400
Reply-To:     Steve Albert <salbert@AOL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Steve Albert <salbert@AOL.COM>
Subject:      Re: Truncating Data series

MS,

You probably don't want to actually delete those records, just omit them from the analysis.

Here are a few approaches that come to mind:

1. Use proc univariate to generate the 1st and 99th percentile values, then either merge that on or hard code it so you can use where-clauses to restrict the data to what you're looking for.

2. Sort the data on your key variable, then assign every record a percentile:

   data withpctl;
      set sorteddata;
      frac = _n_/3000000;  * or whatever your exact count is;
   run;

Now you can use where clauses to trim 1%, or 5%, or .01%, or whatever you want; e.g.

   %let lowlim=.01;
   %let uplim=.99;
   proc whatever data=withpctl(where=(&lowlim < frac and frac < &uplim));
      * proc details;
      title3 "Trimmed data -- lower pctile &lowlim, upper pctile &uplim";
   run;
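For the first approach, a minimal sketch might look like the following. The data set name mydata, the variable x, the output data set pctls, and the use of proc means as the analysis step are all placeholders of mine, not anything from the original question. The percentile values are passed along as macro variables rather than merged on, so they can be used directly in a where clause.

```sas
/* Hypothetical names: mydata (input), x (key variable), pctls (output). */
proc univariate data=mydata noprint;
   var x;
   /* pctlpts=1 99 with pctlpre=p creates variables p1 and p99 */
   output out=pctls pctlpts=1 99 pctlpre=p;
run;

/* Copy the two cutoffs into macro variables for use in where clauses */
data _null_;
   set pctls;
   call symputx('p1',  p1);
   call symputx('p99', p99);
run;

/* proc means stands in for whatever analysis you actually run */
proc means data=mydata(where=(&p1 < x and x < &p99));
   var x;
   title3 "Trimmed data -- x between &p1 and &p99";
run;
```

Whether the cutoffs themselves are kept (<= rather than <) is a choice you would need to make; with heavily tied data it can change how many records are dropped.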

I'm assuming that there's only one variable of concern for the trimming, though the second method is readily extended to trimming on more than one dimension. It also lets you readily investigate the robustness of the results to changes in your trimming rule. (You might also want to see what Winsorizing does; see the recent thread on how to do that.)
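One way to sketch the extension to two variables is shown here. Note that it swaps in proc rank with the fraction option in place of the sort-and-count step above, since that produces a fractional rank for each variable in one pass. The names mydata, x, y, fx, and fy are hypothetical.

```sas
/* Hypothetical names: mydata (input), x and y (variables to trim on). */
%let lowlim=.01;
%let uplim=.99;

/* fraction divides each rank by the number of nonmissing values, */
/* so fx and fy fall in the range (0, 1]                           */
proc rank data=mydata out=ranked fraction;
   var x y;
   ranks fx fy;
run;

/* Keep only records that fall inside the limits on both variables */
proc means data=ranked(where=(&lowlim < fx and fx < &uplim and
                              &lowlim < fy and fy < &uplim));
   var x y;
   title3 "Trimmed on x and y -- lower pctile &lowlim, upper pctile &uplim";
run;
```

Tied values share a rank under proc rank's default tie handling, so the fraction trimmed may differ slightly from what the sort-based version gives.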

By the way, I'd suggest you do some exploration to see how sensitive any results are to your trimming. If the results are very sensitive to your treatment of outliers, then I'd recommend you look at the data very carefully before drawing any conclusions.
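A quick way to run that kind of sensitivity check, assuming the withpctl data set and frac variable from the second approach, is to wrap the analysis in a small macro and call it with several limits. The macro name trimcheck, the variable x, and the choice of proc means are illustrative only.

```sas
/* Hypothetical macro; withpctl and frac come from the second approach. */
%macro trimcheck(lowlim, uplim);
   proc means data=withpctl(where=(&lowlim < frac and frac < &uplim))
              n mean std min max;
      var x;   /* x is a placeholder for your analysis variable */
      title3 "Trimmed data -- lower pctile &lowlim, upper pctile &uplim";
   run;
%mend trimcheck;

/* Compare results under progressively tighter trimming rules */
%trimcheck(.001, .999)
%trimcheck(.01,  .99)
%trimcheck(.05,  .95)
```

If the summary statistics move a lot between calls, that is the signal to look hard at the records in the tails before trusting any one trimming rule.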

Steve Albert

