LISTSERV at the University of Georgia
Date:   Mon, 14 Mar 2011 13:52:24 -0400
Reply-To:   "Kirby, Ted" <ted.kirby@LEWIN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Kirby, Ted" <ted.kirby@LEWIN.COM>
Subject:   Re: PROC RANK Percentiles vs. PROC UNIVARIATE Percentiles
Content-Type:   text/plain; charset="us-ascii"

I realized that one of my statements below is incorrect: I suggested that the lower percentiles (ranks) given by PROC RANK, compared with those from PROC UNIVARIATE, might be due to PROC UNIVARIATE displaying the mid-point of the percentile.

The specific case I had in mind involved another measure: PROC RANK places the value 8.02139 in the 73rd percentile (of a 26-observation dataset), while PROC UNIVARIATE lists that exact value as the 75th percentile of the distribution.

-----Original Message-----
From: Kirby, Ted
Sent: Monday, March 14, 2011 1:24 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: PROC RANK Percentiles vs. PROC UNIVARIATE Percentiles

Is there a way to make PROC RANK percentiles (i.e., PROC RANK with GROUPS=100, as indicated in the documentation for PROC RANK) match the percentiles generated by PROC UNIVARIATE on relatively sparse datasets? For example, I have a 30-observation dataset in which we want to report to each member their percentile rank for a particular measure of performance. Using PROC RANK I can generate the following table (sorted by percentile):

Measure  Percentile
      6           4
      6           4
     15           9
     19          12
     40          16
     43          19
     59          24
     59          24
     70          29
     76          32
     97          35
    103          38
    109          41
    111          45
    116          48
    118          51
    119          56
    119          56
    131          62
    131          62
    158          67
    165          70
    178          74
    299          77
    310          80
    334          83
    358          87
    825          90
  1,589          93
  1,614          96
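The table above can be reproduced outside SAS with a short Python sketch. It assumes PROC RANK is run with its default TIES=MEAN and that GROUPS=k assigns each observation the value FLOOR(rank*k/(n+1)), which is the formula given in the PROC RANK documentation. The function names `mean_ranks` and `rank_groups` are made up for this illustration and are not SAS names.

```python
import math

def mean_ranks(values):
    """Ascending 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the last position holding the same value as position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2  # mean of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def rank_groups(values, groups=100):
    """Group number per observation: floor(rank * groups / (n + 1))."""
    n = len(values)
    return [math.floor(r * groups / (n + 1)) for r in mean_ranks(values)]

# The 30 measure values from the table above.
measure = [6, 6, 15, 19, 40, 43, 59, 59, 70, 76, 97, 103, 109, 111, 116,
           118, 119, 119, 131, 131, 158, 165, 178, 299, 310, 334, 358,
           825, 1589, 1614]

for value, group in zip(measure, rank_groups(measure)):
    print(value, group)
```

Under these assumptions the value 825 has rank 28, and 28*100/31 is about 90.3, so it lands in group 90. The divisor n+1 also means the top observation in a 30-row dataset gets group 96 rather than 99.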

However, the PROC UNIVARIATE distribution for this dataset is as follows:

Quantile      Estimate

100% Max        1614.0
99%             1614.0
95%             1589.0
90%              591.5
75% Q3           178.0
50% Median       117.0
25% Q1            59.0
10%               17.0
5%                 6.0
1%                 6.0
0% Min             6.0
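These estimates are consistent with PROC UNIVARIATE's default percentile definition (PCTLDEF=5, the empirical distribution function with averaging). The Python sketch below assumes that default: write n*p = j + g with j the integer part; if g = 0 the estimate is the average of the j-th and (j+1)-th ordered values, otherwise it is the (j+1)-th ordered value. The function name `pctldef5` is made up for this illustration, and percentiles are passed as whole numbers so the test for g = 0 avoids floating-point rounding.

```python
def pctldef5(values, pct):
    """Percentile estimate, empirical distribution function with averaging.

    pct is an integer from 0 to 100, so n*p = j + g is computed exactly.
    """
    x = sorted(values)
    n = len(x)
    j, rem = divmod(n * pct, 100)  # j = integer part of n*p; rem != 0 means g > 0
    if j == 0:
        return x[0]                # at or below the first ordered value
    if j >= n:
        return x[-1]               # 100th percentile is the maximum
    if rem == 0:
        return (x[j - 1] + x[j]) / 2  # g = 0: average x_j and x_(j+1), 1-based
    return x[j]                       # g > 0: take x_(j+1), 1-based

# The same 30 measure values as in the PROC RANK table.
measure = [6, 6, 15, 19, 40, 43, 59, 59, 70, 76, 97, 103, 109, 111, 116,
           118, 119, 119, 131, 131, 158, 165, 178, 299, 310, 334, 358,
           825, 1589, 1614]

for pct in (100, 99, 95, 90, 75, 50, 25, 10, 5, 1, 0):
    print(pct, pctldef5(measure, pct))
```

For the 90th percentile, n*p = 30*0.90 = 27 exactly, so the estimate is the average of the 27th and 28th ordered values, (358 + 825)/2 = 591.5. It is a cut point between two observations, not a value that occurs in the data.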

Notice that PROC RANK puts 825 into the 90th percentile, but PROC UNIVARIATE indicates that the 90th percentile is 591.5.

As I was typing this message, I think I figured it out. The 90th percentile can be considered as a RANGE OF VALUES and thus 825 IS in the 90th percentile because it is between 591.5 and <some other value> that would be the 91st percentile of this distribution. Is that correct?

Another question that occurred to me as I was typing is:

Are the values in the Quantile section of the PROC UNIVARIATE output the lower bound, mid-point or upper bound of the percentile?

The reason I ask is that I have other examples where the PROC RANK percentile is lower than the PROC UNIVARIATE percentile, and that would be explained if the values in the Quantile section were the midpoints of the percentile.

(I know that there may be a difference of one in the percentiles generated by the two procedures since PROC RANK uses 0 to 99 as the minimum and maximum ranks, but PROC UNIVARIATE goes from 0 to 100 when reporting percentiles. However, the discrepancies in percentiles/ranks about which I am concerned are greater than one.)