"data _null_;" <datanull@GMAIL.COM> sagely replied:
> I'm not sure exactly what you mean but when I need to group values into
> percentiles I look to PROC RANK with the GROUPS= option. The following
> bit of code may be helpful.
>
> data work.scores;
> do student = 1 to 1e4;
> score = abs(rannor(12345));
> output;
> end;
> run;
> proc rank data=work.scores out=work.scores groups=100;
> var score;
> ranks percentile;
> run;
> proc sort data=work.scores;
> by percentile score;
> run;
> data work.boundary;
> set work.scores;
> by percentile;
> if first.percentile or last.percentile;
> run;
> proc print;
> run;
This is a nice piece of code which demonstrates just what other
responders have been hinting at.
And it points up one thing which I forgot to say. With the GROUPS=
option, PROC RANK numbers the groups from zero on up, rather than
from 1 on up. So with GROUPS=100 your 'percentiles' will be marked
as 0 to 99, rather than the more typical 1 to 100. Be sure to add 1
to the percentile values before you use them.
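A minimal sketch of that adjustment, assuming the WORK.SCORES data set
and the PERCENTILE variable from the quoted code. The output data set
name WORK.SCORES_1TO100 is just a name I picked for illustration.

```sas
/* Shift PROC RANK's 0-99 group numbers to the conventional 1-100.   */
/* work.scores_1to100 is a hypothetical name, not from the original. */
data work.scores_1to100;
   set work.scores;
   percentile = percentile + 1;
run;
```

Writing to a new data set keeps the original 0-to-99 values around in
case you need to compare against the raw PROC RANK output.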
Oh dear, I forgot about ties too. How do you want to handle the
case where you have several records with the same value, and
those records would get cut between two percentiles? You can
control this behavior using the TIES= option in PROC RANK, but
you have to decide how you want things to work first.
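As one possibility, if you decide tied scores should take the lowest
of their ranks, something like the following might do. This is only a
sketch: WORK.RANKED and PCTL_LOW are names I made up, and TIES=LOW is
just one of the available choices (the others are HIGH, MEAN and DENSE).

```sas
/* Sketch only: tied scores get the smallest of their ranks before   */
/* the ranks are cut into 100 groups.                                */
/* work.ranked and pctl_low are hypothetical names.                  */
proc rank data=work.scores out=work.ranked groups=100 ties=low;
   var score;
   ranks pctl_low;
run;
```

Try it with a couple of the TIES= settings on your own data and compare
the group boundaries before you settle on one.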
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician