Date: Wed, 15 Jul 2009 14:23:36 -0400 "Keintz, H. Mark" "SAS(r) Discussion" "Keintz, H. Mark" Re: PROC RANK To: Joe Matise text/plain; charset="us-ascii"

Shaheen:

Joe's solution works, no doubt about it, but if you need your data returned to its original order, you'd have to do a second sort after the "data want" step.

Here's a single step that takes a little more programming, but preserves original order. It adds the variable GROUP to whatever your original data is (GROUP = 1 to Ngroups).

It allocates the "extra" records to the lower groups, but if you want to allocate to the upper groups, replace do _i = 1 to _gsiz + (_g <= _nx); with do _i = 1 to _gsiz + (_g > (_ng - _nx) );

Downside: a bit more programming.

Upside: One step, less input/output activity.

data have (keep=var extra_var); do i=1 to 506; var=ranuni(0980985); extra_var='extra'; output; end; run;

data want2 (drop=_:);

set have ; ** To get attribute of VAR for hash table use **;

** For storing unique values of VAR in ascending order **; declare hash v (ordered:'ascending'); v.definekey('var'); v.definedata('var','group'); v.definedone(); declare hiter vi ('v');

do until (last_have); set have (keep=var) end=last_have nobs=nrecs; _rc=v.add(); end;

** Now that we have all values, set variable GROUP in hash table V **; _ng=5; ** Number of groups wanted **; _nx=mod(nrecs,_ng); ** N groups getting 1 "extra" record **; _gsiz=floor(nrecs/_ng); ** "Normal" group size **; do _g =1 to _ng; do _i=1 to _gsiz + (_g <= _nx); ** Extra recs allocated to lower group **; _rc=vi.next(); group=_g; _rc=v.replace(); end; end;

** Reread HAVE, retrieve the appropriate GROUP, and output**; do until (last_have2); set have end=last_have2; _rc=v.find(); output; end;

stop; run;

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > Joe Matise > Sent: Wednesday, July 15, 2009 11:33 AM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Re: PROC RANK > > Any reason not to just do this in a datastep?? > > I cheat a little bit to avoid the issue of what to do with the > extraneous > records for when ncount is not evenly divisible by 5. There are lots > of > other ways of dealing with it. This way will give four even groups and > one > group with the remainder (so, 508 records gives 101,101,101,101,104). > > data have; > do _n_ = 1 to 506; > ranking_field = ranuni(7); > output; > end; > run; > proc sort data=have; > by ranking_field; > run; > data want; > set have nobs=ncount; > by ranking_field; > if _n_ = 1 then group=0; > if mod(_n_,floor(ncount/5)) = 1 and _n_ < (ncount-5) then group+1; > run; > proc freq data=want; > tables group; > run; > > -Joe > > On Tue, Jul 14, 2009 at 4:37 PM, Shaheen <siqubal@gmail.com> wrote: > > > Hi, > > I have a dataset with 506 record and I am using PROC Rank to Rank > this > > data into quintiles. I was expecting to see 101-101-101-101-102 > > observation or 102-101-101-101-101 but instead I get > > 100-102-102-100-100. Ranking field has no missing value and no > > duplicates. Is there a way to get the ranking I want? > > > > Regards, > > Shaheen > > > > PROC SORT DATA=InvSys.Test1; > > By Ranking_Field; > > run; > > > > PROC RANK DATA=InvSys.Test OUT=InvSys.Test1 groups=5; > > VAR Ranking_Field; > > RANKS Rank; > > run; > > > > > > > > PROC SQL; > > CREATE TABLE InvSys.Test2 > > AS > > ( > > SELECT Rank,Count(Rank) > > FROM InvSys.Test1 > > Group By Rank > > ); > > quit; > >

Back to: Top of message | Previous page | Main SAS-L page