LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 2009, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 9 Jul 2009 11:17:20 -0400
Reply-To:     "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Subject:      Re: Grouping similar codes occuring across different states
Comments: To: chumba <vikas.dharamsattu@GMAIL.COM>
In-Reply-To:  <5aa1e850-8b7d-4357-8cdb-5c7b2a0d85af@k1g2000yqf.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"

Chumba says:

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > chumba > Sent: Thursday, July 09, 2009 9:37 AM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Grouping similar codes occuring across different states > > Hi, > > I have a variable state and a variable code in my data, > > each state can have different codes and vice versa such that codes > like AA.AAA,TR,AS,UC,FA, etc. can occur > for multiple states. > > I need to re configure my data in such a way that all the similar > occuring codes are grouped according to their > corresponding states in descending order of their frequencies. > > Could anyone please help me out with this, > > Thanks in advance.

Chumba:

Assuming data set HAVE has only one record for each STATE/CODE combination, the code below should do what you want.

It does the following:

1. Reads in every record and puts it in a hash table HAVE keyed by N (the incoming record number). While reading data, it tracks the frequency of each code and maintains it in a hash table CF, keyed by CODE.

2. Declare a new hash table FINAL keyed by descending CODE_FREQ CODE STATE.

3. Read each item from HAVE, get corresponding CODE_FREQ from CF, and add it to the properly ordered hash table FINAL.

In this process I multiply CODE_FREQ by -1. This allows declaring FINAL in ascending order, yet starting with the highest observed frequency and proceeding to the lowest frequency, and also yielding STATEs in alphabetic order for each CODE.

If I didn't multiple CODE_FREQ by -1, I'd have to put FINAL in descending order, forcing a reverse alphabetic list of STATE for each CODE (and reverse alpha list of CODE for each CODE_FREQ).

Code follows:

data _null_;

set have (obs=1) ; ** Gets lengths of all variables for hash object HAVE **;

declare hash have (); have.definekey('n'); have.definedata('code','state','otherdata'); have.definedone();

length code_freq 8;

declare hash cf (hashexp:4); cf.definekey('code'); cf.definedata('code_freq'); cf.definedone();

do n=1 by 1 until (end_of_have); set have end=end_of_have; rc=have.add(); if cf.find()=0 then code_freq=code_freq+1; else code_freq=1; rc=cf.replace(); end;

nrecs=n;

declare hash final(hashexp:4,ordered:'ascending'); final.definekey('neg_code_freq','code','state'); final.definedata('code_freq','code','state','otherdata'); final.definedone();

do n=1 to nrecs; rc=have.find(); rc2=cf.find(); neg_code_freq = -1*code_freq ; rc3=final.add(); end;

rc=final.output(dataset:'want'); run;

Regards, Mark


Back to: Top of message | Previous page | Main SAS-L page