LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2009, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 26 Mar 2009 23:59:22 -0400
Reply-To:     dynamicpanel@YAHOO.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         dynamicpanel@YAHOO.COM
Subject:      Re: Groupings

Okay, I understood it as to flag the record as 1 if *more than one value of ID_A appears in a combined unit of ID_B, ID_C*.

anyway, it doens't matter, we all have fun coding

On Thu, 26 Mar 2009 22:22:48 -0400, Muthia Kachirayan <muthia.kachirayan@GMAIL.COM> wrote:

>Randy, > >Dan's solution follows the OP's requirement: > >==> So if ID_B with ID_C as one unit appears in *more than one ID_A then >the >flag should be 1 or else the flag should be 0*. Please note that it is a >huge dataset. <== > >A simple modification can eliminate the PROC SORT. Further the dataset can >be scanned only twice as in code posted by dynamicpanel. The hash table can >store only the unique key(ID_B and ID_C) with COUNT as data part - hopefully >the OP's huge dataset may fit into memory. Here is the code. As external >sort is eliminated, this code is expected to be faster. > >data want(drop = count); >if _n_ = 1 then do; > declare hash h(); > h.definekey('id_b','id_c'); > h.definedata('count'); > h.definedone(); > > do until(eof); > set sample end = eof; > if h.find() ne 0 then count = 0; > count ++ 1; > h.replace(); > end; >end; > set sample; > h.find(); > flag = 0; > if count > 1 then flag = 1; >run; > >Muthia Kachirayan > > >On Thu, Mar 26, 2009 at 9:12 PM, <dynamicpanel@yahoo.com> wrote: > >> My previous post has a error, please refer to this one which deals with >> more generic situation. I also trickled the original sample data: I added >> in a duplicate record for the first combination: 1 A XA >> >> /***/ >> data sample; >> input ID_A ID_B $ ID_C $; >> cards; >> 1 A XA >> 1 A XA >> 1 B XA >> 1 C XA >> 2 A XA >> 2 E XB >> 2 F XA >> 3 C XA >> 3 F XC >> 4 F XD >> ; >> run; >> >> proc sort data=sample; by ID_B ID_C ID_A; run; >> >> >> data want; >> set sample; by ID_B ID_C; >> *retain last_g ; >> retain flag 0; >> *j=_n_; >> if first.ID_C & last.ID_C then flag=0; >> else do; >> if flag=0 then do; >> j=_n_+1; IDB=ID_B; IDC=ID_C; eog=0; >> do while(eog=0); >> set sample( rename=(ID_A=IDA ID_B=IDB ID_C=IDC)) >> point=j; >> if IDB=ID_B & IDC=ID_C then do; if >> ID_A^=IDA then flag=1; end; >> else eog=1; >> j+1; >> end; >> end; >> end; >> IDC1=first.ID_C; IDC2=last.ID_C; >> drop IDA IDB IDC eog; >> run; >> >> Final View: >> ID_A ID_B ID_C flag >> 1 A XA 1 >> 1 A XA 1 >> 2 A XA 1 >> 1 B XA 0 >> 1 C XA 1 >> 3 C XA 1 >> 2 E XB 0 >> 2 F XA 0 >> 3 F XC 0 >> 4 F XD 0 >> >> /*******************/ >> *If we change the data into; >> >> data sample; >> input ID_A ID_B $ ID_C $; >> cards; >> 1 A XA >> 1 A XA >> 1 B XA >> 1 C XA >> 2 E XB >> 2 F XA >> 3 C XA >> 3 F XC >> 4 F XD >> ; >> run; >> >> Pattern 1 A XA should have flag=0, but Dan's program assigns flag=1. My >> program works correctly. >>


Back to: Top of message | Previous page | Main SAS-L page