LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 2011, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Fri, 22 Jul 2011 08:54:30 -0400
Reply-To:   Jim Groeneveld <jim.1stat@YAHOO.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Jim Groeneveld <jim.1stat@YAHOO.COM>
Subject:   Re: Identifyiing groups

Hi Randy,

Sorry, I meant to say: there is NO rule of thumb.

Regards - Jim. >-- >Jim Groeneveld, Netherlands >Statistician/SAS consultant >http://jim.groeneveld.eu.tf

On Fri, 22 Jul 2011 08:48:41 -0400, Jim Groeneveld <jim.1stat@YAHOO.COM> wrote:

>Hi Randy, > >There is rule of thumb; it is dependent of many circumstances. >You just have to test a few different approaches (algorithms) >with a moderate set of data. A (small) data step approach is: > >DATA Flags; > MERGE Events /* assumed sorted by Date, ID_A, ID_B, ID_C */ > Events (RENAME=(Event=Event_Flag) WHERE=(Event_Flag EQ 1)); > BY Date ID_A ID_B ID_C; > IF (MISSING(Event_Flag)) THEN Event_Flag = 0; >RUN; > >This merges Events with a (slightly modified) subset of itself, >while initialising Event_Flag to missing for each new group, >possibly reading its group value for the first record of a group >and retaining Event_Flag values of 1 over a whole group >(group made of unique values for Date, ID_A, ID_B and ID_C). > >This can also be interpreted and explained as merging a single >(or a few) group record(s) multiple times with many group records. > >Regards - Jim. >-- >Jim Groeneveld, Netherlands >Statistician/SAS consultant >http://jim.groeneveld.eu.tf > > >On Fri, 22 Jul 2011 07:21:18 -0400, Randy <randistan69@HOTMAIL.COM> wrote: > >>Dear All: >> The data set that I am working with is very large. Will a data step >>solution be better? >> Randy >> >> >>On Fri, 22 Jul 2011 04:47:33 -0400, Jim Groeneveld <jim.1stat@YAHOO.COM> wrote: >> >>>Hi Randy, >>> >>>There are many solutions possible; one using PROC SQL is: >>> >>>DATA Events; >>> INFILE CARDS TRUNCOVER; >>> INPUT Date $ ID_A $ ID_B $ ID_C $ Event; * (ID_A also read as character); >>> CARDS; >>>01JUL 1 X A >>>01JUL 1 X A >>>01JUL 1 X A >>>01JUL 1 X B >>>01JUL 1 X B 1 >>>01JUL 1 X B >>>01JUL 2 X A >>>01JUL 2 X A >>>01JUL 2 X A >>>02JUL 1 Y C >>>02JUL 1 Y C >>>02JUL 1 Y C 1 >>>; >>>RUN; >>> >>>* Corrections of inconsistencies; >>>* - O1JUL (letter O) changed to 01JUL (digit 0); >>>* - removed Event=1 for 01Jul/2/X/A as in result; >>> >>>TITLE "Dataset = Events"; PROC PRINT DATA=Events; RUN; >>> >>>PROC SQL; >>> CREATE TABLE Flags AS >>> SELECT *, >>> CASE MAX(Event) >>> WHEN . THEN 0 >>> ELSE 1 >>> END AS Flag_Event >>> FROM Events >>> GROUP BY Date, ID_A, ID_B, ID_C; >>>QUIT; >>> >>>TITLE "Dataset = Flags"; PROC PRINT DATA=Flags; RUN; >>> >>>Regards - Jim. >>>-- >>>Jim Groeneveld, Netherlands >>>Statistician, SAS consultant >>>http://jim.groeneveld.eu.tf >>> >>> >>>On Thu, 21 Jul 2011 23:12:25 -0400, Randy <randistan69@HOTMAIL.COM> wrote: >>> >>>>Dear all: >>>> >>>>My data is as follows >>>> >>>>Date ID_A ID_B ID_C Event >>>>O1JUL 1 X A >>>>01JUL 1 X A >>>>01JUL 1 X A >>>>01JUL 1 X B >>>>01JUL 1 X B 1 >>>>01JUL 1 X B >>>>01JUL 2 X A >>>>01JUL 2 X A 1 >>>>01JUL 2 X A >>>>02JUL 1 Y C >>>>02JUL 1 Y C >>>>02JUL 1 Y C 1 >>>> >>>>The data that I want is as follows: >>>> >>>>Date ID_A ID_B ID_C Event Flag_Event >>>>O1JUL 1 X A 0 >>>>01JUL 1 X A 0 >>>>01JUL 1 X A 0 >>>>01JUL 1 X B 1 >>>>01JUL 1 X B 1 1 >>>>01JUL 1 X B 1 >>>>01JUL 2 X A 0 >>>>01JUL 2 X A 0 >>>>01JUL 2 X A 0 >>>>02JUL 1 Y C 1 >>>>02JUL 1 Y C 1 >>>>02JUL 1 Y C 1 1 >>>> >>>>So whenever an event has occurred, I want the event to be flagged for those >>>ID's >>>> >>>> Thanks >>>> Randy


Back to: Top of message | Previous page | Main SAS-L page