LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2010, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Fri, 27 Aug 2010 20:00:28 +0000
Reply-To:   "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Subject:   Re: How to recognize a pattern and assign a group
Comments:   To: Michael Vostrich <mvostrich@MAIL.RU>
In-Reply-To:   <201008271912.o7RBZG9i002305@mailgw.cc.uga.edu>
Content-Type:   text/plain; charset="utf-8"

Michael:

1. Decide on a gap size such that: - dates separated by more than that gap belong to different groups. - dates separated by less than or equal to gap belong to the same group.

In your case it looks like a gap size of 10 will work.

2. Sort your data by date into dataset HAVE. proc sort data=mydata out=have; by date test; run;

3. run a program like this, which assumes that the satisfactory GAP size is 10 days:

data want; retain groupnum 0 gap 10; set have; lagdate=lag(date); if _n_=1 or (date-lagdate) > gap then groupnum=groupnum+1; run;

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > Michael Vostrich > Sent: Friday, August 27, 2010 3:12 PM > To: SAS-L@LISTSERV.UGA.EDU > Subject: How to recognize a pattern and assign a group > > Dear SAS-L experts, > > I would greatly appreciate you help in finding a solution to a problem > of > grouping the observations in the following situation. I have tests that > were > performed in several groups, but the group number is missing. I know > that > the tests in a group were performed on “seemingly consecutive” > days, while > groups were separated by several weeks (at least, this is true within > the > same test number). I need to group the records by assigning the group > number > to a group of tests for the same test number. > The example dataset is dataset A below. I calculated the duration, > which > gives me a very visual pattern (see the proc print), and I will > appreciate > your help with the code that would create GROUPID, which I just added > manually myself to explain what I need (GROUPID serves to group the > "similar > tests" within the same test number). > > Thank you in advance! > Michael > > data a; > format testdate date.; > input testno testdate date.; > cards; > 1 15mar2010 > 1 16mar2010 > 1 17mar2010 > 1 31mar2010 > 1 01apr2010 > 1 02apr2010 > 1 20apr2010 > 1 22apr2010 > 2 13mar2010 > 2 15mar2010 > 2 30mar2010 > 2 31mar2010 > 2 01apr2010 > 2 02apr2010 > 2 20apr2010 > 2 24apr2010 > ; > > data b (drop=_:); > set a ; > by testno testdate; > _temp=lag(testdate); > if first.testno then _temp=.; > duration=testdate - _temp; > run; > proc print data=b; run; > > testdate testno duration GROUPID > manually added > > 15MAR02 1 . 1 > 16MAR02 1 1 1 > 17MAR02 1 1 1 > 31MAR02 1 14 2 > 01APR02 1 1 2 > 02APR02 1 1 2 > 20APR02 1 18 3 > 22APR02 1 2 3 > 13MAR02 2 . 1 > 15MAR02 2 2 1 > 30MAR02 2 15 2 > 31MAR02 2 1 2 > 01APR02 2 1 2 > 02APR02 2 1 2 > 20APR02 2 18 3 > 24APR02 2 4 3


Back to: Top of message | Previous page | Main SAS-L page