Date: Fri, 27 Aug 2010 15:12:18 -0400
Reply-To: Michael Vostrich <mvostrich@MAIL.RU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Michael Vostrich <mvostrich@MAIL.RU>
Subject: How to recognize a pattern and assign a group
Content-Type: text/plain; charset=ISO-8859-1
Dear SAS-L experts,
I would greatly appreciate you help in finding a solution to a problem of
grouping the observations in the following situation. I have tests that were
performed in several groups, but the group number is missing. I know that
the tests in a group were performed on “seemingly consecutive” days, while
groups were separated by several weeks (at least, this is true within the
same test number). I need to group the records by assigning the group number
to a group of tests for the same test number.
The example dataset is dataset A below. I calculated the duration, which
gives me a very visual pattern (see the proc print), and I will appreciate
your help with the code that would create GROUPID, which I just added
manually myself to explain what I need (GROUPID serves to group the "similar
tests" within the same test number).
Thank you in advance!
Michael
data a;
format testdate date.;
input testno testdate date.;
cards;
1 15mar2010
1 16mar2010
1 17mar2010
1 31mar2010
1 01apr2010
1 02apr2010
1 20apr2010
1 22apr2010
2 13mar2010
2 15mar2010
2 30mar2010
2 31mar2010
2 01apr2010
2 02apr2010
2 20apr2010
2 24apr2010
;
data b (drop=_:);
set a ;
by testno testdate;
_temp=lag(testdate);
if first.testno then _temp=.;
duration=testdate - _temp;
run;
proc print data=b; run;
testdate testno duration GROUPID
manually added
15MAR02 1 . 1
16MAR02 1 1 1
17MAR02 1 1 1
31MAR02 1 14 2
01APR02 1 1 2
02APR02 1 1 2
20APR02 1 18 3
22APR02 1 2 3
13MAR02 2 . 1
15MAR02 2 2 1
30MAR02 2 15 2
31MAR02 2 1 2
01APR02 2 1 2
02APR02 2 1 2
20APR02 2 18 3
24APR02 2 4 3