Date: Wed, 12 May 2004 21:22:55 -0400
Reply-To: Cornel Lencar <clencar@INTERCHANGE.UBC.CA>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Cornel Lencar <clencar@INTERCHANGE.UBC.CA>
Subject: determining first instance of two consecutive identical values of
a variable by group
Hi,
I have a problem that is rather cumbersome for me. I had an initial push
from Roger Lustig (many thanks), but it seems that juice was not enough
for my engine. Hoping that the subject line is explanatory enough, I will
reproduce my initial message posted to this list (with a crop of only one
reply) and the resulting (but not yet workable) code...
Any help would be most welcome,
Sincerely, Cornel
> I have a data manipulation problem that is really heavy duty for me, so
> I dare to ask the list for help and guidance. Basically, I have a set
> of people for whom hearing tests were taken at approximately one-year
> intervals. The number of tests differs by individual, varying from 1
> to 26 per person. Each ear is flagged according to whether or not a
> hearing loss is observed. A flag in fact represents a short term shift
> (STS).
>
> If this first STS is followed by another STS, then we have a
> confirmation of hearing loss, and we can flag the first observation
> where the STS happened as such and retain it, along with the date of
> occurrence. If by any chance (errors can happen) a person has a
> batch of two consecutive STS occurrences, then no shift, then another
> batch of two, only the first batch should be considered, and the first
> observation of the two marked as a hearing loss. So for each
> individual, for each ear, the onset of HL will occur only once.
>
> So, if my data set looks like this:;*/;
> data audio;
> input ID STS_R STS_L TestDate : mmddyy10.;
> format testdate MMDDYY10.;
> cards;
> 1 0 0 05/14/1987
> 1 0 0 06/11/1988
> 1 0 0 06/12/1989
> 1 1 0 04/17/1990
> 1 0 1 06/21/1991
> 1 0 1 05/19/1992
> 1 1 1 07/12/1993
> 1 1 1 06/16/1994
> 1 1 1 07/06/1995
> 2 0 0 06/22/1990
> 2 1 0 07/11/1991
> 2 0 1 06/18/1992
> 2 1 0 06/23/1993
> 2 1 1 04/15/1994
> 2 0 1 06/09/1995
> 2 1 1 05/12/1996
> ;
>
> *the output should contain two more variables, HL_R HL_L, that will
> indicate the onset of the hearing loss, and for the test data would
> look like this:
>
> Obs  ID  STS_R  STS_L  TestDate    HL_R  HL_L
>   1   1    0      0    05/14/1987    0     0
>   2   1    0      0    06/11/1988    0     0
>   3   1    0      0    06/12/1989    0     0
>   4   1    1      0    04/17/1990    0     0
>   5   1    0      1    06/21/1991    0     1
>   6   1    0      1    05/19/1992    0     0
>   7   1    1      1    07/12/1993    1     0
>   8   1    1      1    06/16/1994    0     0
>   9   1    1      1    07/06/1995    0     0
>  10   2    0      0    06/22/1990    0     0
>  11   2    1      0    07/11/1991    0     0
>  12   2    0      1    06/18/1992    0     0
>  13   2    1      0    06/23/1993    1     0
>  14   2    1      1    04/15/1994    0     1
>  15   2    0      1    06/09/1995    0     0
>  16   2    1      1    05/12/1996    0     0
And here is the code:
data audio1;
  set audio;
  TestDate1 = INPUT(TestDate, MMDDYY10.);
  keep id STS_R STS_L TestDate1;
run;
*proc print data=audio1;run;
data audio2;
  set audio1;
  rename TestDate1=TestDate;
  format TestDate mmddyy10.;
run;
data audio2;
  set audio2;
  format TestDate mmddyy10.;
run;
*proc print data=audio2;run;
data losses;
  do test=1 by 1 until (last.id);
    set audio2;
    by id;
    *** First short-term shift;
    if loss_date_left=. and first_test_left=. and sts_l=1 then do;* ;
      first_test_left=test;
      loss_date_left=testdate;
      *date_first_test_left=testdate;
    end;
    *<same for the right ear>;
    *** second short-term shift, if consecutive;
    else if loss_date_left=. and sts_l=1 and first_test_left=test -1 then do;
      *loss_left=1;
      loss_date_left=testdate;
    end;
    *<same for the right ear>;*/
    *** Negatives and false alarms;
    *else first_test_left=.;
    if last.id then output;
  end;
  format loss_date_left MMDDYY10.;
run;
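
For comparison, here is a minimal sketch of the rule described in the quoted message (flag only the first observation of the first pair of consecutive STS, per person and per ear). It is one possible approach, not a tested production solution. It assumes TestDate is read as a numeric SAS date and that the data can be sorted by ID and TestDate. The names hl, onset_R, onset_L, prev_R and prev_L are made up for this illustration. It reads each ID twice: the first loop finds the test number where the first run of two STS starts, and the second loop writes the flags.

```sas
/* Sample data from the post; TestDate is read as a numeric SAS date. */
data audio;
  input ID STS_R STS_L TestDate : mmddyy10.;
  format TestDate mmddyy10.;
  cards;
1 0 0 05/14/1987
1 0 0 06/11/1988
1 0 0 06/12/1989
1 1 0 04/17/1990
1 0 1 06/21/1991
1 0 1 05/19/1992
1 1 1 07/12/1993
1 1 1 06/16/1994
1 1 1 07/06/1995
2 0 0 06/22/1990
2 1 0 07/11/1991
2 0 1 06/18/1992
2 1 0 06/23/1993
2 1 1 04/15/1994
2 0 1 06/09/1995
2 1 1 05/12/1996
;
run;

/* BY-group processing below needs the tests in order within each ID. */
proc sort data=audio;
  by ID TestDate;
run;

data hl;
  /* Hypothetical helper variables, reset for every ID. */
  onset_R = .; onset_L = .;
  prev_R = 0;  prev_L = 0;

  /* Pass 1: find, per ear, the test number of the first of two
     consecutive STS; only the first such pair is kept. */
  do test = 1 by 1 until (last.ID);
    set audio;
    by ID;
    if onset_R = . and prev_R = 1 and STS_R = 1 then onset_R = test - 1;
    if onset_L = . and prev_L = 1 and STS_L = 1 then onset_L = test - 1;
    prev_R = STS_R;
    prev_L = STS_L;
  end;

  /* Pass 2: reread the same ID and flag the onset observation. */
  do test = 1 by 1 until (last.ID);
    set audio;
    by ID;
    HL_R = (test = onset_R);
    HL_L = (test = onset_L);
    output;
  end;

  drop test onset_R onset_L prev_R prev_L;
run;

proc print data=hl;
run;
```

Traced by hand on the sample data, this should flag HL_R at observations 7 and 13 and HL_L at observations 5 and 14, which matches the desired output listed above. If the onset date is also needed, it is the TestDate on the flagged rows.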