LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (March 2004, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 4 Mar 2004 12:52:15 -0500
Reply-To:     "Droogendyk, Harry" <Harry.Droogendyk@CIBC.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Droogendyk, Harry" <Harry.Droogendyk@CIBC.COM>
Subject:      Re: detecting changes in data
Content-Type: text/plain; charset="iso-8859-1"

Arrrrghhh. That's not right either, thorough testing is a good thing, the sub-query works with the test data, but not if an 'H' is introduced into month&prev...

I think it's time to start sending attempts to Quentin privately. :-)

-----Original Message----- From: Droogendyk, Harry Sent: March 4, 2004 12:47 PM To: 'SAS-L' Subject: RE: detecting changes in data

Sorry, details.... :-)

%macro comp(prev=,curr=);

proc sql;

create table diffs&curr as select id, race from month&prev except select id, race from month&curr union select id, race from month&curr where race not in ( select p.race from month&prev p, month&curr c where p.id = c.id ) ; quit;

%if &sqlobs = 0 %then %put All is cool; %else %do; %put Something bad changed; proc print data=diffs&curr; run; %end; %mend comp;

%comp(prev=1,curr=2); %comp(prev=2,curr=3);

-----Original Message----- From: Quentin McMullen [mailto:quentin_mcmullen@BROWN.EDU] Sent: March 4, 2004 12:23 PM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: detecting changes in data

Thanks Harry, that's nifty.

Unfortunately it doesn't do as much as I need to do. (and I don't think a simple proc compare can do it either).

As I read below, it basically grabs stuff from from month2 and then subsets out any records that don't match to records in month3. The tricky issue is that it is okay for a new patient to appear in month3. But it is NOT okay for a patient which existed in month2 to have an additional race code added in month3.

That is my patient #3 who goes from: >3 W >3 B

To: >3 W >3 B >3 H

So my comparison needs to allow new patients to be added, but not new races for existing patients.

But thanks for steering me toward EXCEPT. I think it could be an important piece of a solution. I think it's actually pretty close, maybe just an issue of nesting the EXCEPT as a subquery....

Thanks, --Quentin

On Thu, 4 Mar 2004 12:07:20 -0500, Droogendyk, Harry <Harry.Droogendyk@CIBC.COM> wrote:

>The EXCEPT operator in SQL is helpful for these types of comparisons: > >%macro comp(prev=,curr=); > > proc sql; > create table diffs&curr as > select id, race from month&prev > except > select id, race from month&curr > ; > quit; > %if &sqlobs = 0 %then > %put All is cool; > %else > %put Something bad changed; >%mend comp; > >%comp(prev=1,curr=2); >%comp(prev=2,curr=3); > >-----Original Message----- >From: Quentin McMullen [mailto:quentin_mcmullen@BROWN.EDU] >Sent: March 4, 2004 11:58 AM >To: SAS-L@LISTSERV.UGA.EDU >Subject: detecting changes in data > > >Hi All, > >I've got a situation where I receive demographic data every month as a >study progresses and patients are accrued. The file has one record per >patient-race (e.g. a biracial patient will have 2 records). It is >acceptable for the study to add patients over time. However, it is >unacceptable for a patient to be deleted from the dataset or a patient's >race to change. Note that a change in patient race could mean a patient >goes from being white to being black, or goes from being white to being >white and black, or goes from being white and black to being white. > >Below is play data: > >data month1; > input id race $; > cards; >1 W >2 A >3 W >3 B >; >run; > >data month2; > input id race $; > cards; >1 W >2 A >3 W >3 B >4 B >; >run; > >data month3; > input id race $; > cards; >1 W >2 B >3 W >3 B >3 H >5 A >; >run; > >From month 1 to month 2, the study accrued a new case, and other data >remained the same. No problems. > >From month 2 to month 3, there were 3 suspicious changes I need to >identify: >1) Patient 2 went from Asian to Black >2) Patient 3 went from white/black to white/black/hispanic >3) Patient 4 disappeared from the data completely > >So the tricky case is when the patient has multiple races (Patient #3). I >was wondering if folks had interesting ways to approach such a problem? >Feels like there might be a good SQL solution? > >My current thought is to just collapse the data to one record per patient >and have a single race variable containing all the race codes, and do a >PROC COMPARE from that (code below). But I'd be interested in other ideas. > >data month2r (keep=id racelist); > set month2; > retain racelist; > by id; > length racelist $10; > if first.id then racelist=race; > else racelist=trim(left(racelist))||','||race; > if last.id; >run; > >data month3r (keep=id racelist); > set month3; > retain racelist; > by id; > length racelist $10; > if first.id then racelist=race; > else racelist=trim(left(racelist))||','||race; > if last.id; >run; > >proc compare base=month2r compare=month3r listbaseobs; > id id; >run; > >Kind Regards, >--Quentin


Back to: Top of message | Previous page | Main SAS-L page