LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2004, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sun, 7 Mar 2004 18:49:09 -0500
Reply-To:     Quentin McMullen <quentin_mcmullen@BROWN.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Quentin McMullen <quentin_mcmullen@BROWN.EDU>
Subject:      Re: detecting changes in data
Comments: To: Ian Whitlock <iw1junk@COMCAST.NET>

On Sun, 7 Mar 2004 21:31:31 +0000, Ian Whitlock <iw1junk@COMCAST.NET> wrote:

>Quentin, > >If I understand the problem, you have ID and RACE at two times. You want to >report: > > Lost ID > Lost Race within ID > Gained or changed Race within ID > >Here is one place that I agree with Sigurd that the NOT IN construct is to be >preferred over EXCEPT. Sometimes I argue for EXCEPT since it is more efficient >in SAS. But here I want the flexibility unless data quantity becomes an issue. > >Here is my code. > >data mon2 ; > input id $ race $ ; >cards ; >1 W >2 A >3 W >3 B >4 B >; >data mon3 ; > input id $ race $ ; >cards ; >1 W >2 B >3 W >3 B >3 H >5 A >; >proc sql ; > select "lost id " as prob > , 2 as month > , id > , race > from mon2 > where id not in ( select id from mon3 ) > union corr all > select "lost race " as prob > , 2 as month > , id > , race > from mon2 > where id||race not in ( select id||race from mon3 ) > union corr all > select "gained or changed race" as prob > , 3 as month > , id > , race > from mon3 > where id||race not in ( select id||race from mon2 ) > and id in ( select id from mon2 ) > > order by month, id, race > ; >quit ; > > >There is some duplication in that a lost ID always implies a lost race. >However, a little tweaking to remove that should be easy if required. I did >not distinguish changes from gains. This might be done by asking whether the >total number of records within an ID increased. Yes, then something must be >gained; no something must have changed. This style is more extravagant in CPU >time but it pays in clarity of what you are getting, and can be easily modified. > >What is the problem with PROC COMPARE after reducing to one record per ID? It is efficient, but leaves the reader the task of identifying the problem. So the real question is - how much work do you want to leave to the reader in sorting out race problems. The classic programmer response - leave anything that makes my code shorter and easier, and that I can get away with. > >Ian_Whitlock@comcast.net

Thanks much Ian!

Looks nifty. Harry Droogendyk was also kind enough to send me a private note with a SQL exclude solution. As I mentioned to him, despite my affinity for the data step, I'm always impressed by how clearly SQL code reads. This is a good example of how a non SQL-head, and perhaps even a non-programmer, would be able to quickly understand the logic.

Thanks again, --Quentin


Back to: Top of message | Previous page | Main SAS-L page