LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2007, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 6 Aug 2007 23:17:37 -0400
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:      Re: compare variables within same dataset

See comments, etc. inline.

On Mon, 6 Aug 2007 18:09:50 +0000, toby dunn <tobydunn@HOTMAIL.COM> wrote:

>Proc SQL ; >Create Table Need As >Select ID , Race >From Have > Group By ID > Having Count( Distinct Race ) > 1 ; >Quit ;

Including RACE leads to a remerge, so there will be a bunch of exra rows in the result (if Sarah's request is taken literally, she only wants one number; I'm assuming however that she really wants a list of distinct IDs).

> > > > > >Toby Dunn > > >From: Sarah Tran <st.tran@GMAIL.COM> >Reply-To: Sarah Tran <st.tran@GMAIL.COM> >To: SAS-L@LISTSERV.UGA.EDU >Subject: compare variables within same dataset

There are ways to compare a data set with itself. Typically it involves subsetting on one or both sides of the comparison. But this is really a counting problem more than a comparing problem.

>Date: Mon, 6 Aug 2007 10:55:29 -0700 > >I have the following dataset where each observation represents a >hospital visit, ID variable is a unique identifier for a patient, and >RACE is their race. I want to count how many people had their race >recorded inconsistently any time during their hospital visits. For >example, only ID's #2 and 4 had their races recorded inconsistently in >the visits they had. The full dataset has over 30,000 observations and >a patient (e.g., a unique ID) can appear over 50 times. > >I tried using PROC COMPARE but can't get it to do what I want it to >do.

PROC COMPARE indeed is not a good fit for the present problem.

>Does anyone have suggestions? Maybe with MACROS?

Don't be so eager for a macro. If anything, be reluctant. Don't think that when a problem seems difficult, a macro must be the solution. The macro facility is not an all-purpose supercharger for SAS.

>Thanks for any help.

Others have provided solutions.

For a counting problem, PROC FREQ should come to mind. In this case you need back-to-back PROC FREQ steps.

proc freq data=have noprint; tables id * race / out=halfbaked(drop = count percent); run;

proc freq data=halfbaked noprint; tables id / out=need(drop = percent where = (count>1) ); run;

> >ID RACE >1 A >1 A >2 A >2 O >2 A >3 B >3 B >4 N >4 N >4 O >4 O >4 O >.. >.. >..


Back to: Top of message | Previous page | Main SAS-L page