LISTSERV at the University of Georgia
Date:         Wed, 22 Dec 1999 08:38:22 -0800
Reply-To:     "Berryhill, Tim" <TWB2@PGE.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Berryhill, Tim" <TWB2@PGE.COM>
Subject:      Re: Question about improving efficiency in database management
Comments: To: "machellew@MY-DEJA.COM" <machellew@MY-DEJA.COM>,
          sas-l <sas-l@uga.cc.uga.edu>
Content-Type: text/plain

PROC MEANS will run significantly faster for large numbers of groups if you sort the file and use a BY statement instead of a CLASS statement. The CLASS statement requires MEANS to build a table of all combinations of the class variables and hold it until the whole file has been read. The BY statement allows MEANS to process a single group, write it to the output dataset, and then reuse the same storage for the next group. The tradeoff is input order: the BY statement requires a sorted file (or a grouped file and the NOTSORTED keyword), while the CLASS statement can handle the observations in any order.
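As a rough sketch of the sort-then-BY approach, reusing the variable names from the quoted code (DOCTOR, PATIENT, DATE, I, JUNK): the dataset names WORK.SERVICES, WORK.SERVICES_SORTED and WORK.ENCOUNTERS are placeholders I made up, and the ID statement is left out because the original list of variables was not given.

```sas
/* Sketch only: WORK.SERVICES, WORK.SERVICES_SORTED and WORK.ENCOUNTERS */
/* are assumed names, not taken from the original question.             */

/* Put every doctor/patient/date group on adjacent observations. */
proc sort data=work.services out=work.services_sorted;
  by doctor patient date;
run;

/* BY instead of CLASS: MEANS summarizes one group at a time and */
/* writes one output observation per doctor/patient/date.        */
proc means data=work.services_sorted noprint;
  by doctor patient date;
  var i;
  output out=work.encounters sum=junk;
run;
```

If the file is already grouped by doctor, patient and date but not in sorted order, the PROC SORT step can be skipped and the BY statement written as "by doctor patient date notsorted;".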

Tim Berryhill - Contract Programmer and General Wizard
TWB2@PGE.COM or http://www.aartwolf.com/twb.html
Frequently at Pacific Gas & Electric Co., San Francisco
The correlation coefficient between their views and my postings is slightly less than 0

> ----------
> From: machellew@MY-DEJA.COM[SMTP:machellew@MY-DEJA.COM]
<SNIP>
> I have a 2 million observation 30 variable administrative dataset which
> contains patient and physician variables and a line for each service of
> utilization. This means that there may be more than one line for each
> patient/physician encounter for a given date.
>
> If I want to reduce this data so that only one observation exists per
> date (that is, I am not interested in the actual service(s) used)
>
> I know of several options:
>
> 1)
> proc means sum noprint;
> class doctor patient date;
> var i;
> id {list of variables I don't want to lose or have to re-merge in later}
> output out=x sum=junk noprint; run;
<SNIP>
> ... there has to be a better way. The proc means method ran for over 26
> hours when I finally had to halt execution. Is there a more efficient
> method using PROC SQL?

