LISTSERV at the University of Georgia
Date:         Mon, 13 Dec 2004 15:58:58 -0700
Reply-To:     Michael Murff <mjm33@MSM1.BYU.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Michael Murff <mjm33@MSM1.BYU.EDU>
Subject:      Re: Proc Summary vs. Means run times
Comments: To: RHOADSM1@WESTAT.COM
Content-Type: text/plain; charset=US-ASCII

Hi Mike,

How does one become privy to what SAS does behind the scenes? Have they revealed some of their source code, presumably written in C? I thought they kept it under very tight lock and key because of competitors like SPSS and Stata. My understanding is that the procs are pre-compiled binaries, and that data step code is sort of translated down to C syntax. Could you elaborate, or refer me to other sources (papers) that would have more info on what goes on "under the hood" when a SAS proc or data step code is submitted?

Thanks,

Michael Murff

PS--Perhaps I should relist this under a new topic, but I'll have to consult the said SAS etiquette paper, to be sure about that :)

>>> Mike Rhoads <RHOADSM1@WESTAT.COM> 12/13/2004 3:46:44 PM >>>
Dave,

Welcome to the group!

Actually, PROC MEANS and PROC SUMMARY run exactly the same code behind the scenes. There are a couple of very minor differences, mainly that by default PROC MEANS produces printed output and PROC SUMMARY does not.

So I suspect the differences you are seeing in output format and execution time are because you are using a BY statement in your PROC MEANS vs. a CLASS statement in PROC SUMMARY. Try using the same statement in both, and you should get identical output and near-identical run times.
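As a minimal sketch of that suggestion, here are the two steps from Dave's log rewritten so that both use a CLASS statement. The data set and variable names (visit_sum, member_no, day_diff) come from his log; everything else is an illustration, not tested against his data.

```sas
/* Sketch only: both procedures use the same grouping statement (CLASS), */
/* so output and run times can be compared like for like.                */
proc summary data=visit_sum missing print;
  class member_no;
  var day_diff;
  output out=diffs mean=Mean std=STDev;
run;

proc means data=visit_sum missing;   /* PROC MEANS prints by default */
  class member_no;
  var day_diff;
  output out=diffs1 mean=Mean std=STDev;
run;
```

With CLASS, the OUT= data set also gets one overall row (_TYPE_ = 0) in addition to one row per member_no, which would account for the 13 vs. 12 observations in the log. Adding the NWAY option to the PROC statement keeps only the per-group rows. A BY statement, by contrast, expects the input to be sorted (or indexed) by member_no.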

Mike Rhoads
Westat
RhoadsM1@Westat.com

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of David Meyer
Sent: Monday, December 13, 2004 5:32 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Proc Summary vs. Means run times

Hi SASLers,

As a new-ish SAS guy, I have been following the SASL discussion as much as I can and I have been learning a lot (THANKS ALL). As I have been improving, I have caught the "try to write tighter code" bug from some of you and since I am working with large data sets (millions of records each), reducing run time is a very practical obsession to have.

I have recently discovered Proc Summary and been playing with it and Proc Means. I think that I found Summary to be about 35 to 45% of the run time of Proc Means (plus I like the "class variable crude" summary data line in Proc Summary, and I like the way the data is displayed in the output window better than in Means). If all I want is basic summary stats (mean, min, max, std), should I always be using Summary going forward? Am I making any assumptions that I should worry about or that are incorrect? Can any of you suggest places for me to go and read up on these basic statistical Procs?

TIA and thanks for all of your discussion on other topics,

Dave

Below are the code and log results:

proc summary data=visit_sum missing print;
  class member_no;
  var day_diff;
  output out=diffs mean=Mean std=STDev;
run;

NOTE: There were 48 observations read from the dataset WORK.VISIT_SUM.
NOTE: The data set WORK.DIFFS has 13 observations and 5 variables.
NOTE: PROCEDURE SUMMARY used:
      real time           0.62 seconds
      cpu time            0.05 seconds

proc means data=visit_sum missing print;
  by member_no;
  var day_diff;
  output out=diffs1 mean=Mean std=STDev;
run;

NOTE: There were 48 observations read from the dataset WORK.VISIT_SUM.
NOTE: The data set WORK.DIFFS1 has 12 observations and 5 variables.
NOTE: PROCEDURE MEANS used:
      real time           0.28 seconds
      cpu time            0.03 seconds

