LISTSERV at the University of Georgia
Date:         Tue, 29 Jan 2002 10:42:47 -0500
Reply-To:     Nicholson Warman <newarman@OFFICE.UNCG.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Nicholson Warman <newarman@OFFICE.UNCG.EDU>
Subject:      Re: Comments on SAS Efficiencies
Content-Type: text/plain; charset=US-ASCII

Some B-I-G efficiencies I have found, at a number of sites, include:

- Add DROP=/KEEP=/WHERE= clauses on dataset references, to reduce the amount of processing of data to build the SAS program data vector. It is far cheaper, especially on LARGE database tables, to filter as early as possible, rather than using DROP/KEEP/IF statements to subset your data. Example:

data WORK.NEW_DATA;
    set OLDLIB.OLD_DATA ( keep=A B C D
                          where=(A > 5 and B < C < D) );
    ...
run;

versus

data WORK.NEW_DATA;
    set OLDLIB.OLD_DATA;
    KEEP A B C D;
    if A gt 5 and B < C and C < D;
    ...
run;

- For SAS 8, when accessing a database, you need to benchmark the options, but LIBNAME access to database tables is often cheaper than SQL Pass-Thru or SAS Views of SQL statements. Again, the WHERE= can be key, as you can have your database engine filter the data for you and return only the needed records. That way, if you need only 2% of a 1 Terabyte database table, the system doesn't have to give you all the data for you to THEN throw away 98%; you get what you want and that's IT!
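
A minimal sketch of LIBNAME access with an early filter, assuming a SAS/ACCESS engine is licensed. The engine name (oracle), the connection values, the libref DBLIB, and the table BIG_TABLE are all hypothetical placeholders, and whether the WHERE= is actually handed to the database depends on the engine and the expression, so benchmark it:

```sas
/* Hypothetical connection values; the engine name and its options depend
   on the SAS/ACCESS product available at your site. */
libname DBLIB oracle user=myuser password=mypass path=mydb;

data WORK.SUBSET;
    /* KEEP= limits the columns requested; WHERE= is a filter the engine
       can pass to the database when it is able to translate it. */
    set DBLIB.BIG_TABLE ( keep=A B C D
                          where=(A > 5) );
run;

/* Release the database connection when done. */
libname DBLIB clear;
```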

Nick Warman
Data Analysis Consultant
Instructional & Research Computing
University of North Carolina at Greensboro
(336) 334-5350 [reception]

>>> Charles Patridge <Charles_S_Patridge@PRODIGY.NET> 01/29/02 08:40 AM >>>
OK SAS-Lers,

I was asked this morning for my top 5 suggestions on SAS efficiencies, and here is what I had to say. Some of you may disagree, which is why SAS-L is an excellent forum for such discussions.

And if you are willing to provide your two cents, I may even try to capture all the $.02 worth of ideas and pack them into another Tip and Technique page for SCONSIG.com <grin> - so share your thoughts and ideas on this topic.

Dear Michael,

First, when I start to develop a new program, I make use of real data in testing by using OPTIONS OBS=100 or so. This keeps files small and still allows me to test my code.

This is more of a development efficiency than a SAS efficiency.
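
A minimal sketch of the OBS= approach described above. The output name WORK.TEST_RUN is hypothetical, and OLDLIB.OLD_DATA simply stands in for whatever real data is being tested against:

```sas
/* While developing, read only the first 100 observations of each input. */
options obs=100;

data WORK.TEST_RUN;
    set OLDLIB.OLD_DATA;
    /* ... logic under test ... */
run;

/* Restore full processing before the production run. */
options obs=max;
```

Resetting to OBS=MAX matters because the option stays in effect for the rest of the session.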

1. Even given the power of Proc SQL, I will still sometimes use Data Step programming techniques to overcome the inefficiencies of SQL. You need to benchmark both ways when dealing with large datasets. Sometimes SQL will outperform Data Step but then again, your Data Step logic may not be the most efficient.

2. I tend to use Proc Format to do table lookups instead of merging files as long as my lookups are not too large, say under 20000 records.

Also, look into using SAS arrays where possible as they can save a lot of CPU.
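
A minimal sketch of a Proc Format table lookup, under assumed names: the format $region, the input WORK.CUSTOMERS, and its STATE variable are all hypothetical:

```sas
/* Hypothetical lookup table: state code -> region name. */
proc format;
    value $region
        'NC', 'SC', 'GA' = 'Southeast'
        'NY', 'NJ'       = 'Northeast'
        other            = 'Unknown';
run;

data WORK.WITH_REGION;
    set WORK.CUSTOMERS;              /* hypothetical input with a STATE variable */
    length REGION $9;
    REGION = put(STATE, $region.);   /* lookup through the format; no sort or merge */
run;
```

For a lookup of a few thousand records, typing a VALUE statement is impractical; Proc Format's CNTLIN= option can build the format from a data set instead.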

3. When dealing with large files and MANY variables - use the KEEP option when reading SAS Datasets unless all the variables are needed later in the process.

4. Depending on the OS, you may have disk caching or read ahead features in SAS that can be utilized. Look at the SAS OS Companion Guide for your OS platform to see if they exist.

5. One efficiency technique I overlook (on purpose) is to use SQL PassThru as opposed to SAS views when accessing external Databases. I find maintaining SQL Passthru code to be overburdening as opposed to little maintenance on SAS Views (using ACCESS). Again, here is an example where SAS efficiency may cost you more in maintenance than it saves in execution.

6. Try to design SAS programs using Table Driven System Designs as opposed to making program changes. It is easier to maintain tables than programs even if it takes a little longer to execute your SAS code. People costs are expensive and real dollars as opposed to CPU costs (somewhat). Plus, data/system integrity is more critical for users than a system running a little slower.

7. Finally, make sure everyone on your team understands the SAS techniques you employ in your development. What good does it do to become the best SAS programmer if you are the only one who knows how to maintain your code? Make sure you share your knowledge with code walk throughs, quick SAS tutorials over lunch or a brief meeting to explain the code.

BTW, having code walk throughs is a good idea for any major project development as you go through unit testing. This enables the team to discuss different methods as well as convey SAS techniques to all team members, and share what your piece of the system is doing - Excellent Practice for all development (SAS or not).

Have to get back to work. Hope this helps even though there can be many other tips to offer on this topic. You might consider soliciting the SAS-L forum.

Regards,
Charles Patridge

--- Original Message ---

To: Charles_S_Patridge@prodigy.net
Subject: SAS Efficiencies Question

>Hello Mr. Patridge,

I am a SAS programmer and have been given the task of giving a presentation to my firm's SAS Users Group on the topic of SAS efficiencies. I have found much documentation on the subject and have many ideas for what I believe is an efficient way to code. However, I have a limited time (15 min.) in which to give this presentation and am stuck trying to define a list of the top 5-10 "most efficiency for the buck". I have seen and read much of your SAS work and techniques on the Internet, so I am hoping you can extend to me those efficiencies you feel are most important when you write a program.

Any help you can provide would be much appreciated.

Sincerely,

Michael Hermeston

