Date: Tue, 13 Jan 2009 07:45:43 -0500
Reply-To: Michael Raithel <michaelraithel@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Michael Raithel <michaelraithel@WESTAT.COM>
Subject: Re: Forcing Indexes and Preventing Index options
Content-Type: text/plain; charset="iso-8859-1"
Irin Later reanimated the following question:
> I am going to reanimate my old question. At this time it
> perhaps sounds even more naiv than before ... However, I
> wonder why SAS SQL "MAY NOT ALWAYS MAKE THE RIGHT DECISION"?
> I mean...why there is no garantee for that? What does
> prevents SAS SQL of doing the right choice regarding index?
Irin, Whoa; be careful with that reanimation stuff; we all saw what can go wrong with it in this prophetic (or is that pathetic) 80's movie: http://www.imdb.com/title/tt0089885/
All seriousness aside; regardless of whether we are talking PROC SQL, another SAS PROC, or the DATA step, SAS approaches the possible use of an index the same way. Before attempting to use an index, it performs some calculations to _ESTIMATE_ if accessing the subset of data via an index is less costly (in terms of computer resources) than reading the entire data set sequentially. If so, then it uses the index; if not, it makes a pass through the entire SAS data set.
SAS performs a three-step process in its index-use estimation:
1. Compute the estimated number of observations qualified by the index.
2. Calculate the I/O cost per Record Identifier (RID)
3. Calculate the number of data pages that would be read by the index.
If SAS _PREDICTS_ that it would be more efficient to use a specific index to return observations than to read the entire SAS data set, then it uses the index. If several indexes are present and could be used to satisfy the WHERE condition, SAS chooses the index with the best predicted efficiency--the least costly.
Most of the time, SAS makes good decisions in its estimation, but unlike several frequent posters on the 'L, it is not infallible. If you want to take control of index use away from SAS, you would use the IDXNAME and/or IDXWHERE options. All of that and more is lovingly detailed, at greater expense to you, in my award-winning book: The Complete Guide to SAS Indexes.
Best of luck to in all of your SAS endeavors, Irin. Later!
I hope that this suggestion proves helpful now, and in the future!
Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.
Michael A. Raithel
"The man who wrote the book on performance"
Author: Tuning SAS Applications in the MVS Environment
Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition
Author: The Complete Guide to SAS Indexes
Never hold discussions with the monkey when the organ grinder
is in the room. - Sir Winston Churchill