Date: Wed, 30 Jul 2003 18:40:21 -0500
Reply-To: Kevin Myers <KevinMyers@austin.rr.com>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Kevin Myers <WHMyers@CABLEONE.NET>
Subject: Re: Index Descending?
Content-Type: text/plain; charset="iso-8859-1"
Please see below...
----- Original Message -----
From: "Michael Raithel" <RAITHEM@WESTAT.COM>
Sent: Wednesday, July 30, 2003 2:56 PM
Subject: Re: Index Descending?
> Dear SAS-L-ers,
> Kevin Myers posted the following:
> > Although Michael could be right here regarding what is
> > happening, one should not necessarily assume that the
> > descending option on the BY statement is what caused use of
> > the index to be canceled. There are plenty of other reasons
> > why use of an index can be canceled that have more to do with
> > statistics regarding data set size and value distribution.
> > If you really wanted to evaluate Michael's assumption, then
> > you'd have to be sure that a large data set was used with a
> > good distribution of values to be used for subsetting.
> > Also, though the docs that Jack and Michael refered to may be
> > right, 1) I don't think that has *anything* to do with WHERE
> > clause optimization, it only affects the ability of the BY
> > clause to use the index, and 2) some of this kind of stuff
> > regularly undergoes minor enhancements, and I wouldn't be too
> > surprised if the docs were out of date on this matter. Could
> > be wrong, but I know that index usage has been improved in
> > some other vaguely similar situations.
> Kevin, it is not that I "could be right", it is that I _AM_ right that the
> DESCENDING option of the BY statement caused the use of the Index to be
> cancelled. Period!
Michael, based on the additional info that you provided here, I believe you.
However, that doesn't change anything about my point that the docs *CAN* be
out of date.
> SAS Online Documentation, SAS Language: Concepts, SAS Data Files: SAS
> Indexes, clearly states:
> In addition, a BY statement will not use an index in these
> The BY statement includes the DESCENDING or NOTSORTED option.
> The index was created with the NOMISS option.
> The data file is physically stored in sorted order based on the
> variables specified in the BY statement.
> ...and as if that is not good enough (which, of course, it is),
Speaking as a former developer for SI, that is *NOT* necessarily good
enough. It is well known that portions of the docs (online or not) have
been out of date on various topics in the past, and IMHO that is unlikely to
change. Although SI tries very hard and does a pretty good job overall,
there are still changes and enhancements made by developers that take a
while before they make it into the docs.
Chapter 6 of
> the upcoming Tuning SAS Applications in the OS/390 and z/OS Environments
> states (or will state) in part:
> SAS determines if it can use an existing index with BY statement
> by asking the following questions:
> 1. Is the DESCENDING option used in the statement?
> 2. Is the NOTSORTED option used in the statement?
> 3. Is varlist composed of a single variable that is the key variable
> a simple index?
> 4. Is varlist composed of a single variable that is the first key
> variable in a composite index?
> 5. Is varlist composed of two or more variables and is the first
> variable the key variable in a simple index?
> 6. Is varlist composed of two or more variables that match the first
> two or more key variables in a composite index?
> If the answer to question 1 or question 2 is "yes", then indexes are not
> used. If the answers to questions 1 and 2 are "no" and the answer to any
> the other questions is "yes", then the appropriate index is used.
> <<Note: I have the author's full permission to use the above excerpt.>>
Well of course now that I know that you are in possession of the most up to
date information, then that proves your statement. But since you didn't
state that in your prior posting, I had no way of knowing that you were
basing your statements on the latest info.
> Kevin, while you make a good point about there being many other reasons
> an Index may not be used, you have wrongly characterized my statement
> the DESCENDING option scuppering the BY statement to be an "assumption",
> when it is actually a "fact".
Agreed this is a fact based on the new info that you've provided, but you
could not tell that for certain from your prior post.
> The WHERE clause in my example was merely a side show provided so that
> a small subset of the data would be sucked out of SHOESORT and plopped
Right, but the message from that "side show" could have led to an invalid
conclusion. The index usage *might* have been canceled for WHERE clause
optimization, and in fact still been valid for use by the DESCENDING option
on the BY statement. The additional info that you have now provided proves
your point, but just the appearance of that message did not prove what you
were trying to show.
> The "minor enhancements" for SAS V9 are such that the DESCENDING option
> still scupper the use of SAS Indexes in BY statement processing.
> Kevin, best of luck to you in your own quest to evaluate just what is what
> regarding SAS Indexes and BY statement processing!
> I hope that this suggestion proves helpful now, and in the future!
> Of course, all of these opinions and insights are my own, and do not
> those of my organization or my associates. All SAS code and/or
> specified in this posting are for illustrative purposes only and no
> is stated or implied as to their accuracy or applicability. People
> to use information in this posting do so at their own risk.
> Michael A. Raithel
> "The man who wrote the book on performance"
> E-mail: MichaelRaithel@westat.com
> Author: Tuning SAS Applications in the MVS Environment
> The only real mistake is the one from which we learn
> nothing. - John Powell