LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (February 2011, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Sun, 13 Feb 2011 20:16:43 +0000
Reply-To:   "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Subject:   Re: why query doesn't use index?
Comments:   To: Arthur Tabachneck <art297@ROGERS.COM>
In-Reply-To:   <201102131923.p1DBlICD025624@waikiki.cc.uga.edu>
Content-Type:   text/plain; charset="iso-8859-1"

And that would make sense sence the compound index would be stored in ascending order of the compound value, thereby facilitating simple binary search for specific values of the first component. Of course, there would be no cost-effective way to search the same index for a given value of the second or later components.

The take-away message: when forming a compound index think about which component is more likely to be used as a simple search criterion. In the absence of other considerations (e.g. the sort order of the dataset), that should be the first component.

Regards, Mark

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > Arthur Tabachneck > Sent: Sunday, February 13, 2011 2:24 PM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Re: why query doesn't use index? > > Joe, > > I'm back to totally agreeing with you. The ver8 documentation states: > "Often, only the first variable of a composite index is used. For > example, > for a composite index on ZIPCODE and SCHOOLID, the following WHERE > expression can use the composite index for the variable ZIPCODE because > it > is the first key variable in the composite index: > > where zipcode = 78753;" > > Thus, it isn't compound optimization but, rather, simple optimization > of a > composite index. > > Art > ------- > On Sun, 13 Feb 2011 10:36:38 -0500, Arthur Tabachneck > <art297@ROGERS.COM> > wrote: > > >Joe, > > > >Of course, that doesn't make it definitive given the example presented > by > >Max. If one takes MMMMIIIIKKKKEEEE's statement literally, it doesn't > >necessarily hold as the following would not be true: "Compound > optimization > >takes place when several variables of a composite index are used in a > WHERE > >expression." > > > >But, in fairness, it doesn't state that compound optimization ONLY > takes > >place when .. etc., etc. > > > >Art > >--------- > >On Sun, 13 Feb 2011 10:27:15 -0500, Arthur Tabachneck > <art297@ROGERS.COM> > >wrote: > > > >>Joe, > >> > >>I agree. So that you don't end up going to work just to get the > book, one > >>of Michael's SUGI30 papers states > >>(http://www2.sas.com/proceedings/sugi30/247-30.pdf) : > >> > >>To utilize a Composite index in a WHERE expression, SAS uses > "compound > >>optimization". Compound optimization takes place when several > variables of > >a > >>composite index are used in a WHERE expression and are joined > together > with > >>logical operators such as AND and OR. One of the following conditions > must > >>be true for compound optimization to occur: > >> > >>1. At least the first two key variables in the composite index must > be > used > >>in the WHERE condition: > >> where state eq 15 and county eq 30 and population lt 20000; > >> > >>2. The conditions are connected using the AND logical operator: > >> where state eq 15 and county eq 30; > >> > >>3. Any conditions using the OR logical operator must specify the same > >>variable: > >> where state eq 20 and (county eq 30 or county eq 40); > >> > >>4. At least one condition must be the EQ or IN operator: > >> where state eq 10 and county in(1,3,5); > >> > >>Art > >>-------- > >>On Sun, 13 Feb 2011 08:28:00 -0600, Joe Matise <snoopy369@GMAIL.COM> > wrote: > >> > >>>Ken, > >>>That is not correct. SAS will use the index despite it being a > composite > >>>index with additional fields. I don't have Mikeeeeeeees' book with > me > >>(it's > >>>at work), but I think that SAS is consistent with other languages in > >>>allowing use at least if the first elements of the composite index > match > >>the > >>>first elements of the query. Max demonstrates this with his first > query, > >>>for example. > >>> > >>>I would assume that the IN operator is the culprit here; it is not > as > fast > >>>as equals, particularly in relation to indexes, and between that and > >simply > >>>the fact that SAS is now returning a larger proportion of the > dataset, > >it's > >>>possible SAS has now decided that the index is no longer faster. If > SAS > >is > >>>bringing back 80% of the dataset or something, it's often not really > >useful > >>>to use the index - normal reads may be just as fast, given likely > the > data > >>>could be written such that you'd read the nonuseful data into memory > >anyway > >>>(reading a block at a time). In that case the time to read the > index is > >>>actually harmful...or at least equivalent. > >>> > >>>-Joe > >>> > >>>On Sun, Feb 13, 2011 at 8:16 AM, Ken Borowiak > >><evilpettingzoo97@aol.com>wrote: > >>> > >>>> > >>>> > >>>> Max, > >>>> > >>>> You created a composite index, an index on information on more > than one > >>>> field. > >>>> To invoke use of a composite index you need to use all relevant > fields > >in > >>>> the composite > >>>> index in the WHERE clause/statement. > >>>> > >>>> In your example below, you are subsetting on FLIGHTNUMBER only. > Try > >>>> creating > >>>> a simple index on the field and re-run your test. > >>>> > >>>> proc sql ; > >>>> create index flightnumber on marchflights ; > >>>> quit ; > >>>> > >>>> hth, > >>>> Ken > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> -----Original Message----- > >>>> From: Joe Whitehurst <joewhitehurst@GMAIL.COM> > >>>> To: SAS-L <SAS-L@LISTSERV.UGA.EDU> > >>>> Sent: Sat, Feb 12, 2011 5:21 pm > >>>> Subject: Re: why query doesn't use index? > >>>> > >>>> > >>>> From a SUGI 30 paper by Michael A. Raithel, Westat, Rockville, MD > The > >>>> Basics > >>>> > >>>> of Using SAS > >>>> > >>>> � > >>>> > >>>> Indexes > >>>> > >>>> http://www2.sas.com/proceedings/sugi30/247-30.pdf > >>>> > >>>> > >>>> > >>>> SAS will not necessarily us an existing index even when you do use > a > >>WHERE > >>>> > >>>> or BY statement. SAS first calculates > >>>> > >>>> if using an index will be more efficient than reading the entire > data > >set > >>>> > >>>> sequentially. The internal algorithms take a lot > >>>> > >>>> of factors into consideration, including data set size, the index > or > >>>> indexes > >>>> > >>>> that are available, and centile information. > >>>> > >>>> (For more information on �centiles�, see the section of this > paper > >>titled > >>>> > >>>> Centiles). If SAS predicts that it will be more > >>>> > >>>> efficient to use a specific index to return observations than to > read > >the > >>>> > >>>> entire data set, then it will use that index. If > >>>> > >>>> not, then it will read the entire data set, sequentially, to > return the > >>>> > >>>> observations. However, SAS will not even consider > >>>> > >>>> using an index if you do not use a WHERE or BY statement (and, of > >course, > >>>> if > >>>> > >>>> you do not use the KEY option). > >>>> > >>>> Most of the time, SAS makes good decisions regarding whether or > not to > >>use > >>>> > >>>> an index. But, its internal calculations > >>>> > >>>> are not infallible, and sometimes the resources consumed when > reading a > >>>> > >>>> large subset of data via an index are > >>>> > >>>> greater than reading the entire SAS data set. You can use the > IDXNAME= > >>and > >>>> > >>>> IDXWHERE= options to overeride > >>>> > >>>> SAS� default index usage. > >>>> > >>>> > >>>> > >>>> On Sat, Feb 12, 2011 at 4:13 PM, bbser 2009 <bbser2009@gmail.com> > wrote: > >>>> > >>>> > >>>> > >>>> > Could you explain specifically why the last Query below does not > use > >>the > >>>> > >>>> > index, while the last second does? > >>>> > >>>> > Thanks. > >>>> > >>>> > > >>>> > >>>> > -Max > >>>> > >>>> > > >>>> > >>>> > =============================================== > >>>> > >>>> > data marchflights; > >>>> > >>>> > set sasuser.marchflights; > >>>> > >>>> > run; > >>>> > >>>> > > >>>> > >>>> > proc sql; > >>>> > >>>> > create unique index daily > >>>> > >>>> > on marchflights(flightnumber, date); > >>>> > >>>> > quit; > >>>> > >>>> > > >>>> > >>>> > options msglevel=i; > >>>> > >>>> > > >>>> > >>>> > proc sql; > >>>> > >>>> > select * > >>>> > >>>> > from marchflights > >>>> > >>>> > where flightnumber="182"; > >>>> > >>>> > quit; > >>>> > >>>> > > >>>> > >>>> > proc sql; > >>>> > >>>> > select * > >>>> > >>>> > from marchflights > >>>> > >>>> > where flightnumber in ("182", "202"); > >>>> > >>>> > quit; > >>>> > >>>> > > >>>> > >>>> > >>>> > >>>> > >>>> > >>>>


Back to: Top of message | Previous page | Main SAS-L page