LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (February 1996, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 6 Feb 1996 15:01:06 -0800
Reply-To:   Ronald Max Atwood Jr <matwood@IX.NETCOM.COM>
Sender:   "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:   Ronald Max Atwood Jr <matwood@IX.NETCOM.COM>
Subject:   Another SQL question

Hello SAS-L:

First of all I would like to thank everyone who responded to my last SQL problem about using the sum() function in the select statement conditionally. I did not realize there was a case "function". The other popular solution was to use multiple sum() functions and multiply each one by the result of a boolean operator (e.g. sum(a*(c=d))). As a result I was able to extract the various summations I needed while only passing through the data one time.

Now I have a new challange. I am trying to select obs from dsn2 based on the records in dsn1 as follows:

dsn1 looks like: mbr# date1 date2 ...

dsn2 looks like: ssn#1 ssn#2 dos ...

select .... from dsn1 as a, dsn2 as b where (a.mbr#=b.ssn#1 or a.mbr#=b.ssn#2) and (a.date1<=b.dos<=a.date2);

When I tried this query using only one side of the "or" operator above, the query ran in just a couple of minutes (over 100K obs in dsn2). However, when I added the other side of the "or" operator, the process time skyrocketed. (I stopped the process after it had run for over 90 mins.)

I know one solution is to set up two queries (one for each side of the "or" operator, then merge the results and filter/sort to eliminate duplicate hits. But I really do not like going through the data multiple times (must be some querk in my childhood).

My question to SAS-L is how can I "optimize" the query above so that I reduce the run time on a single pass through the data.

Also, I am aware that this query will produce duplicate hits, I have allowed for that by including a "seq. no." in the dsn2. The plan is to sort on the seq. no. later with the nodupkey option in proc sort. However, if someone can show me how to only keep one hit if there are multiple hits, I would appreciate it.

(By the way, I hope these SQL questions do not seem to rudimentary for this list. I do have some SAS trainning and experience. However, I am self taught when it comes to SQL and am just now getting the experience.)

Thanks in advance.

Max Atwood (matwood@ix.netcom.com)


Back to: Top of message | Previous page | Main SAS-L page