LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (December 2006, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Mon, 4 Dec 2006 14:45:31 -0500
Reply-To:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:   Re: There's something about HAVING (Re: Selecting nearest value in time)
Comments:   To: "Terjeson, Mark" <Mterjeson@russell.com>, Dimitri Shvorob <dimitri.shvorob@VANDERBILT.EDU>
In-Reply-To:   <16FD64291482A34F995D2AF14A5C932C015A6E7C@MAIL002.prod.ds.russell.com>
Content-Type:   text/plain; charset="us-ascii"

Dimitri and Mark: Use of a HAVING clause for subsetting of rows in a dataset dates back in SAS-L at least to 10/1996. Apparently independent responses by Ian Whitlock and Bernard Tremblay suggest that others knew how to use the method even earlier.

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind9610D&L=sas-l&P=R10401&D=1& H=0&O=D&T=1

I recall posting solutions to several 'Can you do this in SQL?' challenges during that era. I'd call the 'HAVING x=min(x)' a method somewhat specific to SAS SQL. Depending on the selection of columns, it may require remerging.

So how many different methods potentially reduce the cardinality (number of rows) in the yield of a SAS SQL query? Why different methods?

ON x ----> required in INNER and OUTER joins and likely related to query optimizing using indexes.

WHERE x ..... ----> older form of join operator ..... x in (.....) -----> in set constraint exists(....) -----> correlated subquery

HAVING <summary> <op> <value> ---> typical constraint on groups according to summary count, min/max, mean, etc. x=<summary>(x) ---> row constraint within group.

Although a ON/WHERE condition logically precedes a HAVING condition, the SAS SQL compiler may delay imposition of a constraint until it binds values to symbolic references. Sig

-----Original Message----- From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Terjeson, Mark Sent: Monday, December 04, 2006 10:26 AM To: Dimitri Shvorob; SAS-L@LISTSERV.UGA.EDU Subject: RE: There's something about HAVING (Re: Selecting nearest value in time)

Hi,

You've guess right. The HAVING is like the WHERE in that it chops out rows (or keeps rows if you prefer). i.e. the where criteria when true keeps rows but the overall where task is that it is subsetting your original data which is chopping out rows.

The WHERE occurs *before* the SELECT, and the HAVING occurs *after* the select. This is handy when you want to subset the original data using the WHERE and possibly filter some records after a roll up as been performed.

The other major difference is that you can incorporated aggregating functions into the HAVING clause and not in the WHERE clause. This allows to perform some aggregating and then you can match row values to it when needed.

So your deducing is spot on.

Hope this is helpful.

Mark Terjeson Senior Programmer Analyst, IM&R Russell Investment Group

Russell Global Leaders in Multi-Manager Investing

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Dimitri Shvorob Sent: Sunday, December 03, 2006 2:14 PM To: SAS-L@LISTSERV.UGA.EDU Subject: There's something about HAVING (Re: Selecting nearest value in time)

When I came across the SQL-based solution proposed by Arthur in

http://listserv.uga.edu/cgi-bin/wa?A2=ind0611B&L=sas-l&P=R23484

I doubted that it would work. Apart from misgivings about grouping and joining in a single query, I thought that HAVING could only be used to select groups (here, IDs), whereas in Art's code, the clause plainly filters individual records (identified by ID and time). (Consider the following excerpt from p.13 of PROC SQL user's guide: 'The HAVING clause works with the GROUP BY clause to restrict the groups in a query's results based on a given condition. PROC SQL applies the HAVING condition after grouping the data and applying aggregate functions'). To my surprise, test code paralleling the suspect query (see below; I match nearest Saturday and Sunday dates) did work! Am I to conclude that PROC SQL's HAVING is basically a WHERE allowed to reference group aggregates? (Group minimum, in this instance).

Thank you.

data sa su; do id = 1 to 3; do i = 1 to 50; date = intnx('day','1jan2006'd,i); if weekday(date) = 7 then output sa; if weekday(date) = 1 then output su; end; end; format date date9.; run; proc sql; create table sasu as select distinct sa.id, sa.date as sadate, su.date as sudate from sa join su on sa.id = su.id group by sa.id having abs(sa.date - su.date) = min(abs(sa.date - su.date)); quit;


Back to: Top of message | Previous page | Main SAS-L page