Date: Tue, 18 Oct 2005 17:00:28 -0700 David L Cassell "SAS(r) Discussion" David L Cassell Re: PROC SQL:How to Calculate the MEDIAN with SQL <200510182125.j9IKMbKi014564@malibu.cc.uga.edu> text/plain; format=flowed

Gary.Shaw@AUSTIN.PPDI.COM wrote: >Has anyone used SQL (in SAS or any other DBS) to determine the MEDIAN? >I like to use SQL to find counts, sums and other stats. >But SAS/SQL and I guess another ANSI SQL, does not have a Meidan() >function? >My internet research has pulled up some solutions for SQL???? >I have developed a solution in SAS/SQL ordering the data and using the >MONOTONIC() function. >The solution works OK, but not straight SQL. That is it requires a few >SQL steps.

I see that Sig has weighed in, in favor of other means (no pun intended). I agree with his position.

ANSI SQL has a short list of descriptive statistics which can all be accrued in a single pas of the data (or a subset thereof) without regard to order. This is meaningful, as the SQL standard has a set-theoretic basis, not an ordered- table basis. So it makes sense to have mean, min, max, sum, count, and so on. All these can be accumulated (easily) in one pass. That cannot be done with median or any other quantile (exceptions: min, max, and other maximal and minimal values that can be tracked in a linked list as we go through the data in that single pass) - not unless you preserve the complete order structure of the data set in one rather large data structure. The alternatives that you have found all work around this in subqueries or multiple passes, taking up perhaps more wallclock time than you really want to spend on a mere median.

Everyone who has a SQL implementation has their own 'enhancements' to the SQL standard. MONOTONIC() is but one of those. And let me warn you to be careful with it. It is undocumented, and it can bite you if you're not careful with it. I believe that Ian Whitlock is one of several experts who have (in addition to Sig) shown potential pitfalls of MONOTONIC() in the archives of SAS-L.

Also, there is no reason to do everything in PROC SQL, any more than there is a reason to do work in everything except PROC SQL. It is merely one tool in a vast toolbox. And you know the old saw (pun intended) about hammers and nails...

David -- David L. Cassell mathematical statistician Design Pathways 3115 NW Norwood Pl. Corvallis OR 97330

_________________________________________________________________ Is your PC infected? Get a FREE online computer virus scan from McAfeeŽ Security. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963

Back to: Top of message | Previous page | Main SAS-L page