Date: Thu, 13 Jan 2005 12:55:44 -0800
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Quantile statistics in PROC MEANS
In-Reply-To: <200501132031.PAA07971@hotellng.unx.sas.com>
Content-type: text/plain; charset=US-ASCII
A Little Birdie(tm) wrote me and pointed out:
me> You can get the same, precise estimates either using PROC UNIVARIATE
me> or using PROC MEANS. They both use the same engine under the hood
me> these days.
<Birdie>
MEANS and SUMMARY are the same under the hood, but UNIVARIATE is quite
different. UNIVARIATE builds a tree of data values for each variable.
MEANS builds trees of class values but not of data values. Most of the
computations in MEANS are done with one or two passes over the data
set without storing the data internally like UNIVARIATE does. The exact
quantiles are the only statistics in MEANS that require storing the
data internally that I know of.
</Birdie>
So now you know.
On a related note, the memory and time requirements of PROC UNIVARIATE
or PROC MEANS using QMETHOD=OS can be tackled with a different approach.
PROC STDIZE can find quantiles in a single pass of the data, a capability
that can be very useful for large data sets (as in our other thread).
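
A minimal sketch of the two approaches, for comparison. The data set
WORK.BIG, the variable X, and the output data set WORK.PCTLS are
hypothetical names, and the percentile points are picked only for
illustration. As far as I know, the one-pass values from PROC STDIZE are
estimates rather than exact order statistics, so check them against
QMETHOD=OS on a subset before relying on them.

```sas
/* Hypothetical input: WORK.BIG with a numeric variable X. */

/* Exact quantiles from order statistics (QMETHOD=OS).        */
/* This is the case where the data values are held in memory. */
proc means data=work.big qmethod=os p25 median p75;
   var x;
run;

/* Quantiles in a single pass of the data (PCTLMTD=ONEPASS).      */
/* OUT=_NULL_ skips the standardized copy of the data; the        */
/* requested percentiles are written to the OUTSTAT= data set.    */
proc stdize data=work.big out=_null_ outstat=work.pctls
            pctlmtd=onepass pctlpts=25 50 75;
   var x;
run;

proc print data=work.pctls;
run;
```

PROC MEANS also accepts QMETHOD=P2 if you want approximate quantiles
without leaving that procedure.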
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician