Date: Thu, 5 Oct 2000 15:41:10 -0400
Reply-To: Patrick Johnston <Patrick_Johnston@ABTASSOC.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Patrick Johnston <Patrick_Johnston@ABTASSOC.COM>
Subject: Re: Standard Deviation
Venky,
Proc Univariate gives the square root of the *unbiased* variance
estimate, SSY/(n-1), where SSY is the sum of squared deviations from the
mean. With n=1 that is (110-110)**2/(1-1) = 0/0, so the result should be
missing. The same applies to the std( ) function, but you entered 110
twice, so n=2 in that case. std(110) will give an error (does not have
enough arguments).
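If a zero is what the report needs for single-member groups, one option is to change the divisor rather than the data. The sketch below assumes the VARDEF= option on the PROC UNIVARIATE statement, which selects the divisor used for the variance (DF for n-1, which is the default, or N for n). The data set name test and the variable x simply follow your example; treat this as an illustration rather than a tested recipe for your reports.

```sas
/* One observation, as in the original example */
data test;
   x = 110;
run;

/* Default divisor (VARDEF=DF): variance = SSY/(n-1) = 0/0, reported as missing */
proc univariate data=test;
   var x;
run;

/* Divisor n (VARDEF=N): variance = SSY/n = 0/1 */
proc univariate data=test vardef=n;
   var x;
run;
```

Keep in mind that VARDEF=N switches every group to the divisor n, not just the n=1 group, so the standard deviations of the larger groups will change as well.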
Patrick.
______________________________ Reply Separator _________________________________
Subject: Standard Deviation
Author: "Chakravarthy; Venky" <Venky.Chakravarthy@PFIZER.COM> at internet
Date: 10/5/00 2:28 PM
Hi,
I noticed the following while generating some reports and am wondering
whether it is a bug.
One of the several groups in my data set has n=1 and the lone member has a
value of 110. The standard deviation produced from PROC UNIVARIATE for this
group is a missing value (.). Is this correct? Should it not be zero (0)? If
one goes by the definition of standard deviation, then std = sqrt of [(square
of (110-110))/1], which equals 0. Consider the following, which is equivalent
to my data.
I tested this on 6.12 NT and 8.0 Unix with identical results.
data test;
x=110;
run;
proc univariate data=test;
run;
This results in the following output ( I have retained only the relevant
portions):
N             1    Sum Wgts      1
Mean        110    Sum         110
Std Dev       .    Variance      .
However, when I used the STD function in a data _null_ step it resulted in
the correct value.
data _null_;
std=std(110,110);
put std=;
run;
STD=0
NOTE: The DATA statement used 0.02 seconds.
Any thoughts on why the PROC UNIVARIATE standard deviation would be
different and perhaps incorrect? Admittedly this is an unusual case, but it
could happen when we look at subpopulations of subpopulations in a
clinical trials environment.
Thanks.
Venky
venky.chakravarthy@pfizer.com