Date: Tue, 9 Sep 2003 13:42:58 +0100
Reply-To: Roland <roland@RASHLEIGH-BERRY.FSNET.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Roland <roland@RASHLEIGH-BERRY.FSNET.CO.UK>
Organization: Universe Monitors
Subject: stats question on quartiles
If I have a list of 20 values like this:
_val1=19
_val2=104
_val3=147
_val4=167
_val5=186
_val6=224
_val7=226
_val8=299
_val9=299
_val10=302
_val11=306
_val12=332
_val13=337
_val14=340
_val15=357
_val16=361
_val17=365
_val18=392
_val19=395
_val20=443
then when I run PROC UNIVARIATE and look at the Q1 and Q3 values, it gives
me Q1=205 and Q3=359.
I would have expected Q1=186 (the 5th observation) and Q3=357 (the 15th
observation). Instead, each quartile is the average of that observation and
the next one: (186+224)/2=205 and (357+361)/2=359. Why is it averaging with
the next observation along?
This is the code I ran if you want to copy and paste.
data test(keep=val);
  array _val {20} 8 (20*0);
  *- assign a random value to all the cells and display -;
  *- and calculate the sum and mean -;
  sum=0;
  min=9**99;
  max=0;
  ramax=dim(_val);
  do i=1 to 20;
    _val(i)=floor(ranuni(9)*500);
    put _val(i)=;
    sum=sum+_val(i);
    if _val(i)>max then max=_val(i);
    if _val(i)<min then min=_val(i);
  end;
  mean=sum/dim(_val);
  *- bubble sort the values -;
  top=ramax;
  swapdone=1;
  do while(top > 1 and swapdone);
    swapdone=0;
    do i=1 to (top-1);
      if _val(i)>_val(i+1) then do;
        store=_val(i+1);
        _val(i+1)=_val(i);
        _val(i)=store;
        swapdone=1;
      end;
    end;
    top=top-1;
  end;
  *- display the final sorted order and calculate more stats -;
  cssq=0;
  ussq=0;
  do i=1 to 20;
    ussq=ussq+_val(i)**2;
    cssq=cssq+(_val(i)-mean)**2;
    put _val(i)=;
    val=_val(i);
    output;
  end;
  var=cssq/(ramax-1);
  std=var**0.5;
  stdmean=std/ramax**0.5;
  covar=std/mean;
  range=max-min;
  *- odd count: the median is the middle value, at position floor(n/2)+1 -;
  if mod(dim(_val),2) then median=_val(floor(ramax/2)+1);
  else median=(_val(ramax/2)+_val(ramax/2+1))/2;
  put sum= mean= median= min= max= ussq= cssq= var= std= stdmean= covar=
      range=;
run;
proc univariate data=test;
  var val;
run;
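
A possible explanation, offered as a sketch to check rather than a definitive answer: PROC UNIVARIATE's default percentile definition is PCTLDEF=5 (empirical distribution function with averaging). Under that definition, when n*p is a whole number the percentile is the average of x(j) and x(j+1). Here n*p is whole for both quartiles (20*0.25=5 and 20*0.75=15), which would account for the averaging. The code below assumes the TEST data set created above. The output data set names DEF5 and DEF3 and the output variable names Q1 and Q3 are made up for illustration. It reruns the procedure under PCTLDEF=5 and PCTLDEF=3 (no averaging) so the two definitions can be compared in the log.

```sas
*- default definition 5: averages x(j) and x(j+1) when n*p is a whole number -;
proc univariate data=test pctldef=5 noprint;
  var val;
  output out=def5 q1=q1 q3=q3;
run;

*- definition 3: takes x(j) when n*p is a whole number, with no averaging -;
proc univariate data=test pctldef=3 noprint;
  var val;
  output out=def3 q1=q1 q3=q3;
run;

*- write both sets of quartiles to the log for comparison -;
data _null_;
  set def5(in=in5) def3;
  if in5 then put 'PCTLDEF=5: ' q1= q3=;
  else put 'PCTLDEF=3: ' q1= q3=;
run;
```

With the 20 values listed above, PCTLDEF=3 should give Q1=186 and Q3=357, the 5th and 15th observations, while PCTLDEF=5 should reproduce Q1=205 and Q3=359.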