|
Thanks.
"Groeneveld, Jim" <jim.groeneveld@VITATRON.COM> wrote in message
news:81BFA8F7807F1349AD6C16AD00A1AB9BABCE5F@AMSM1BMSGM01.ent.core.medtronic.com...
> Hi Roland,
>
> There is an even number of values: 20. So the median (P50) is the mean
> of _val10 and _val11. Likewise, other percentiles may fall between two
> consecutive values and become the mean of those two. Q1 is the value below
> which 5 values lie and above which 15 lie in this case; so it is not the 5th
> value itself, but lies between the 5th and the 6th value: (186+224)/2 = 205.
> A similar approach applies to Q3: (357+361)/2 = 359.
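
A minimal sketch of the rule described above, assuming the TEST dataset from the original post below. As far as I know, PROC UNIVARIATE defaults to PCTLDEF=5 (empirical distribution function with averaging): with n=20 and p=0.25, n*p=5 is a whole number, so Q1 is the mean of the 5th and 6th values. PCTLDEF=3 uses the same distribution function without averaging, so it should report the 5th and 15th values themselves (186 and 357). Running both makes the difference visible:

```sas
/* Default definition: when n*p is an integer j, the percentile is */
/* the mean of the j-th and (j+1)-th ordered values.               */
proc univariate data=test pctldef=5;
  var val;
run;

/* Same empirical distribution function, but without averaging:    */
/* when n*p is an integer j, the percentile is the j-th value.     */
proc univariate data=test pctldef=3;
  var val;
run;
```

Neither definition is "the" correct one; they are different conventions for estimating a percentile from a small sample.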
>
> Regards - Jim.
>
> Y. (Jim) Groeneveld MSc
> Biostatistician
> Vitatron B.V.
> Meander 1051
> 6825 MJ Arnhem
> The Netherlands
> +31/0 26 376 7365; fax 7305
> Jim.Groeneveld@Vitatron.com
> www.vitatron.com
>
>
> -----Original Message-----
> From: Roland [mailto:roland@RASHLEIGH-BERRY.FSNET.CO.UK]
> Sent: Tuesday, September 09, 2003 14:43
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: stats question on quartiles
>
>
> If I have a list of 20 values like this:
> _val1=19
> _val2=104
> _val3=147
> _val4=167
> _val5=186
> _val6=224
> _val7=226
> _val8=299
> _val9=299
> _val10=302
> _val11=306
> _val12=332
> _val13=337
> _val14=340
> _val15=357
> _val16=361
> _val17=365
> _val18=392
> _val19=395
> _val20=443
>
> then when I run proc univariate and look at the Q3 and Q1 values, it gives
> me Q1=205 and Q3=359.
>
> I would have thought it would be Q1=186 (fifth observation) and Q3=357
> (15th observation). Why is it averaging with the next observation along?
>
>
>
> This is the code I ran if you want to copy and paste.
>
> data test(keep=val);
>   array _val {20} 8 (20*0);
>
>   *- assign a random value to all the cells and display -;
>   *- and calculate the sum and mean -;
>   sum=0;
>   min=9**99;
>   max=0;
>   ramax=dim(_val);
>   do i=1 to ramax;
>     _val(i)=floor(ranuni(9)*500);
>     put _val(i)=;
>     sum=sum+_val(i);
>     if _val(i)>max then max=_val(i);
>     if _val(i)<min then min=_val(i);
>   end;
>   mean=sum/ramax;
>
>   *- bubble sort the values -;
>   top=ramax;
>   swapdone=1;
>   do while(top > 1 and swapdone);
>     swapdone=0;
>     do i=1 to (top-1);
>       if _val(i)>_val(i+1) then do;
>         store=_val(i+1);
>         _val(i+1)=_val(i);
>         _val(i)=store;
>         swapdone=1;
>       end;
>     end;
>     top=top-1;
>   end;
>
>   *- display the final sorted order and calculate more stats-;
>   cssq=0;
>   ussq=0;
>   do i=1 to ramax;
>     ussq=ussq+_val(i)**2;
>     cssq=cssq+(_val(i)-mean)**2;
>     put _val(i)=;
>     val=_val(i);
>     output;
>   end;
>   var=cssq/(ramax-1);
>   std=var**0.5;
>   stdmean=std/ramax**0.5;
>   covar=std/mean;
>   range=max-min;
>   *- odd count: the middle value; even count: mean of the two middle values -;
>   if mod(ramax,2) then median=_val((ramax+1)/2);
>   else median=(_val(ramax/2)+_val(ramax/2+1))/2;
>   put sum= mean= median= min= max= ussq= cssq= var= std= stdmean= covar=
>       range=;
> run;
>
> proc univariate data=test;
> var val;
> run;
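
One way to cross-check the hand-calculated statistics that the data step writes to the log is to ask PROC MEANS for the same quantities. This is only a sketch, assuming the TEST dataset created above; the statistic keywords are standard PROC MEANS ones. Note that CV is reported as a percentage, so it should be 100 times the covar value computed in the data step:

```sas
/* Request the statistics that the data step computes by hand:      */
/* uss/css are the uncorrected/corrected sums of squares (ussq,     */
/* cssq), stderr is the standard error of the mean (stdmean).       */
proc means data=test n sum mean median min max uss css var std stderr cv range;
  var val;
run;
```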
|