Date: Tue, 9 Sep 2003 14:09:42 +0100
Reply-To: Roland
Sender: "SAS(r) Discussion"
From: Roland
Organization: Universe Monitors
Subject: Re: stats question on quartiles

Thanks.

"Groeneveld, Jim" <jim.groeneveld@VITATRON.COM> wrote in message news:81BFA8F7807F1349AD6C16AD00A1AB9BABCE5F@AMSM1BMSGM01.ent.core.medtronic.com...
> Hi Roland,
>
> There is an even number of values: 20. So a median (P50) would be the
> mean of _val10 and _val11. Likewise, other percentiles may fall between
> two subsequent values and become the mean of those two. Q1 is the value
> with, in this case, 5 values below it and 15 above it, so it is not the
> 5th value itself but lies between the 5th and the 6th value. A similar
> approach applies to Q3.
>
> Regards - Jim.
>
> Y. (Jim) Groeneveld MSc
> Biostatistician
> Vitatron B.V.
> Meander 1051
> 6825 MJ Arnhem
> The Netherlands
> +31/0 26 376 7365; fax 7305
> Jim.Groeneveld@Vitatron.com
> www.vitatron.com
>
>
> -----Original Message-----
> From: Roland [mailto:roland@RASHLEIGH-BERRY.FSNET.CO.UK]
> Sent: Tuesday, September 09, 2003 14:43
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: stats question on quartiles
>
>
> If I have a list of 20 values like this:
> _val1=19
> _val2=104
> _val3=147
> _val4=167
> _val5=186
> _val6=224
> _val7=226
> _val8=299
> _val9=299
> _val10=302
> _val11=306
> _val12=332
> _val13=337
> _val14=340
> _val15=357
> _val16=361
> _val17=365
> _val18=392
> _val19=395
> _val20=443
>
> then when I do a proc univariate and look at the Q3 and Q1 values it gives
> me Q1=205 and Q3=359.
>
> I would have thought it would be Q1=186 (fifth observation) and Q3=357 (15th
> observation). Why is it averaging with the next observation along?
>
>
>
> This is the code I ran if you want to copy and paste.
>
> data test(keep=val);
>   array _val {20} 8 (20*0);
>
>   *- assign a random value to all the cells and display -;
>   *- and calculate the sum and mean -;
>   sum=0;
>   min=9**99;
>   max=0;
>   ramax=dim(_val);
>   do i=1 to 20;
>     _val(i)=floor(ranuni(9)*500);
>     put _val(i)=;
>     sum=sum+_val(i);
>     if _val(i)>max then max=_val(i);
>     if _val(i)<min then min=_val(i);
>   end;
>   mean=sum/dim(_val);
>
>   *- bubble sort the values -;
>   top=ramax;
>   swapdone=1;
>   do while(top > 1 and swapdone);
>     swapdone=0;
>     do i=1 to (top-1);
>       if _val(i)>_val(i+1) then do;
>         store=_val(i+1);
>         _val(i+1)=_val(i);
>         _val(i)=store;
>         swapdone=1;
>       end;
>     end;
>     top=top-1;
>   end;
>
>   *- display the final sorted order and calculate more stats-;
>   cssq=0;
>   ussq=0;
>   do i=1 to 20;
>     ussq=ussq+_val(i)**2;
>     cssq=cssq+(_val(i)-mean)**2;
>     put _val(i)=;
>     val=_val(i);
>     output;
>   end;
>   var=cssq/(ramax-1);
>   std=var**0.5;
>   stdmean=std/ramax**0.5;
>   covar=std/mean;
>   range=max-min;
>   *- odd count takes the middle value, even count averages the middle two -;
>   if mod(dim(_val),2) then median=_val(ceil(ramax/2));
>   else median=(_val(ramax/2)+_val(ramax/2+1))/2;
>   put sum= mean= median= min= max= ussq= cssq= var= std= stdmean= covar= range=;
> run;
>
> proc univariate data=test;
>   var val;
> run;
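
For anyone reading this in the archive: the averaging Jim describes corresponds to the default percentile definition in PROC UNIVARIATE (PCTLDEF=5), which takes the mean of x(j) and x(j+1) whenever n*p is a whole number j. The sketch below is only an illustration, not part of the original exchange. It re-enters the 20 sorted values from the post and runs the procedure under both the default definition and PCTLDEF=3, which takes x(j) itself without averaging. The output data set names q_def5 and q_def3 are made up for the example.

```sas
*- the 20 sorted values listed in the original post -;
data test;
  input val @@;
  datalines;
19 104 147 167 186 224 226 299 299 302
306 332 337 340 357 361 365 392 395 443
;
run;

*- default definition, PCTLDEF=5: averages x(j) and x(j+1) when n*p is a whole number -;
proc univariate data=test noprint pctldef=5;
  var val;
  output out=q_def5 q1=q1 median=median q3=q3;
run;

*- PCTLDEF=3: takes x(j) itself when n*p is a whole number, no averaging -;
proc univariate data=test noprint pctldef=3;
  var val;
  output out=q_def3 q1=q1 median=median q3=q3;
run;

proc print data=q_def5; run;
proc print data=q_def3; run;
```

With n=20, n*p is exactly 5 for Q1 and 15 for Q3, so under the documented definitions the first run should report Q1=(186+224)/2=205 and Q3=(357+361)/2=359, as in the original post, and the second should report Q1=186 and Q3=357, the values Roland expected.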
