LISTSERV at the University of Georgia
Date:   Thu, 6 Apr 2006 17:07:12 +0200
Reply-To:   Marta García-Granero <biostatistics@terra.es>
Sender:   "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:   Marta García-Granero <biostatistics@terra.es>
Organization:   Asesoría Bioestadística
Subject:   Tutorial on nonparametric tests(1)
In-Reply-To:   <20060404130017.19624.qmail@web81310.mail.mud.yahoo.com>
Content-Type:   text/plain; charset=ISO-8859-1

Hi everybody

Now and then I have to answer the same questions concerning nonparametric tests. I have started writing a series of tutorials that, after some modifications, will go to Ray's page.

This is the first draft of the first tutorial: "All you wanted to know about Kruskal-Wallis tests in one-factor designs: data requirements, correct summary measures & multiple comparisons".

I'd like to receive any suggestions on how to improve it. It's quite long, so skip the message if you are not interested in the topic; I can't assume the responsibility of boring some listers to death ;)

This message is best viewed in a fixed-width font such as Courier.

KRUSKAL-WALLIS TEST

Some knowledge of ONEWAY ANOVA is assumed (what a post-hoc test or a contrast is, why the Levene test matters...)

* Sample dataset *.
DATA LIST FREE/actype glucose (2 F8.0).
BEGIN DATA
1 51 1 56 1 58 1 60 1 62 1 63 1 65 1 68 1 72 1 73
2 60 2 65 2 66 2 68 2 68 2 69 2 73 2 75 2 78 2 80
3 69 3 73 3 74 3 78 3 79 3 79 3 82 3 85 3 87 3 88
4 70 4 75 4 76 4 77 4 79 4 80 4 82 4 86 4 88 4 89
END DATA.
VALUE LABEL actype 1'Control' 2'Respiratory' 3'Metabolic' 4'Mixed'.

Data requirements for the Kruskal-Wallis test: the group distributions must be similar in shape, which means that dispersion has to be considered too. See Donald W. Zimmerman, "Statistical Significance Levels of Nonparametric Tests Biased by Heterogeneous Variances of Treatment Groups", Journal of General Psychology, Oct 2000. Available at: http://www.findarticles.com/p/articles/mi_m2405/is_4_127/ai_68025177

We also need some descriptives. If data are not normally distributed, then mean & standard deviation are a BAD idea. We'll use median & interquartile range instead.

* Exploratory Data Analysis (EDA) *.
EXAMINE VARIABLES=glucose BY actype
 /PLOT BOXPLOT SPREADLEVEL(1)
 /COMPARE GROUP
 /PERCENTILES (25,50,75)
 /STATISTICS NONE
 /NOTOTAL.
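For readers who want to check the medians and quartiles outside SPSS, here is a minimal Python sketch. It assumes the (n+1)p interpolation rule (Python's "exclusive" method), which should correspond to the weighted-average percentiles that EXAMINE reports by default; that correspondence is an assumption, not something stated in the tutorial. The names `glucose` and `quartile_summary` are illustrative only.

```python
from statistics import quantiles

# Sample dataset from the tutorial, keyed by acidosis type.
glucose = {
    "Control":     [51, 56, 58, 60, 62, 63, 65, 68, 72, 73],
    "Respiratory": [60, 65, 66, 68, 68, 69, 73, 75, 78, 80],
    "Metabolic":   [69, 73, 74, 78, 79, 79, 82, 85, 87, 88],
    "Mixed":       [70, 75, 76, 77, 79, 80, 82, 86, 88, 89],
}

def quartile_summary(values):
    """Return Q1, median, Q3 and IQR using the (n+1)p interpolation rule."""
    q1, median, q3 = quantiles(values, n=4, method="exclusive")
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}

summary = {group: quartile_summary(values) for group, values in glucose.items()}
for group, s in summary.items():
    print(f"{group:12s} median={s['median']:.2f}  "
          f"Q1={s['q1']:.2f}  Q3={s['q3']:.2f}  IQR={s['iqr']:.2f}")
```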

* If 95% CI for all group medians are also wanted *.
TEMPORARY.
COMPUTE one = 1.
RATIO STATISTICS glucose WITH one BY actype (ASCENDING)
 /PRINT = CIN(95) MEDIAN .

Now, Kruskal-Wallis test.

* Kruskal-Wallis test *.
NPAR TESTS
 /K-W=glucose BY actype(1 4).

It is significant, but SPSS doesn't offer post-hoc methods.
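As a cross-check outside SPSS, the following Python sketch shows what the test computes: midranks over the pooled sample, the mean rank per group, and the H statistic with the usual tie correction. It is only an illustration under the textbook formula for H; the helper name `midranks` is made up for this example.

```python
from collections import Counter

groups = {
    "Control":     [51, 56, 58, 60, 62, 63, 65, 68, 72, 73],
    "Respiratory": [60, 65, 66, 68, 68, 69, 73, 75, 78, 80],
    "Metabolic":   [69, 73, 74, 78, 79, 79, 82, 85, 87, 88],
    "Mixed":       [70, 75, 76, 77, 79, 80, 82, 86, 88, 89],
}

def midranks(values):
    """Map each distinct value to its average rank; also return tie counts."""
    counts = Counter(values)
    ranks, start = {}, 1
    for value in sorted(counts):
        t = counts[value]
        ranks[value] = start + (t - 1) / 2  # mean of ranks start..start+t-1
        start += t
    return ranks, counts

pooled = [x for xs in groups.values() for x in xs]
ranks, counts = midranks(pooled)
n_total = len(pooled)

rank_sums = {g: sum(ranks[x] for x in xs) for g, xs in groups.items()}
mean_ranks = {g: rank_sums[g] / len(xs) for g, xs in groups.items()}

# H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), then divided by the tie factor.
h = (12 / (n_total * (n_total + 1))
     * sum(rank_sums[g] ** 2 / len(xs) for g, xs in groups.items())
     - 3 * (n_total + 1))
tie_factor = 1 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
h_corrected = h / tie_factor

for g, r in mean_ranks.items():
    print(f"{g:12s} mean rank = {r:.2f}")
print(f"H = {h:.4f}, H corrected for ties = {h_corrected:.4f}, df = {len(groups) - 1}")
```

H is compared with a chi-square distribution with K-1 degrees of freedom (3 here).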

MULTIPLE COMPARISON METHODS

(1) Based on pairwise Mann-Whitney's U tests

The number of pairwise comparisons is K(K-1)/2 (K being the number of groups):

N Comp=4·3/2 = 6.

NPAR TESTS
 /M-W= glucose BY actype(1 2)
 /M-W= glucose BY actype(1 3)
 /M-W= glucose BY actype(1 4)
 /M-W= glucose BY actype(2 3)
 /M-W= glucose BY actype(2 4)
 /M-W= glucose BY actype(3 4). /* 6 MW tests *.
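With more groups, typing the K(K-1)/2 subcommands by hand becomes tedious. As a rough sketch (not a replacement for a proper SPSS macro), a few lines of Python can enumerate the pairs and write the syntax; `mann_whitney_syntax` is a hypothetical helper name.

```python
from itertools import combinations

def mann_whitney_syntax(variable, factor, levels):
    """Build one NPAR TESTS command with an /M-W subcommand per pair of levels."""
    pairs = list(combinations(levels, 2))  # K(K-1)/2 unordered pairs
    subcommands = [f" /M-W= {variable} BY {factor}({a} {b})" for a, b in pairs]
    return "NPAR TESTS\n" + "\n".join(subcommands) + "."

syntax = mann_whitney_syntax("glucose", "actype", [1, 2, 3, 4])
print(syntax)
```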

(There is a MACRO at Ray's site that automates the task: http://www.spsstools.net/Syntax/T-Test/MultipleMann-WhitneyTests.txt).

Now, get all U statistics and exact p-values (you can use OMS to extract them if SPSS 12 or newer is used):

               U       P
----------------------------
Cont vs Resp  20.5  0.02323
Cont vs Met    2.5  0.00004
Cont vs Mix    2    0.00004
Resp vs Met   14.5  0.00520
Resp vs Mix   12    0.00288
Met  vs Mix   45    0.73936
----------------------------

(A) Using U statistics: Dwass test, modified by Gabriel (as described by Sokal&Rohlf's "Biometry").

This method is asymptotic (sample sizes over 20 or so) and valid for balanced designs only (n1=n2=...=n).

Compute U' = n²-U for all the pairwise comparisons:

               U     U'
----------------------------
Cont vs Resp  20.5  79.5
Cont vs Met    2.5  97.5
Cont vs Mix    2    98
Resp vs Met   14.5  85.5
Resp vs Mix   12    88
Met  vs Mix   45    55
----------------------------
(n=10; U'=100-U)

Now compute the following critical value:

U(alpha)=n²/2+Q(alpha,k)·n·SQRT((2·n+1)/24) (rounded to nearest half point).

U(0.05)=100/2 + 3.63·10·SQRT(21/24)=83.96 -> 84

All U' GE U(alpha) are significant at alpha level:

               U'    Sig.
----------------------------
Cont vs Resp  79.5   NS
Cont vs Met   97.5   *
Cont vs Mix   98     *
Resp vs Met   85.5   *
Resp vs Mix   88     *
Met  vs Mix   55     NS
----------------------------

(Control = Respiratory) < (Metabolic = Mixed)
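The Dwass-Gabriel steps above can be sketched in Python as follows. This is only an illustration of the arithmetic in this message: the function name `dwass_critical_u` is made up, and the Studentized Range value Q(0.05, k=4) = 3.63 is typed in from the table rather than computed.

```python
from math import sqrt

def dwass_critical_u(n, q):
    """U(alpha) = n^2/2 + Q(alpha,k) * n * sqrt((2n+1)/24), rounded to the nearest half."""
    raw = n ** 2 / 2 + q * n * sqrt((2 * n + 1) / 24)
    return round(raw * 2) / 2

n = 10          # common group size (balanced design)
q_05_k4 = 3.63  # Studentized Range critical value, alpha=0.05, k=4, infinite DF

# U statistics from the six pairwise Mann-Whitney tests.
u_stats = {
    "Cont vs Resp": 20.5, "Cont vs Met": 2.5, "Cont vs Mix": 2,
    "Resp vs Met": 14.5, "Resp vs Mix": 12, "Met vs Mix": 45,
}

u_crit = dwass_critical_u(n, q_05_k4)
u_prime = {pair: n ** 2 - u for pair, u in u_stats.items()}
significant = {pair: up >= u_crit for pair, up in u_prime.items()}

print(f"U(0.05) = {u_crit}")
for pair, up in u_prime.items():
    print(f"{pair:13s} U' = {up:5.1f}  {'*' if significant[pair] else 'NS'}")
```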

Studentized Range critical values for k groups & infinite DF:

k:      2     3     4     5     6     7     8     9    10    11
0.05  2.77  3.31  3.63  3.86  4.03  4.17  4.29  4.39  4.47  4.55
0.01  3.64  4.12  4.40  4.60  4.76  4.88  4.99  5.08  5.16  5.23

k:     12    13    14    15    16    17    18    19    20
0.05  4.62  4.68  4.74  4.80  4.85  4.89  4.93  4.97  5.01
0.01  5.29  5.35  5.40  5.45  5.49  5.54  5.57  5.61  5.65

(B) Using p-values (exact) from MW tests (good for small samples & unbalanced designs)

If left unadjusted, the p-values are LSD (Least Significant Difference) comparisons. The probability of at least one type I error (the ExperimentWise Error Rate, EWER) increases dramatically with the number of tests:

K (N.comp)  EWER
------------------
2 ( 1)      0.05
3 ( 3)      0.11
4 ( 6)      0.21
5 (10)      0.30
6 (15)      0.39
7 (21)      0.47
------------------
(Source: Bailar&Mosteller "Medical Uses of Statistics" NEJM books, p244).

P-value adjustment methods:

- Bonferroni & Sidak (one step)
- Holm, Hollander (Holm-Sidak) & Finner (step-down)
- Hommel, Hochberg & Simes (step-up)

And more (Lui #1 & #2 ...)

See: http://www.spsstools.net/Syntax/Unclassified/AdjustedP-ValuesAlgorithms.txt for code to adjust these p-values. The output for this example is:

                          One-step    One-step
Rank  Nr.   Original     Bonferroni    Sidak
____  ___  ___________   __________  _________

  5    1      .0232         .1394      .1315
  1    2      .0000         .0002      .0002
  2    3      .0000         .0002      .0002
  4    4      .0052         .0312      .0308
  3    5      .0029         .0173      .0172
  6    6      .7394        1.0000      .9997

                         Step-down   Step-down   Step-down
Rank  Nr.   Original       Holm        Sidak      Finner
____  ___  ___________   _________   _________   _________

  5    1      .0232        .0465       .0459       .0278
  1    2      .0000        .0002       .0002       .0002
  2    3      .0000        .0002       .0002       .0002
  4    4      .0052        .0156       .0155       .0078
  3    5      .0029        .0115       .0115       .0058
  6    6      .7394        .7394       .7394       .7394

                          Step-up     Step-up     Step-up
Rank  Nr.   Original      Hommel     Hochberg      Simes
____  ___  ___________   _________   _________   _________

  5    1      .0232        .0683       .0465       .0279
  1    2      .0000        .0003       .0002       .0001
  2    3      .0000        .0003       .0002       .0001
  4    4      .0052        .0191       .0156       .0078
  3    5      .0029        .0141       .0115       .0058
  6    6      .7394       1.0000       .7394       .7394
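To make the adjustments less of a black box, here is a small Python sketch of four of the methods (Bonferroni, one-step Sidak, Holm, Hochberg), written from their standard textbook definitions. It is not the code at the link above, and the function names are illustrative. The input is the six exact Mann-Whitney p-values in the order of the "Nr." column.

```python
def bonferroni(pvals):
    """One-step Bonferroni: multiply each p by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]

def sidak(pvals):
    """One-step Sidak: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1 - (1 - p) ** m for p in pvals]

def holm(pvals):
    """Step-down Holm: smallest p times m, next times m-1, ...; enforce monotonicity."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def hochberg(pvals):
    """Step-up Hochberg: same multipliers as Holm, but working down from the largest p."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted, running_min = [0.0] * m, 1.0
    for rank, i in enumerate(order):
        running_min = min(running_min, (rank + 1) * pvals[i])
        adjusted[i] = running_min
    return adjusted

# Exact p-values from the six Mann-Whitney tests, in "Nr." order.
p_exact = [0.02323, 0.00004, 0.00004, 0.00520, 0.00288, 0.73936]

for name, method in [("Bonferroni", bonferroni), ("Sidak", sidak),
                     ("Holm", holm), ("Hochberg", hochberg)]:
    print(f"{name:10s}", " ".join(f"{p:.4f}" for p in method(p_exact)))
```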

Using one-step methods (conservative), results agree with Dwass test. Using step-down methods, like Holm (more sensitive) we get the following:

Control < Respiratory < (Metabolic = Mixed)

(2) Using Kruskal-Wallis Mean Ranks (large samples -asymptotic- methods)

ACTYPE         N   Mean ranks
Control       10      8.00
Respiratory   10     16.10
Metabolic     10     28.30
Mixed         10     29.60
Total         40

All are based on the following statistic, which is asymptotically normal:

            |MeanRanki - MeanRankj|
Z = ------------------------------------------
     SQRT( (1/Ni + 1/Nj) * Nt*(Nt+1)/12 )

[se_rank=SQRT((1/n(i)+1/n(j))*nt*(nt+1)/12)]

Unadjusted Z values are LSD comparisons. We can use the Bonferroni or Sidak adjustment (the so-called Dunn & Dunn-Sidak methods) or, again, the Studentized Range distribution for a one-step (Tukey-Kramer) or step-down (S-N-K) method. The last method is unreliable for unbalanced designs. If only comparisons with a Control group are needed, then the Dunnett test can be used.
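As an illustration of the Z statistic and of the LSD and Dunn-Sidak variants, the following Python sketch works from the mean ranks of this example. It uses the formula exactly as given (no tie correction in the standard error) and gets the two-sided normal p-value from the complementary error function; the function names are illustrative.

```python
from itertools import combinations
from math import erfc, sqrt

# Kruskal-Wallis mean ranks and group sizes from the example.
mean_ranks = {"Control": 8.00, "Respiratory": 16.10, "Metabolic": 28.30, "Mixed": 29.60}
sizes = {group: 10 for group in mean_ranks}
n_total = sum(sizes.values())

def rank_z(g1, g2):
    """Z = |MeanRank_i - MeanRank_j| / sqrt((1/n_i + 1/n_j) * Nt*(Nt+1)/12)."""
    se_rank = sqrt((1 / sizes[g1] + 1 / sizes[g2]) * n_total * (n_total + 1) / 12)
    return abs(mean_ranks[g1] - mean_ranks[g2]) / se_rank

def two_sided_p(z):
    """Two-sided p-value for a standard normal Z."""
    return erfc(z / sqrt(2))

pairs = list(combinations(mean_ranks, 2))
results = {}
for g1, g2 in pairs:
    z = rank_z(g1, g2)
    p_lsd = two_sided_p(z)                   # unadjusted (LSD)
    p_sidak = 1 - (1 - p_lsd) ** len(pairs)  # Dunn-Sidak adjustment
    results[(g1, g2)] = (z, p_lsd, p_sidak)
    print(f"{g1:11s} vs {g2:11s} Z={z:6.3f}  p(LSD)={p_lsd:.4f}  p(Dunn-Sidak)={p_sidak:.4f}")
```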

There is MATRIX code that calculates the Dunnett, LSD, Dunn-Sidak, Tukey-Kramer & SNK tests (available on request). The output is:

KRUSKAL-WALLIS TEST & MULTIPLE COMPARISON METHODS
Assumptions: Large samples & distributions similar in shape

Comparisons (adjusted): Dunnett method (1st group is reference)
Control  vs Respirat   NS
Control  vs Metabol    **
Control  vs Mixed      **

Sorted Mean Ranks (in ascending order)
Control     8,00
Respirat   16,10
Metabol    28,30
Mixed      29,60

Comparisons (unadjusted): LSD test
Control  vs Respirat   NS
Control  vs Metabol    **
Control  vs Mixed      **
Respirat vs Metabol    *
Respirat vs Mixed      **
Metabol  vs Mixed      NS

Comparisons (adjusted): Dunn-Sidak test
Control  vs Respirat   NS
Control  vs Metabol    **
Control  vs Mixed      **
Respirat vs Metabol    NS
Respirat vs Mixed      NS
Metabol  vs Mixed      NS

Comparisons (adjusted): Tukey-Kramer method
Control  vs Respirat   NS
Control  vs Metabol    **
Control  vs Mixed      **
Respirat vs Metabol    NS
Respirat vs Mixed      *
Metabol  vs Mixed      NS

Comparisons (adjusted): Step-down SNK method
Control  vs Respirat   NS
Control  vs Metabol    **
Control  vs Mixed      **
Respirat vs Metabol    *
Respirat vs Mixed      *
Metabol  vs Mixed      NS

LSD & SNK tests: (Control = Respiratory) < (Metabolic = Mixed)
Dunn-Sidak & Tukey tests give intransitive solutions.

This draft is not complete. I'm working on code for orthogonal contrasts for KW tests, partitioning the total rank sum of squares (Bennett BM (1968) Rank-order test of linear hypothesis. J R Statist Soc, B, 30, 483-9). It will take a long time, because right now I know how to compute them by hand, but not how to turn that into MATRIX code.

Regards

Marta
mailto:biostatistics@terra.es

