LISTSERV at the University of Georgia
Date:         Tue, 9 Dec 2003 15:15:10 -0500
Reply-To:     "DePuy, Venita" <depuy001@DCRI.DUKE.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "DePuy, Venita" <depuy001@DCRI.DUKE.EDU>
Subject:      Re: Wilcoxon normal- and t-approxiamtion
Comments: To: louise <louise@SELSKABET.ORG>
Content-Type: text/plain

Hi Louise:

The difference between the z and t approximations in the NPAR1WAY output is the continuity correction (included in z, not in t). The correction is there because the large-sample approximation replaces a discrete distribution (that of the rank sum) with a continuous one (the normal), and the continuity correction adjusts for that mismatch. It shrinks the numerator of the test statistic toward zero, making the result more conservative. In my personal experience, the difference between the z and t results is usually < .01. Also, many packages don't offer the continuity correction. One way around the whole issue is to choose the 'exact' option, which doesn't use any approximations but is VERY computationally intensive, so it is practical only for very small sample sizes!

Wilcoxon assumptions: There are three primary assumptions of the Wilcoxon-Mann-Whitney test:

1) Each sample is randomly selected from the specific population, and the observations within each sample are independent and identically distributed.

2) The two samples are continuous and independent of each other. (If populations are not independent, consider the Wilcoxon signed rank test.)

3) The populations may differ in their location (mean or median), but not in their distributional shape or spread. (If this assumption is questionable, consider the Lepage or Kolmogorov-Smirnov tests.)

Also, a key point - the spreads (variances) of the two populations NEED to be the same or really similar; if they're not, the t test may be a better option. (D. Zimmerman has several papers published along these lines).

Model details - For exact calculations: To compute the test statistic W, the combined sample of N = m + n observations (m X-values and n Y-values) is ordered from least to greatest. Let S1 denote the rank of the lowest Y value, Y1, and Sn the rank of the highest Y value, Yn. Tied observations each receive the average of the ranks they span; for example, if the third and fourth observations have the same value, they both receive the rank of 3.5. W is the sum of the ranks assigned to the Y values (W = S1 + ... + Sn). That number is compared to a critical value, usually from tables in Hollander & Wolfe or other texts if doing it by hand. SAS uses an algorithm to generate those numbers and get p values.
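As a rough illustration of the ranking step described above, here is a minimal Python sketch (not SAS code, and not how NPAR1WAY is implemented). The function name wilcoxon_rank_sum and the sample values are made up for the example; it only shows the pooled ranking with average ranks for ties and the sum of the Y ranks.

```python
def wilcoxon_rank_sum(x, y):
    """Return W, the sum of the ranks of the y values in the pooled sample.

    Hypothetical helper for illustration: ties receive the average of the
    ranks they span, as described in the text.
    """
    # Pool the two samples, remembering which group each value came from.
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y],
                    key=lambda pair: pair[0])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the end of the run of values tied with pooled[i].
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        # Positions i..j hold ranks i+1..j+1; assign their average to all.
        average_rank = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    # W is the sum of the ranks that belong to the Y sample.
    return sum(r for r, (_, group) in zip(ranks, pooled) if group == "y")


if __name__ == "__main__":
    # Third and fourth pooled observations are tied, so each gets rank 3.5.
    print(wilcoxon_rank_sum([1, 5], [2, 5]))
```

In the small example, the pooled values 1, 2, 5, 5 get ranks 1, 2, 3.5, 3.5, so the Y values (2 and 5) contribute 2 + 3.5.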

Large Sample Approximations: [the formula for W* was an embedded object and is not preserved in this archive] W* is then approximately normally distributed; compare it to z(alpha) or z(alpha/2), depending on the hypotheses to be tested. (This formula does not include corrections for ties or continuity corrections.)
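For readers who want to try the large-sample approximation, here is a minimal Python sketch. It assumes the standard textbook standardization, W* = (W - n(N+1)/2) / sqrt(m n (N+1) / 12), with n the number of Y values, m the number of X values, and N = m + n; this may or may not match the formula in the original message exactly. The function name and the optional continuity flag are hypothetical additions for illustration, and no tie correction is applied.

```python
import math


def wilcoxon_normal_approx(w, m, n, continuity=False):
    """Return (z, two-sided p) for rank sum w of the n Y values.

    Assumes the textbook mean n(N+1)/2 and variance m*n*(N+1)/12 with no
    tie correction. The continuity flag is an illustrative option: it moves
    the numerator 0.5 toward zero (never past zero).
    """
    big_n = m + n
    mean_w = n * (big_n + 1) / 2.0
    sd_w = math.sqrt(m * n * (big_n + 1) / 12.0)
    diff = w - mean_w
    if continuity:
        # Shrink the numerator toward zero by 0.5, which is why the
        # corrected result is more conservative.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    z = diff / sd_w
    # Two-sided normal p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


if __name__ == "__main__":
    print(wilcoxon_normal_approx(15, 3, 3))
    print(wilcoxon_normal_approx(15, 3, 3, continuity=True))
```

With the continuity flag on, |z| is always a little smaller, so the p value is a little larger.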

Hope this helps; let me know if there are details I haven't provided. (Just finished writing an article on this topic.) -Venita

> ----------
> From: louise[SMTP:louise@SELSKABET.ORG]
> Reply To: louise
> Sent: Tuesday, December 09, 2003 2:53 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Wilcoxon normal- and t-approxiamtion
>
> Hi,
> Using Wilcoxon's test, I have trouble choosing between the normal
> approximation and the t-approximation as the correct one. What exactly
> do the approximations represent? Please give an answer full of
> assumptions and model/algorithm details.
> Sincerely Louise
>

