LISTSERV at the University of Georgia
Date:         Tue, 29 Jan 2002 19:57:03 +0100
Reply-To:     Dave Sorensen <Dave.Sorensen@JUR.KU.DK>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dave Sorensen <Dave.Sorensen@JUR.KU.DK>
Subject:      SV: SV: POWER, dummy variable cell size, and logistic regression
Comments: To: Dale McLerran <stringplayer_2@yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"

Dale,

I can see that this is an example of where statistics is part art/part science. I guess I should not get too hung up on the difference between p=0.48 and p=0.60, especially if the cell size for the latter estimate is small and there is good theory to support a relationship.

Thanks for sharing your wisdom once again,

Dave

-----Original message-----
From: Dale McLerran [mailto:stringplayer_2@yahoo.com]
Sent: Tuesday, January 29, 2002 6:38 PM
To: Dave Sorensen; SAS-L@LISTSERV.UGA.EDU
Subject: Re: SV: POWER, dummy variable cell size, and logistic regression

Dave,

The problem you have here is one reason why some folks advocate presenting the point estimate and confidence interval rather than a p-value. This is quite common in the field of epidemiology. By showing the "distribution of probable values" of the parameter estimate, the point is made that there may be a considerable probability that the parameter is indeed not equal to zero.

But for the classic statistical formulation, it is unfortunate to have to say that there is no way, short of collecting more data (you don't need to finish your dissertation anytime soon, do you?), that you can obtain a "significant" p-value for the divorced effect in your model. Even though the point estimates in your mock example are exactly the same, they are based on different sample sizes, so the smaller cells have larger standard errors. To keep type I error (the probability of rejecting the null hypothesis when the null hypothesis is true) fixed at alpha=0.05, you have to take a penalty on the type II error (the probability of failing to reject the null hypothesis when the null hypothesis is false).
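To put rough numbers on the type II penalty, here is a minimal sketch of the approximate power of a two-sided Wald test at alpha=0.05. It assumes, purely for illustration, that the true log odds ratio equals the common estimate 0.8109 and takes the standard errors from the quoted PROC LOGISTIC output further down. The names `assumed_effect`, `standard_errors`, and `wald_power` are made up for this example, and the normal approximation is only a sketch, not a formal power analysis.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.96

# Assumption for illustration: the true log odds ratio equals the estimate.
assumed_effect = 0.8109

# Standard errors as reported in the quoted PROC LOGISTIC output.
standard_errors = {"Mar": 0.3118, "Sep": 0.3536, "Div": 0.4564}


def wald_power(effect, se):
    """Approximate power of a two-sided Wald z-test for a given effect and SE."""
    shift = effect / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = {name: wald_power(assumed_effect, se) for name, se in standard_errors.items()}

for name, value in power.items():
    print(f"{name}: power ~ {value:.2f}, type II error ~ {1 - value:.2f}")
```

Under these assumptions the power falls as the cell gets smaller, and for the divorced group it drops below one half: the same effect is more likely to be missed than detected.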

I have always been of the opinion that it really does not matter whether one presents point estimate, standard error, and p-value, or point estimate and confidence interval. The pair (point estimate, confidence interval) and the pair (point estimate, p-value) are invertible functions of each other: if you know the values for one, you can recover the values for the other. Thus, I don't believe that presenting the point estimate and confidence interval really gains you anything.
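As a small illustration of that invertibility, the sketch below recovers the estimate, standard error, and Wald p-value from nothing but the 95% confidence limits. It assumes symmetric Wald limits (estimate plus or minus z times SE), which is what the quoted PROC LOGISTIC output reports. The function name `p_value_from_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def p_value_from_ci(lower, upper, level=0.95):
    """Recover (estimate, SE, two-sided p-value) from symmetric Wald limits."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    estimate = (lower + upper) / 2          # midpoint of a symmetric interval
    se = (upper - lower) / (2 * z_crit)     # half-width divided by critical value
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail probability
    return estimate, se, p


# Wald 95% confidence limits from the quoted output.
limits = {"Sep": (0.1180, 1.5039), "Div": (-0.0837, 1.7055)}
recovered = {name: p_value_from_ci(lo, hi) for name, (lo, hi) in limits.items()}

for name, (est, se, p) in recovered.items():
    print(f"{name}: estimate={est:.4f} SE={se:.4f} p={p:.4f}")
```

Up to rounding in the printed limits, this should reproduce the Estimate, Standard Error, and Pr > ChiSq columns for Sep and Div.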

Dale

--- Dave Sorensen <Dave.Sorensen@jur.ku.dk> wrote:
> Dale,
>
> Nice to hear from you again. I think you got my point, but let me just
> reiterate by way of an absurd example. Here's a mock dataset, N=250, with
> two variables: Marital Status (STATUS) and criminal participation (CP). The
> table below shows the frequency of the marital status categories, and the
> Criminal Participation (CP) rate associated with each status. Note that while
> the SINGLE subjects have a CP rate=40%, all other groups have a CP rate=60%.
> And all marital status groups have different cell sizes.
>
>                Sample
> STATUS        N      %     CP RATE
> Single       100    40%     0.40
> Married       75    30%     0.60
> Separated     50    20%     0.60
> Divorced      25    10%     0.60
>
> I make dummies out of Marital Status, and run: CP= Mar + Sep + Div (with
> single left out as the ref).
>
> The LOGISTIC Procedure
> Analysis of Maximum Likelihood Estimates
>
>                            Standard     Wald                      Wald 95%
> Parameter  DF  Estimate    Error     Chi-Square  Pr > ChiSq   Confidence Limits
> Intercept   1  -0.4055    0.2041      3.9456      0.0470     -0.8055  -0.00539
> Mar         1   0.8109    0.3118      6.7639      0.0093      0.1998   1.4221
> Sep         1   0.8109    0.3536      5.2608      0.0218      0.1180   1.5039
> Div         1   0.8109    0.4564      3.1565      0.0756     -0.0837   1.7055
>
> Even though the parameter estimates are identical, only the p-values for MAR
> and SEP are <0.05. Given this absurd mock data, is it correct for me to
> report a significant effect for MAR and SEP (as compared to the reference),
> but a non-significant effect for DIV? This seems strange. And the reason
> that I am pursuing this is because my primary interest is precisely in the
> difference between SEP and DIV (when each are compared to the reference
> category).
>
> Thanks again,
> Dave
>
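For readers who want to check the quoted numbers without SAS, here is a minimal sketch that rebuilds the dummy-variable estimates straight from the cell counts. It relies on the fact that, with a single categorical predictor, each coefficient is the log odds ratio against the reference cell and its Wald standard error is the square root of the sum of reciprocal cell counts. The helper names `counts` and `wald_vs_reference` are made up for this example.

```python
import math

# Mock table from the quoted message: status -> (n, CP rate).
cells = {
    "Single": (100, 0.40),     # reference category
    "Married": (75, 0.60),
    "Separated": (50, 0.60),
    "Divorced": (25, 0.60),
}


def counts(n, rate):
    """Return (events, non-events) for a cell of size n with the given CP rate."""
    events = round(n * rate)
    return events, n - events


def wald_vs_reference(cell, ref):
    """Log odds ratio, SE, Wald chi-square, and p-value of a cell versus ref."""
    a, b = counts(*cell)
    c, d = counts(*ref)
    estimate = math.log(a / b) - math.log(c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    chi_sq = (estimate / se) ** 2
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return estimate, se, chi_sq, p_value


results = {
    name: wald_vs_reference(cell, cells["Single"])
    for name, cell in cells.items()
    if name != "Single"
}

for name, (est, se, chi_sq, p) in results.items():
    print(f"{name:<10} {est:8.4f} {se:8.4f} {chi_sq:8.4f} {p:8.4f}")
```

The estimate is the same in every row because the CP rates are the same. Only the standard error changes, through the 1/a + 1/b terms of the smaller cells.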

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
