Date: Tue, 17 Feb 2009 09:41:25 -0800
Reply-To: "Nordlund, Dan (DSHS/RDA)" <NordlDJ@DSHS.WA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Nordlund, Dan (DSHS/RDA)" <NordlDJ@DSHS.WA.GOV>
Subject: Re: How can caculate the chi-Square in SAS
In-Reply-To: <b7a7fa630902170817x68932708t9e687321b65f05de@mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Joe Matise
> Sent: Tuesday, February 17, 2009 8:18 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: How can caculate the chi-Square in SAS
>
> I'll jump into this discussion on the 'asking' side ... :)
>
> I am using Chi-Square to test reporting level scores against national
> scores (Chi-Square was the direction of the client, so no, I can't use
> T-test or others). I have already generated the summary score dataset
> (output from PROC MEANS), and appended national scores to the dataset.
> Currently, I process this via direct math - i.e., I wrote out a macro to
> do, by hand, the chi square test (using effective base size and
> percentage as inputs for each of the four cells). I imagine the code
> would be much easier to read if I can do it not by hand (i.e., using
> PROBCHI or a PROC on the full dataset).
>
> I am limited in this largely by my lack of statistical background - I
> understand basic statistics to some extent, but college was a long time
> ago, and I'm really more of a programmer than a statistician :) PROBCHI
> takes as input the score, the DF (which I can derive from the effective
> base size, I believe, something like (FB-2); or does it involve both
> effective base sizes?) and the 'non-centrality' parameter. Am I correct
> in guessing that the NC parameter is equivalent to the 'benchmark' score
> that I'm comparing it to, or is that not relevant? Also, if I do it that
> way, it sounds like the effective base size for the overall group does
> not matter - that feels wrong to me given the formula I use, but perhaps
> it doesn't actually matter?
>
> I've also looked at the PROC FREQ options for chi square tests, but
> those seem to be roughly the same, and require nonsummarized data to
> compare to, which I'd prefer not to do (summarizing this takes hours,
> and there are a lot of levels, which PROC FREQ doesn't deal well with,
> as opposed to using CLASS)...
>
> I guess my ultimate question is, is it best to just use the directly
> written formula still, or is there a superior way using a built in
> formula?
>
> Thanks!
>
> -Joe
>
> My data, by the way, roughly looks like this:
>
> level1 , level2 , level3 , score1, effbase1, score2 , effbase2
> ,,,.80,150,.70,100
> ,,A,.70,50,.75,40
> ,,B,.90,50,.77,30
> ,,C,.80,50,.60,30
> ,1,A,.75,20,.80,15
> ... etcetera
> which I then appended the first row (the overall numbers) scores and
> effective base sizes to every row below it, to get the comparison
> values.
>
Joe,
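I will jump into this on the answering side (sort of). :-) I don't yet understand what the levels, score1, score2, effbase1 and effbase2 represent. If you want to show your formula, I can provide further comment on your calculation of chisq.
However, given a chisq value you can use probchi to get a p-value. There is no need to specify the non-centrality parameter; it is an optional third argument, and leaving it out gives the ordinary (central) chi-square distribution, so it is not the benchmark score. You mention 4 cells, so it sounds like you are calculating a chisq for a 2 by 2 table, and therefore the degrees of freedom would be (2-1)*(2-1) = 1. The degrees of freedom come from the dimensions of the table, not from the effective base sizes.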
p = 1 - probchi(your_chi,1);
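As a minimal sketch of how the pieces fit together, the data step below builds a 2 by 2 table from two proportions and their effective base sizes, computes the usual Pearson chi-square (without continuity correction), and converts it to a p-value with probchi. The variable names and input values are made up for illustration, and the formula is the textbook 2 by 2 shortcut, not necessarily the one in Joe's macro; it also assumes the two groups are independent samples.

```sas
data _null_;
  /* Hypothetical inputs: a proportion and effective base size per group */
  p1 = 0.70;  n1 = 50;     /* reporting level (assumed meaning) */
  p2 = 0.80;  n2 = 150;    /* comparison group (assumed meaning) */

  /* Cell counts of the 2 by 2 table: "yes" and "no" within each group */
  a = p1 * n1;   b = (1 - p1) * n1;
  c = p2 * n2;   d = (1 - p2) * n2;
  n = n1 + n2;

  /* Pearson chi-square for a 2 by 2 table, no continuity correction */
  chisq = n * (a*d - b*c)**2 / ((a + b) * (c + d) * (a + c) * (b + d));

  /* Upper-tail p-value with 1 degree of freedom; no non-centrality argument */
  p = 1 - probchi(chisq, 1);

  put chisq= p=;
run;
```

Note that both effective base sizes enter the calculation through the cell counts, while the degrees of freedom stay at 1 regardless of the base sizes.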
Hope this is helpful,
Dan
Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204