Date: Mon, 7 Oct 2002 12:15:02 -0700
Reply-To: Pat Flickner <p_flick@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Pat Flickner <p_flick@YAHOO.COM>
Organization: http://groups.google.com/
Subject: Re: t-test
Content-Type: text/plain; charset=ISO-8859-1
I wish to make a correction to your statements. You have created
macros by placing a %macro and a %mend statement before and after some
SAS code. However, these are not truly macros, since they contain no
macro statements, only macro variables. You can use macro variables
without creating a macro. Wrapping plain code in a macro is a common,
and costly, mistake made by many programmers.
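As a minimal sketch of what I mean (the values are made up for
illustration), macro variables work in open code with no %macro or
%mend around them:
   %let s1 = 2.5;
   %let n1 = 20;
   data _null_;
      /* &s1 and &n1 resolve before the DATA step compiles */
      df = &n1 - 1;
      variance = &s1 * &s1;
      put df= variance=;
   run;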
I say costly because, depending upon the platform, once you pass
100,000 records the CPU time rises dramatically compared with a
%include of the same code. On a PC the difference is almost
negligible, but on Unix and the mainframe it is sizeable.
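A rough sketch of the include approach, assuming the plain DATA step
has been saved in its own file (the file name below is hypothetical)
and refers to &s1, &n1, &s2, &n2 and &alpha:
   %let s1 = 2.5;
   %let n1 = 20;
   %let s2 = 3.1;
   %let n2 = 25;
   %let alpha = 0.05;
   /* hypothetical file holding the DATA step and PROC PRINT */
   %include 'ttest_variance.sas';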
If you wish to make the macros truly macros, you can do the following:
%macro ttest_variance(s1,n1,s2,n2,alpha);
/* Test the null hypothesis that the variances are equal in 2 groups */
data new2(keep=F p H_0);
/*=================================================*/
/* Using %if causes only the code that meets the   */
/* condition to be generated. You can see this     */
/* by adding "options mprint;" before executing    */
/* the macro.                                      */
/*=================================================*/
/* %sysevalf compares non-integer values numerically; a bare %if would not */
%if %sysevalf(&s1 > &s2) %then %do;
large = &s1;
small = &s2;
df_large = &n1 - 1;
df_small = &n2 - 1;
%end;
%else %do;
large = &s2;
small = &s1;
df_large = &n2 -1;
df_small = &n1 - 1;
%end;
F = large*large/(small*small);
p = 2 * (1 - probf(F,df_large,df_small));
if p < &alpha then do;
call symput ('H0','0');
H_0 = 'Variances are not equal'; /* reject null hypothesis */
end;
else do;
H_0 = 'Variances are equal'; /* do not reject null hypothesis */
call symput ('H0','1');
end;
call symput('F', put(F,best.));
call symput('p', put(p,best.));
run;
title "Variance T-test Results";
proc print data=new2;
run;
%mend;
You will note two things that I have done: 1) I have combined two
DATA steps into one, which is faster and more efficient; and 2) in the
call symput statements, I placed the 0 and 1 in quotes. If you do not
do this, the results are as follows:
1 data _null_;
2 call symput ('HO',1);
3 call symput ('HP','1');
4 run;
NOTE: Numeric values have been converted to character
values at the places given by: (Line):(Column).
2:21
NOTE: DATA statement used:
real time 0.00 seconds
cpu time 0.00 seconds
5
6 %put HO=&HO;
HO=           1
7 %put HP=&HP;
HP=1
As you can see, when the numeric is not entered in quotes, HO does not
equal "1". The number is right-aligned in a 12-character field, so HO
equals "1" preceded by blanks, and any comparison that is sensitive to
leading blanks, such as %if "&HO"="1", will be false.
If, on the other hand, you say "%let HO = 1;" it will equal 1, but
given the way you have your code set up, it wouldn't be a good idea to
use that here.
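If the value has to come from a numeric variable rather than a quoted
literal, one way to keep the leading blanks out is to convert and trim
it yourself. This is only a sketch; the variable name flag is made up:
   data _null_;
      flag = 1;
      /* put() right-aligns the number, left() moves the blanks */
      /* to the end, and trim() drops them                      */
      call symput('HO', trim(left(put(flag, best.))));
   run;
   /* the brackets make any stray blanks visible in the log */
   %put HO=[&HO];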
I made the first change because inexperienced SAS programmers tend
to write one DATA step for each thing they want to do. This is not a
bad thing; it's just inefficient, especially when you start dealing
with large amounts of data, as I do regularly.
Take care, and hope this helps.
Kind regards,
Pat Flickner
kaznish@yahoo.com (Kazume Nishiyama) wrote in message news:<98c10ec7.0210061825.46d448c9@posting.google.com>...
> Hi, Everyone:
>
> It's such an interesting problem that I wrote macros for it, following
> John Whittington's suggestion.
>
> There are 4 macros:
> ttest_variance = test for equal variance
> ttest_equalv = test for equal mean given equal variance
> ttest_unequalv = test for equal mean given unequal variance.
> ttest = complete test. it calls the 3 macros above.
>
> I wonder if anyone would like to verify them and/or give me
> suggestions to improve the codes.
>
> Thanks
> kazume
>
> **** TTEST MACROS **************
> %macro ttest_variance(s1,n1,s2,n2,alpha);
> /* Test the null hypothesis that the variances are equal in 2 groups
> */
> data new;
> if &s1 > &s2 then do
> large = &s1;
> small = &s2;
> df_large = &n1 - 1;
> df_small = &n2 - 1;
> end;
> else do;
> large = &s2;
> small = &s1;
> df_large = &n2 -1;
> df_small = &n1 - 1;
> end;
> F = large*large/(small*small);
> p = 2 * (1 - probf(F,df_large,df_small));
> if p < &alpha then
> call symput('H0',0);
> /* reject null hypothesis */
> else
> call symput('H0',1);
> /* Do not reject null hypothesis */
> run;
> data new2(keep=F p H_0); set new;
> call symput('F', put(F,best.));
> call symput('p', put(p,best.));
> if &H0 = 1 then H_0 = 'Variances are equal';
> else H_0 = 'Variances are not equal';
> run;
> proc datasets nolist;
> delete new;
> run;quit;
> title "Variance T-test Resules";
> proc print data=new2;run;
> %mend;
>
>
> %macro ttest_equalv(m1, s1, n1, m2, s2, n2);
> /* T-Test to test the hypothesis that the means of 2 groups are the
> same
> when the variances are equal */
> data new;
> ssqr = ((&n1 - 1)*&s1*&s1 + (&n2 - 1)*&s2*&s2)/(&n1+&n2-2);
> t = (&m1 - &m2)/sqrt(ssqr*(1/&n1+1/&n2));
> df = &n1 + &n2 -2 ;
> p = 2 * (1 - probt(t, df)) ; /* omit 2* for one-sided */
> run;
> data _null_;set new;
> call symput('df_v',put(df,best.));
> call symput('p_v',put(p,best.));
> call symput('t_v',put(t,best.));
> run;
> proc datasets nolist;
> delete new;
> run; quit;
> %put df=&df_v t_val = &t_v p_val=&p_v;
> %mend;
> %macro ttest_unequalv(m1,s1,n1,m2,s2,n2);
> /* T-Test to test the hypothesis that the means of the 2 groups are
> the same
> when the variances are unequal */
> data new;
> t = (&m1 - &m2) / sqrt(&s1*&s1/&n1 + &s2*&s2/&n2);
> numerator = (&s1*&s1/&n1)+(&s2*&s2/&n2);
> denominator = (&s1*&s1/&n1)*(&s1*&s1/&n1)/(&n1-1)
> + (&s2*&s2/&n2)*(&s2*&s2/&n2)/(&n2 -1);
> df = numerator*numerator / denominator;
> p = 2*(1 - probt(t,df));
> run;
> /* proc print data=new;run; */
> data _null_; set new;
> call symput('df_v',put(df,best.));
> call symput('p_v',put(p,best.));
> call symput('t_v',put(t,best.));
> run;
> proc datasets nolist;
> delete new;
> run;
> %put df=&df_v t_val = &t_v p_val=&p_v;
>
> %mend;
> %macro ttest(m1,s1,n1,m2,s2,n2,alpha);
> %global df_v t_v p_v;
> %ttest_variance(&s1,&n1,&s2,&n2,&alpha);
> %if %eval(&H0) %then %do;
> %ttest_equalv(&m1,&s1,&n1,&m2,&s2,&n2);
> %put H0 = &H0 : H0(Variances are the same) is not rejected.;
> data ttestout;
> length method $15.;
> method = 'Pooled';
> Variance = 'Equal';
> df = input("&df_v",best.);
> t_val = input("&t_v",best.);
> p_val = input("&p_v",best.);
> run;
> %end;
> %else %do;
> %ttest_unequalv(&m1,&s1,&n1,&m2,&s2,&n2);
> %put H0 = &H0 : H0(Variances are the same) is rejected.;
> data ttestout;
> length method $15.;
> method = 'Satterthwaite';
> Variance = 'Unequal';
> df = input("&df_v",best.);
> t_val = input("&t_v",best.);
> p_val = input("&p_v",best.);
> run;
> %end;
> title "Mean T-Test Results";
> proc print data=ttestout;run;
> %mend;
>
>
>
> "Arthur Tabachneck" <atabachneck@rogers.com> wrote in message news:<18Nn9.191145$8b1.108356@news01.bloor.is.net.cable.rogers.com>...
> > Ahmed,
> >
> > Other than the initial data statement (which doesn't appear to serve any
> > purpose), the model looks okay. In fact, I tried it with some test data and
> > my output included both t-tests and p values.
> >
> > Did you read the log. If you have an incomplete model, or SAS doesn't
> > complete processing for any reason (e.g., insufficient data), it may not
> > print inappropriate t-tests and p values.
> >
> > Art