| Date: | Fri, 25 Sep 2009 22:38:27 -0400 |
| Reply-To: | Wensui Liu <liuwensui@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Wensui Liu <liuwensui@GMAIL.COM> |
| Subject: | Re: Variable reduction method |
| In-Reply-To: | <17761522.1253912011603.JavaMail.root@mswamui-chipeau.atl.sa.earthlink.net> |
| Content-Type: | text/plain; charset=ISO-8859-1 |
Peter,
Thanks for sharing your paper again.
The paper in the original post is about variable selection in a
multinomial logit model. Even I know that the GLMSELECT procedure is
not applicable to such a model, since it fits general linear models. An
expert like you would surely realize that without reading far into the paper.
On Fri, Sep 25, 2009 at 4:53 PM, Peter Flom
<peterflomconsulting@mindspring.com> wrote:
> I will try to look at this in detail, maybe this weekend, but a much simpler method is available: PROC GLMSELECT offers both
> LASSO and LAR.
>
> See the paper David Cassell and I wrote together -
> Stopping Stepwise - Why Stepwise Variable Selection Methods are Bad, and What you Should Use
>
> it's at various places - one SGF, two NESUGs, a NYASUG and either a WUSS or a PNWSUG.
>
> Here is the one from NESUG 2007
>
> www.nesug.org/Proceedings/nesug07/sa/sa07.pdf
>
>
> HTH
>
> Peter
>
> -----Original Message-----
>>From: liudapaul <liudapaul@GMAIL.COM>
>>Sent: Sep 25, 2009 4:30 PM
>>To: SAS-L@LISTSERV.UGA.EDU
>>Subject: Variable reduction method
>>
>>I've been reading a SUGI paper on variable reduction but
>>couldn't figure out one part of the code on page 6. I have pasted the
>>code below, and for anyone interested in the paper, the link is at
>>the end.
>>
>>*--------------------------------------------------------------*
>>| Loop through all Numeric Variables in Data Set |
>>*--------------------------------------------------------------*;
>>%do range=1 %to &numobs %by 200;
>>%let upper=%eval(&range+199);
>>
>>data varlist;
>>set &inputds._keeplist;
>>if &range<=_n_<=&upper ;
>>run;
>>
>>data _null_;
>>set varlist NOBS=numobs end=last;
>>call symput(cats("var",_n_),variable);
>>if last then call symput("numobs",numobs);
>>run;
>>
>>proc corr data=&inputds. OUTP=%cmpres(corr_&range._&upper) NOPRINT;
>>var %do j=1 %to &numobs; &&var&j %end; &dumlist;
>>%if %bquote(&weight) ne %then %str(weight &weight;);
>>RUN;
>>
>>%put Output Correlation Data Set Will be Called: %cmpres(corr_&range._&upper);
>>
>>ods listing close;
>>data _null_;
>>set varlist end=last;
>>call execute("ods output TestANOVA=reg_temp(KEEP=MODEL SOURCE DF MS FVALUE PROBF WHERE=(Source='Numerator'));");
>>call execute("proc REG data=%cmpres(corr_&range._&upper) ;");
>>call execute(variable||": model "||variable||"=&dumlist ;");
>>call execute("test &dumlist_comma;");
>>call execute("RUN;");
>>call execute("quit;");
>>call execute("ods output close;");
>>
>>IF EXIST("WORK.REG_TEMP") THEN DO;
>>call execute("Proc APPEND base=%cmpres(work.reg_total_&out_suffix) data=reg_temp(drop=source) force;");
>>call execute("run;");
>>END;
>>RUN;
>>%end;
>>
>>My question is: what is the idea behind using a correlation matrix as
>>the input to a regression?
>>
>>Link to the paper:
>>http://www2.sas.com/proceedings/forum2007/081-2007.pdf
>
>
> Peter L. Flom, PhD
> Statistical Consultant
> Website: www DOT peterflomconsulting DOT com
> Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
> Twitter: @peterflom
>
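On the question in the original post: one plausible reason for feeding a
correlation matrix to PROC REG is efficiency. PROC CORR with OUTP= writes a
TYPE=CORR data set holding the means, standard deviations, N, and
correlations, and PROC REG can fit a model from those summary statistics
alone. In the macro, the raw data would then be read once per batch of 200
variables rather than once per regression. A minimal sketch, assuming the
SASHELP.CLASS sample data set is available; the data set name class_corr is
made up for illustration and does not come from the paper:

```sas
/* Step 1: summarize the raw data once.                              */
/* OUTP= creates a TYPE=CORR data set with MEAN, STD, N, CORR rows.  */
proc corr data=sashelp.class outp=work.class_corr noprint;
   var weight height age;
run;

/* Step 2: fit the regression from the summary statistics only.      */
/* PROC REG recognizes the TYPE=CORR input; no raw rows are read.    */
proc reg data=work.class_corr;
   model weight = height age;
run;
quit;

/* For comparison: the same model fitted on the raw data.            */
proc reg data=sashelp.class;
   model weight = height age;
run;
quit;
```

Both PROC REG steps should report the same parameter estimates. The
trade-off is that residuals and other observation-level output are not
available when the input is a correlation matrix.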
--
==============================
WenSui Liu
Blog : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do. - Robert Schuller
==============================