Date: Mon, 24 Jul 2006 16:29:57 -0400
Reply-To: Wensui Liu <liuwensui@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Wensui Liu <liuwensui@GMAIL.COM>
Subject: Re: proc discrim vs. proc logistic or else
In-Reply-To: <1153770881.452322.111940@h48g2000cwc.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
sophe,
I think there is an option in proc logistic to specify the prior
probability of each class in the model statement, something like pprob or
pevent. I can't remember exactly.
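To make the idea of a prior concrete, here is a minimal sketch in Python (not SAS) of the generic Bayes-rule rescaling of predicted class probabilities from one class mix to another. It is not necessarily what proc logistic does internally. The function name, the balanced training mix and the example probabilities are made up for illustration; only the target priors come from the question below.

```python
def adjust_posteriors(probs, sample_shares, target_priors):
    """Rescale predicted class probabilities from the class mix the model
    was fit on (sample_shares) to a different class mix (target_priors)."""
    # Multiply each probability by the ratio target prior / sample share.
    scaled = {c: probs[c] * target_priors[c] / sample_shares[c] for c in probs}
    # Renormalize so the adjusted probabilities sum to 1.
    total = sum(scaled.values())
    return {c: v / total for c, v in scaled.items()}

# Hypothetical case: model fit on a balanced sample, scored against the
# class shares quoted in the question.
sample_shares = {1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.2}
target_priors = {1: 0.17, 2: 0.12, 3: 0.17, 4: 0.0305, 5: 0.4991}
probs = {1: 0.25, 2: 0.15, 3: 0.20, 4: 0.30, 5: 0.10}  # made-up model output

adjusted = adjust_posteriors(probs, sample_shares, target_priors)
```

Classes with a small target prior are pulled down and classes with a large one are pushed up, which is the effect a prior option is meant to have.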
One way to boost a class with a small probability is to give more weight
to the cases in that class.
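A minimal sketch of one common weighting scheme, inverse-frequency weights, written in Python rather than SAS. The function name is made up for illustration, and this is only one possible choice of weights. The class shares are the ones quoted in the question below.

```python
# Class shares as quoted in the question (they sum to about 0.99).
shares = {1: 0.17, 2: 0.12, 3: 0.17, 4: 0.0305, 5: 0.4991}

def inverse_frequency_weights(shares):
    """Return one weight per class, proportional to 1/share, scaled so
    that every class ends up carrying the same total weight."""
    total = sum(shares.values())
    k = len(shares)
    return {c: total / (k * s) for c, s in shares.items()}

weights = inverse_frequency_weights(shares)

# Each case in the rare class 4 gets a much larger weight than a case in
# the dominant class 5.
for c in sorted(weights):
    print(c, round(weights[c], 3))
```

Weights like these could be stored in a variable on the training data and passed to the procedure through a weight statement. Whether equalizing the classes completely is desirable depends on how the misclassification costs compare.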
If you are looking for a solution outside SAS, take a look at random
forests by Breiman.
On 7/24/06, sophe88@yahoo.com <sophe88@yahoo.com> wrote:
> Hi,
>
> I am building a classification model to separate 5 distinctive values
> 1-5 in depvar. They are nominal.
>
> They are not distributed evenly. 1=0.17, 2=0.12, 3=0.17, 4=0.0305 and
> 5=0.4991.
>
> I tested with proc logistic with different link functions, to no
> avail. The best case is: total correct classification is about
> 50%, but the correct classification rates within the subgroups are
> 19%, 22%, 17%, 3% and 60%. The big problem with proc logistic is that
> every model gives 5 groups of even size.
>
> Now proc discrim. I used pool=yes and the npar method and tweaked the
> radius. I am able to control the subgroup sizes much better than with
> proc logistic. But still the individual correct classification rates are
> not good; especially value=4 is very low (about 3%).
>
> Now I don't know which way to go next:
>
> 1. Is there an option in proc logistic that we can set so that the
> program will match the group sizes to the distribution in the original
> depvar, like the priors statement we use in proc discrim?
>
> 2. How do we maintain the share of a low-frequency category such as 4
> in proc discrim?
>
> The reason I am so keen on boosting the individual correct
> classification rates is that if I cannot get them above 20%, then I
> cannot even beat a random selection.
>
> I am trying Salford's TreeNet now. I am also thinking about cutting
> many CHAID trees.
>
> Any suggestion or clue is greatly appreciated.
>
> PD
>
--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center