LISTSERV at the University of Georgia
Date:         Tue, 4 Aug 2009 17:19:15 -0400
Reply-To:     Michael Bryce Herrington <mherrin@G.CLEMSON.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Michael Bryce Herrington <mherrin@G.CLEMSON.EDU>
Subject:      Re: knotted regression
Comments: To: Robin R High <rhigh@unmc.edu>
In-Reply-To:  <OF0CD55A00.5733F0A5-ON86257608.0071B656-86257608.0071F594@unmc.edu>
Content-Type: text/plain; charset=ISO-8859-1

WOW, thanks, that does exactly what I want. Now I just have to figure out HOW it is doing it. :) Thank you!

On Tue, Aug 4, 2009 at 4:44 PM, Robin R High <rhigh@unmc.edu> wrote:

> Bryce,
>
> Ignoring the variable 'freq', here is a method to connect line segments at
> a specified number of knots (see Chapter 3 of "Semiparametric Regression" by
> Ruppert, Wand, and Carroll) for two variables of interest. The process here
> produces a special x matrix which then computes predicted values that
> intersect at (or close to) the knots.
>
> %LET y=relerror;             * vertical axis variable;
> %LET x=actual;               * horizontal axis;
> %LET knts = .07 .15 .26 .4;  * knot locations;
> %LET nnts = 4;               * number of knots;
>
> DATA indat; SET bias1st; DROP jj ;
>   ARRAY rng{%EVAL(&nnts.+1)} x1-x%EVAL(&nnts.+1) ;
>   ARRAY knt{&nnts.} _temporary_ (&knts.);
>   x1=&x.;
>   DO jj = 2 to %EVAL(&nnts.+1);
>     rng{jj} =(&x. -knt{jj-1})*(&x. ge knt{jj-1});
>   END;
> RUN;
>
> PROC PRINT DATA=indat; options ps=199 ls=132; run;
>
> In essence, you want to produce an x matrix like the one below, where each
> successive column of x2 .. x5 takes on (x1-knot_k) when x1 > knot_k
>
> < many rows deleted >
>
> Obs   actual      error       prd     x1     x2     x3     x4     x5
>
>   1  0.00189   0.001587   0.63280  0.002      0      0      0      0
>  13  0.05972   0.002729   0.04837  0.060      0      0      0      0
>  18  0.08729   0 204     -0.04366  0.087  0.017      0      0      0   knot_1 = .07
>  23  0.11586  -0.003377  -0.02399  0.116  0.046      0      0      0
>  26  0.12186   0.005627  -0.01986  0.122  0.052      0      0      0
>  35  0.17712  -0.004613  -0.00462  0.177  0.107  0.027      0      0   knot_2 = .15
>  38  0.20391  -0.016420  -0.00872  0.204  0.134  0.054      0      0
>  52  0.24615   0.011380  -0.01519  0.246  0.176  0.096      0      0
>  60  0.28883   0.008576  -0.01995  0.289  0.219  0.139  0.029      0   knot_3 = .26
>  76  0.38284  -0.005295  -0.02856  0.383  0.313  0.233  0.123      0
>  79  0.39113   0.001288  -0.02932  0.391  0.321  0.241  0.131      0
>  80  0.42041  -0.023001  -0.03671  0.420  0.350  0.270  0.160  0.020   knot_4 = .4
>  82  0.45109  -0.014046  -0.04660  0.451  0.381  0.301  0.191  0.051
>  87  0.73196  -0.090499  -0.13712  0.732  0.662  0.582  0.472  0.332
>
> then fit a model:
>
>                       K+1
> f(x) = b0 + b1*x1 +   SUM  b_k*(x1 - knot_(k-1))+
>                       k=2
>
> where K = number of knots and ( )+ means the value is set to 0 when it is
> negative.
>
> If you enter 5 knots, then the DATA step produces 6 x columns.
>
> * compute predicted values;
>
> PROC REG data=indat;
>   MODEL &y. = x1-x%EVAL(&nnts.+1) ;
>   OUTPUT out=rmns pred=prd;
> run; quit;
>
> PROC PRINT DATA=rmns(where=(ranuni(929)> .5));
>   VAR &x. &y. prd x1-x%eval(&nnts.+1) ;
>   format x: prd 5.3;
> run;
>
> goptions reset=all;
>
> symbol1 v=dot i=none color=blue h=1;
> symbol2 v=none i=join color=black line=1 w=2;
>
> proc gplot data=rmns ;
>   plot &y.*&x.=1 prd*&x.=2 / noframe overlay haxis = 0 to 1 by .1
>        hm=1 href=(&knts.) lhref=33;
> run ; quit;
>
> Robin High
> UNMC
>
>
> Michael Bryce Herrington <mherrin@G.CLEMSON.EDU>
> Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
> 08/04/2009 02:25 PM
> Please respond to Michael Bryce Herrington <mherrin@G.CLEMSON.EDU>
>
> To: SAS-L@LISTSERV.UGA.EDU
> cc:
> Subject: knotted regression
>
> Hey,
>
> I need some help correcting bias in a model I am working on. The data set
> below represents my estimates and the actuals. The "estimate" value is
> found by averaging the estimated percentages for all observations in a
> small interval; the "actual" value is the percentage of these observations
> for which the event actually occurs. You can see from some of the quick
> plots I have included that we have some bias for the very small
> percentages and the large percentages. I would like to try to use a knotted
> linear regression to correct this. I do not know how to do this in SAS. I
> would play around with knot locations but would like them around: .07,
> .17, .26, .4. The only requirements I have are that the regression
> equation be monotonic and continuous.
>
> "freq" is the number of observations in each interval.
> "error" is the difference between "estimate" and "actual."
> "relerror" is the relative error between estimate and actual.
>
> Thanks for any help you can provide.
>
> data bias1st;
> input estimate actual freq error relerror;
> datalines;
> 0.003473099 0.001885903 2121 0.001587196 0.841610668
> 0.007699692 0.004836028 6617 0.002863664 0.592151961
> 0.012613618 0.007609715 9593 0.005003903 0.657567667
> 0.017529609 0.011300992 10884 0.006228617 0.551156645
> 0.022516198 0.014978602 11216 0.007537596 0.503224253
> ....
> 0.486851702 0.519018405 815 -0.032166703 -0.061976036
> 0.521874933 0.579868709 914 -0.057993776 -0.100011908
> 0.571919876 0.650574713 435 -0.078654837 -0.120900545
> 0.64145977 0.731958763 194 -0.090498993 -0.12363947
> ;
> run;
>
> symbol1 v=plus i=none c=blue;
> symbol2 v=none i=j c=r;
>
> proc gplot data=bias1st;
> plot estimate*actual actual*actual/overlay;
> run;
> plot error*actual;
> run;
> plot relerror*actual;
> run;
>
> --
> Bryce Herrington
> Clemson University
> mherrin@g.clemson.edu
> (863) 258-4758
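
On the "how" question, the algebra behind the quoted program can be sketched as follows. This is one reading of the DATA step and the PROC REG call, not something stated in the thread; the symbols K (number of knots), knot_k, and b_k follow the quoted message, and the segment index j is an assumption introduced here.

```latex
\begin{align*}
% Columns built by the DATA step, for knots knot_1 < ... < knot_K
x_1 &= x, \qquad
x_k = (x - \mathrm{knot}_{k-1})_+ = \max\bigl(0,\; x - \mathrm{knot}_{k-1}\bigr),
      \quad k = 2, \dots, K+1 \\
% Model fitted by PROC REG
f(x) &= b_0 + \sum_{k=1}^{K+1} b_k \, x_k \\
% Slope on segment j, i.e. for knot_{j-1} <= x < knot_j
f'(x) &= b_1 + b_2 + \dots + b_j, \quad j = 1, \dots, K+1
\end{align*}
```

Each column x_k is continuous in x and equals 0 at its own knot, so the fitted line is continuous no matter what coefficients the regression returns, and b_k (for k >= 2) is just the change in slope at knot_(k-1). Monotonicity is a different matter: ordinary least squares does not enforce it, so the fit is monotonic only if every cumulative sum b_1 + ... + b_j happens to have the same sign.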

--
Bryce Herrington
Clemson University
mherrin@g.clemson.edu
(863) 258-4758
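
A minimal sketch for checking the monotonic requirement after the fit, assuming the quoted program has already been run so that the data set `indat`, its columns `x1`-`x5`, and the macro variables `&y.` and `&nnts.` exist. The data set names `est` and `slopes` are made up for this illustration. The idea is that the slope of the fitted line on each segment is the running sum of the coefficients on `x1`, `x2`, and so on, so the fit is monotonic only if all of those running sums share a sign.

```sas
/* Refit the quoted model, saving the coefficients instead of printing. */
/* OUTEST= stores each estimate in a variable named after its regressor. */
PROC REG DATA=indat OUTEST=est NOPRINT;
  MODEL &y. = x1-x%EVAL(&nnts.+1);
RUN; QUIT;

/* Slope on segment j is b1 + b2 + ... + bj, so accumulate the estimates. */
DATA slopes;
  SET est;
  ARRAY b{*} x1-x%EVAL(&nnts.+1);
  slope = 0;
  DO segment = 1 TO DIM(b);
    slope = slope + b{segment};
    OUTPUT;                      /* one row per line segment */
  END;
  KEEP segment slope;
RUN;

PROC PRINT DATA=slopes NOOBS; RUN;
```

If the printed slopes change sign, the unconstrained fit is not monotonic for the chosen knots; moving the knots or fitting with constraints would be the next thing to try, which this sketch does not attempt.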

