| Date: | Tue, 4 Aug 2009 15:24:39 -0400 |
| Reply-To: | Michael Bryce Herrington <mherrin@G.CLEMSON.EDU> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Michael Bryce Herrington <mherrin@G.CLEMSON.EDU> |
| Subject: | knotted regression |
| Content-Type: | text/plain; charset=ISO-8859-1 |
Hey,
I need some help correcting bias in a model I am working on. The data set
below represents my estimates and the actuals. The "estimate" value is
found by averaging the estimated percentages for all observations in a small
interval; the "actual" value is the percentage of these observations for which
the event actually occurs. You can see from some of the quick plots I have
included that we have some bias for the very small percentages and large
percentages. I would like to try to use a knotted linear regression to
correct this, but I do not know how to do this in SAS. I expect to experiment
with knot locations but would like them around 0.07, 0.17, 0.26, and 0.4. The
only requirements I have are that the regression equation be monotonic and
continuous.
"freq" is the number of observations in each interval.
"error" is the difference between "estimate" and "actual" (estimate - actual).
"relerror" is the relative error between estimate and actual (error / actual).
Thanks for any help you can provide.
data bias1st;
input estimate actual freq error relerror;
datalines;
0.003473099 0.001885903 2121 0.001587196 0.841610668
0.007699692 0.004836028 6617 0.002863664 0.592151961
0.012613618 0.007609715 9593 0.005003903 0.657567667
0.017529609 0.011300992 10884 0.006228617 0.551156645
0.022516198 0.014978602 11216 0.007537596 0.503224253
0.027490199 0.01824168 11238 0.009248519 0.506999317
0.032494378 0.026367891 11112 0.006126487 0.232346504
0.037478097 0.031193875 10579 0.006284223 0.201456939
0.042442687 0.038597179 10208 0.003845508 0.099631844
0.047477514 0.041547862 9820 0.005929653 0.142718607
0.052488477 0.049382716 9153 0.003105761 0.062891664
0.057477386 0.05993328 8693 -0.002455893 -0.040977122
0.062445126 0.059715753 8373 0.002729373 0.045706082
0.067476529 0.060896638 8030 0.006579891 0.108050159
0.072483682 0.073314939 7611 -0.000831257 -0.011338172
0.077488685 0.071525149 7396 0.005963536 0.083376772
0.082472504 0.082904517 6996 -0.000432013 -0.005210969
0.087492165 0.087288265 6553 0.0002039 0.002335936
0.092462588 0.101198604 6591 -0.008736016 -0.086325462
0.097493616 0.093355209 6095 0.004138407 0.044329684
0.102463295 0.104912573 6005 -0.002449278 -0.023345899
0.107474842 0.106852702 5662 0.00062214 0.005822408
0.112480962 0.115858209 5360 -0.003377247 -0.029149828
0.117490153 0.122850123 5291 -0.00535997 -0.043630158
0.12249581 0.127487685 5075 -0.004991874 -0.039155737
0.127483188 0.121856471 4891 0.005626717 0.046174951
0.132482486 0.134197233 4553 -0.001714746 -0.012777806
0.137455097 0.136908803 4419 0.000546294 0.003990206
0.142506404 0.143632076 4240 -0.001125672 -0.007837187
0.147513757 0.150529687 4059 -0.00301593 -0.020035449
0.152473905 0.14340393 3919 0.009069975 0.063247747
0.157469455 0.164846871 3755 -0.007377416 -0.044753147
0.162451904 0.158624515 3606 0.003827389 0.02412861
0.167452602 0.185663925 3404 -0.018211323 -0.09808757
0.172508639 0.177121771 3252 -0.004613132 -0.026044974
0.177473315 0.187613843 3294 -0.010140529 -0.054050002
0.182450734 0.182781457 3020 -0.000330723 -0.001809389
0.187492855 0.203912467 3016 -0.016419612 -0.080522843
0.192509174 0.193471608 2941 -0.000962434 -0.004974552
0.197483381 0.192614048 2762 0.004869333 0.025280257
0.202506422 0.204271845 2575 -0.001765423 -0.008642517
0.207542271 0.213755373 2559 -0.006213102 -0.029066416
0.212475207 0.209387755 2450 0.003087452 0.01474514
0.217476099 0.213720317 2274 0.003755783 0.017573353
0.222477618 0.219799389 2293 0.002678228 0.012184876
0.227465828 0.22729457 2081 0.000171258 0.000753461
0.232500352 0.235322196 2095 -0.002821843 -0.011991403
0.23747133 0.245235707 1994 -0.007764377 -0.031660875
0.242479857 0.248693835 1914 -0.006213978 -0.024986459
0.247469747 0.25135428 1846 -0.003884532 -0.01545441
0.252450789 0.261408451 1775 -0.008957662 -0.034266917
0.257533381 0.246153846 1690 0.011379535 0.04622936
0.262496508 0.274064991 1631 -0.011568483 -0.042210729
0.267500892 0.2623057 1544 0.005195193 0.019805871
0.272439853 0.302486188 1448 -0.030046335 -0.099331261
0.277444498 0.292439372 1402 -0.014994874 -0.051275154
0.282570342 0.294596595 1351 -0.012026253 -0.040822783
0.28749635 0.288173653 1336 -0.000677303 -0.00235033
0.292506676 0.32304038 1263 -0.030533704 -0.094519776
0.297411313 0.288834952 1236 0.008576361 0.029692948
0.302513699 0.304235091 1157 -0.001721392 -0.005658099
0.307470007 0.336190476 1050 -0.028720469 -0.085429157
0.312516039 0.316420664 1084 -0.003904625 -0.01233998
0.317495583 0.330484331 1053 -0.012988748 -0.039302158
0.322487192 0.351703407 998 -0.029216214 -0.083070604
0.32744287 0.328220859 978 -0.000777989 -0.002370321
0.33250665 0.327922078 924 0.004584572 0.013980675
0.337466867 0.348888889 900 -0.011422022 -0.032738278
0.342527376 0.319201995 802 0.023325381 0.073074045
0.347578264 0.356626506 830 -0.009048242 -0.02537176
0.352457347 0.369093231 783 -0.016635884 -0.04507231
0.357507703 0.358208955 737 -0.000701252 -0.001957663
0.362544663 0.356814701 653 0.005729961 0.016058646
0.367405822 0.35745938 677 0.009946442 0.027825377
0.372553192 0.401976936 607 -0.029423744 -0.073197592
0.377543159 0.382838284 606 -0.005295125 -0.013831231
0.382527939 0.39070568 581 -0.008177741 -0.020930693
0.387463688 0.403441683 523 -0.015977994 -0.039604223
0.39241685 0.391129032 496 0.001287818 0.003292565
0.397407371 0.420408163 490 -0.023000792 -0.054710622
0.41204647 0.426015474 2068 -0.013969004 -0.032789899
0.437048036 0.451093951 1554 -0.014045915 -0.031137449
0.461761062 0.497839239 1157 -0.036078177 -0.072469533
0.486851702 0.519018405 815 -0.032166703 -0.061976036
0.521874933 0.579868709 914 -0.057993776 -0.100011908
0.571919876 0.650574713 435 -0.078654837 -0.120900545
0.64145977 0.731958763 194 -0.090498993 -0.12363947
;
run;
symbol1 v=plus i=none c=blue;
symbol2 v=none i=j c=r;
proc gplot data=bias1st;
plot estimate*actual actual*actual/overlay;
run;
plot error*actual;
run;
plot relerror*actual;
run;
quit;
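
For reference, here is a minimal sketch of the kind of knotted (piecewise linear) fit described above. It is an illustration under stated assumptions, not a tested solution. The data set name bias_knots and the variables k1-k4 are made up for this example, the knots are fixed at the four locations mentioned above, and weighting by "freq" is an assumption based on each row summarizing that many observations. Continuity comes from the hinge terms. Monotonicity is not enforced here and would have to be checked from the fitted slopes.

```sas
/* Sketch only: bias_knots and k1-k4 are illustrative names. */
data bias_knots;
  set bias1st;
  /* Hinge terms: zero below the knot, linear above it,    */
  /* so the fitted line is continuous at every knot.       */
  k1 = max(estimate - 0.07, 0);
  k2 = max(estimate - 0.17, 0);
  k3 = max(estimate - 0.26, 0);
  k4 = max(estimate - 0.40, 0);
run;

proc reg data=bias_knots;
  /* Assumption: weight each interval by its observation count. */
  weight freq;
  model actual = estimate k1 k2 k3 k4;
run;
quit;
```

In this parameterization the slope of the first segment is the coefficient on estimate, and the slope of each later segment is that coefficient plus the coefficients of all hinge terms whose knots lie to its left. The fit is monotonic increasing only if every one of those cumulative sums is non-negative.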
--
Bryce Herrington
Clemson University
mherrin@g.clemson.edu
(863) 258-4758