LISTSERV at the University of Georgia
Date:         Tue, 16 Dec 2008 16:29:36 -0500
Reply-To:     "Simon, Lorna" <Lorna.Simon@UMASSMED.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Simon, Lorna" <Lorna.Simon@UMASSMED.EDU>
Subject:      Setting up data to do mixed procedure
Content-Type: text/plain; charset=us-ascii

I am trying to create a dataset to use with the MIXED procedure. As far as I understand it, you need to have several lines of data for each person, like this:

Person  gender  age  totalcosts
1       F       21   2000
1       F       21   300
2       M       40   0
3       M       35   100
4       F       25   200
4       F       25   0
4       F       25   2000
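For reference, the layout described above (one row per person per interview) could be typed in as a small test dataset. This is only a minimal sketch: the dataset name long_example is hypothetical, and the values are the illustrative ones from the example, not real study data.

```sas
/* Hypothetical test dataset: one row per person per interview. */
/* Values are copied from the illustrative example in the post. */
data long_example;
  input person gender $ age totalcosts;
  datalines;
1 F 21 2000
1 F 21 300
2 M 40 0
3 M 35 100
4 F 25 200
4 F 25 0
4 F 25 2000
;
run;

/* List the rows to confirm the repeated-rows-per-person layout. */
proc print data=long_example noobs;
run;
```

In this layout the person identifier is what ties repeated rows together; in PROC MIXED it would typically be named in the SUBJECT= option of a RANDOM or REPEATED statement.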

I have one dataset that contains 1 line of data per person for the initial interview, and a second dataset with several lines of data per person for the follow-up interviews.

So, to get the data into the form I need, I first merge the initial dataset with the follow-up dataset to get all of the independent variables onto each line of data in the follow-up dataset. Here is my log:

276  data followup_full;
277  merge hhg.initial_interview2(keep=clientid gender no_health_insurance substance_abuse racew racebl raceas raceai
277!       racenh housing_type time_homeless age)
278        hhg.follow_up_interview2(keep=clientid interview__ er_visits hospitalization_days ambulance_times
278!       McInnis_House_days
279        detox_days shelter_nights incaceration_days);
280  by clientid;
281  run;

NOTE: There were 180 observations read from the data set HHG.INITIAL_INTERVIEW2.
NOTE: There were 1631 observations read from the data set HHG.FOLLOW_UP_INTERVIEW2.
NOTE: The data set WORK.FOLLOWUP_FULL has 1646 observations and 20 variables.
NOTE: DATA statement used (Total process time):
      real time           0.26 seconds
      cpu time            0.03 seconds

Then I set the initial interview data along with the follow-up data created in my 1st datastep, and perform some calculations needed for the mixed procedure. The log follows:

284  data hhg.init_fu_mixed_costs;
285  set followup_full
286      hhg.initial_interview2 (rename=(er_visits1=er_visits hospitalization_days1=hospitalization_days
287      ambulance_times1=ambulance_times McInnis_House_days1=McInnis_House_days
288      detox_center_days1=detox_days er_shelter_nights1=shelter_days
289      incaceration_nights1=incaceration_days));
290  ercost=640;
291  hospcost=1895;
292  ambulancecost=230;
293  respitecost=400;
294  detoxcost=198;
295  sheltercost=32;
296  jailcost=118;
297
298  array numbers {*} er_visits hospitalization_days ambulance_times shelter_days incaceration_days detox_days
298! McInnis_house_days;
299  array costs {*} ercost hospcost ambulancecost sheltercost jailcost detoxcost respitecost;
300  array totcosts {*} tercost thospcost tambulancecost tsheltercost tjailcost tdetoxcost trespitecost;
301  do i=1 to dim(numbers);
302  totcosts{i}=numbers{i}*costs{i};
303  end;
304
305  totalcost=sum(tercost, thospcost, tambulancecost, tsheltercost, tjailcost, tdetoxcost, trespitecost);
306
307  if raceai=1 or raceas=1 or racebl=1 or racenh=1 then white=0;
308  else if raceai=0 and raceas=0 and racebl=0 and racenh=0 and racew=1 then white=1;
309  else white=.;
310
311  if gender="Male" then male=1;
312  else if gender="Female" then male=0;
313  else gender=" ";
314
315  if housing_type=1 then scattered_housing=1;
316  else if housing_type=. then scattered_housing=.;
317  else scattered_housing=0;
318

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1832 at 302:27   29 at 305:11
NOTE: There were 1646 observations read from the data set WORK.FOLLOWUP_FULL.
NOTE: There were 180 observations read from the data set HHG.INITIAL_INTERVIEW2.
NOTE: The data set HHG.INIT_FU_MIXED_COSTS has 1826 observations and 97 variables.
NOTE: DATA statement used (Total process time):
      real time           0.53 seconds
      cpu time            0.03 seconds

I get the resulting dataset:

Obs   clientid  Interview__  totalcost  male  insurance  abuse  white  housing  homeless  age
1765  B006      296          30804      1     0          0      1      1        312       48
1766  B006      2036         18990      1     0          0      1      1        312       48
1767  B006      2402         15615      1     0          0      1      1        312       48
1768  B007      297          17500      1     0          0      1      0        60        59
1769  B007      2426         1600       1     0          0      1      0        60        59
1770  B008      .            .          0     0          1      1      0        72        41
1771  B008      298          5274       0     0          1      1      0        72        41
1772  B009      .            .          0     0          1      1      0        180       47
1773  B009      304          22227      0     0          1      1      0        180       47
1774  B010      311          12740      1     0          1      1      0        84        43
1775  B010      2227         12000      1     0          1      1      0        84        43
1776  B012      318          14000      1     0          0      1      1        108       55
1777  B012      2440         7596       1     0          0      1      1        108       55
1778  B013      319          15988      1     0          1      1      1        228       50
1779  B013      2403         0          1     0          1      1      1        228       50
1780  B014      .            .          1     0          0      1      1        360       59
1781  B014      339          74050      1     0          0      1      1        360       59
1782  B016      .            .          1     0          1      1      1        144       59
1783  B016      340          39712      1     0          1      1      1        144       59
1784  B017      .            .          1     0          1      1      1        168       53
1785  B017      341          69660      1     0          1      1      1        168       53
1786  BH00      257          1088       1     0          0      1      1        24        23
1787  BH00      1689         0          1     0          0      1      1        24        23
1788  BH00      1807         0          1     0          0      1      1        24        23
1789  BH00      2023         0          1     0          0      1      1        24        23
1790  BH00      2104         0          1     0          0      1      1        24        23
1791  BH00      2154         0          1     0          0      1      1        24        23
1792  BH00      2384         0          1     0          0      1      1        24        23
1793  BH00      2482         0          1     0          0      1      1        24        23
1794  BH03      259          0          1     0          1      1      1        312       46
1795  BH03      1809         0          1     0          1      1      1        312       46
1796  BH03      1810         0          1     0          1      1      1        312       46
1797  BH03      1811         0          1     0          1      1      1        312       46
1798  BH03      2025         0          1     0          1      1      1        312       46
1799  BH03      2026         0          1     0          1      1      1        312       46
1800  BH03      2107         2026       1     0          1      1      1        312       46

The variable interview__ (interview number) is not in the dataset for the initial interview, so the interview number for the first observation for each person should be missing. As you can see, for some people it is missing, for others it is not.
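One way to see which input each row actually comes from is to tag the rows with the IN= data set option on the SET statement. The following is only a diagnostic sketch, not a fix: the dataset name check_source, the flags in_fu and in_init, and the variable source are hypothetical names introduced here, and it assumes, as stated above, that interview__ is not present in hhg.initial_interview2.

```sas
/* Diagnostic sketch: tag every row with the dataset it was read from. */
/* check_source, in_fu, in_init and source are hypothetical names.     */
data check_source;
  set followup_full (in=in_fu)
      hhg.initial_interview2 (in=in_init keep=clientid);
  length source $ 9;
  if in_fu then source = "followup";
  else if in_init then source = "initial";
run;

/* How many rows came from each input? */
proc freq data=check_source;
  tables source / missing;
run;

/* Which rows have no interview number, and where did they come from? */
proc print data=check_source;
  where missing(interview__);
  var clientid interview__ source;
run;
```

Comparing the clientid values that show a missing interview__ with those that do not should indicate whether the difference arises in the merge or in the later SET step.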

I hope this is clear. Can anyone figure out what I'm doing wrong? Any help would be appreciated.

