LISTSERV at the University of Georgia
Date:         Tue, 4 Sep 2007 14:43:42 -0000
Reply-To:     sassql@GMAIL.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         sassql@GMAIL.COM
Organization: http://groups.google.com
Subject:      Re: LOCF question still not resolved
Comments: To: sas-l@uga.edu
In-Reply-To:  <2fc7f3340709030959w7778c0ccu41e2dabd2e89c5d1@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"

On Sep 3, 12:59 pm, muthia.kachira...@GMAIL.COM (Muthia Kachirayan) wrote:
> On 9/1/07, sas...@gmail.com <sas...@gmail.com> wrote:
>
> > Dear all,
> >
> > Sorry to bother you guys again, but I still have the LOCF issue
> > unresolved. Let me describe the situation again. I really appreciate
> > your help and time. Thanks again.
> >
> > Hi,
> >
> > I have a data set with 100+ variables in it, and I need to make sure
> > that there are at least 6 visits per patient. If a patient is missing
> > any visit, I need to create a record for the missing visit and carry
> > over the data from the previous visit for all the variables except a
> > few. I have two variables in the data set named flag and ontreatment.
> > The data should be carried over from the previous visit only where
> > flag = 1 and ontreatment = 1.
> >
> > Example:
> >
> > Patient visit FLAG ontreatment ecogscore OC_LOCF tumormeasurements investi
> > 101 -1 3 OC 20 ABC
> > 101 0 2 OC 20 ABC
> > 101 1 1 1 4 OC 30 ABC
> > 101 1 1 3 OC 30 ABC
> > 101 2 1 1 4 LOCF . ABC
> > 101 3 1 1 2 OC 34 ABC
> > 101 4 1 1 2 LOCF . ABC
> > 101 5 1 1 2 LOCF . ABC
> > 101 6 1 1 2 LOCF . ABC
> >
> > There are both character and numeric variables in the data set whose
> > values need to be carried over. In the above example, patient 101 is
> > missing visits 2, 4, 5 and 6. So I have to carry over the data from
> > visit 1 to visit 2, where flag = 1 and ontreatment = 1, for all the
> > variables except tumormeasurements. For visits 4, 5 and 6, I carry
> > over the data from visit 3, where flag = 1 and ontreatment = 1, for
> > all variables except the tumor measurements. That is the reason the
> > tumor measurement values are missing for the LOCF records.
> >
> > I would really appreciate it if you could let me know how I can
> > implement the above LOCF. Just to remind you again, there are more
> > than 100 variables in the data set and they are both character and
> > numeric.
> >
> > Thanks in advance.
>
> This is a special LOCF problem, a CONDITIONAL LOCF. Unlike regular LOCF,
> the LAG() function cannot be used. We can use arrays to store and
> retrieve rows when a given CONDITION is met, but with 100+ variables
> that may require more care.
>
> The combination of an ARRAY and the POINT= option is elegant and
> straightforward for this problem.
>
> The code is given below. It is not tested on a large data set.
>
> The array k[] saves, for each PATIENT (pt), the Record Number when a
> VISIT (vis) is present and a missing value when it is not. With this
> information, the POINT= option is used to directly access the ROW in
> the data set. To meet your last requirement, recPtr is used to remember
> the ROW that meets the condition FLAG = 1 and ONTREATMENT (ot) = 1.
>
> The mixture of numeric and character variables can be handled by
> passing numVars and strVars as separate macro variables. The number of
> visits can be varied by passing lastVisit as a macro variable.
>
> The LENGTH statement is used to fix the order of the variables in the
> PDV, so that they are kept in that order for OUTPUT. The 100+ variables
> can be listed in this statement or passed as a macro variable. The
> variables that do not take part in LOCF (the unused ones) can be kept
> in a separate list and passed to the program. The miss2Record code
> needs to be adjusted for these unused variables.
>
> %let numVars = vis flag ot tum eco;
> %let strVars = oc_locf inv;
> %let lastVisit = 6;
> data a;
>    ** impact the PDV;
>    length pt vis flag ot 8 oc_locf $8 tum 8 inv $8 eco 8;
>    array numv[*] &numVars;
>    array strv[*] $ &strVars;
>    array k[-1:&lastVisit] _temporary_;
>    ** Fill k[] with Record ID when VIS is present;
>    do _n_ = 1 by 1 until(last.pt);
>       set given end = eof;
>       by pt;
>       k[vis] = _n_;
>    end;
>
>    do m = -1 to &lastVisit;
>       if k[m] then do;
>          p = nRecs + k[m];
>          link ptr2Record;
>          output;
>          if (flag > 0) * (ot > 0) then recPtr = p;
>       end;
>       else if k[m] = . then do;
>          if recPtr = . then do;
>             link miss2Record;
>             vis = m;
>             oc_locf = ' ';
>             output;
>          end;
>          else do;
>             p = recPtr;
>             link ptr2Record;
>             vis = m;
>             oc_locf = 'LOCF';
>             output;
>          end;
>       end;
>    end;
>    nRecs ++ _n_;
>    link initArray;
> return;
>
> ptr2Record:
>    set given point = p;
> return;
>
> miss2Record:
>    do ii = 1 to dim(numv);
>       numv[ii] = .;
>    end;
>    do ii = 1 to dim(strv);
>       strv[ii] = ' ';
>    end;
> return;
>
> initArray:
>    do ii = 1 to dim(numv);
>       numv[ii] = .;
>    end;
>    do ii = 1 to dim(strv);
>       strv[ii] = ' ';
>    end;
>    do ii = -1 to &lastVisit;
>       k[ii] = .;
>    end;
> return;
>
> if eof then stop;
> drop m nRecs ii recPtr;
> run;
>
> This was tested on this sample data:
>
> data given;
> input pt vis flag ot oc_locf $ tum inv $ eco;
> cards;
> 101 -1 . . oc 20 abc 1
> 101 0 . . oc 20 abc 1
> 101 1 1 1 oc 20 abc 1
> 101 3 . 1 oc 20 abc 1
> 102 -1 . . oc 20 xyz 2
> 102 0 1 . oc 20 xyz 2
> 102 2 1 1 oc 20 xyz 2
> 102 5 1 1 oc 20 xyz 2
> ;
> run;
>
> The data set is sorted by PT and VIS before using the program.
>
> proc sort data = given;
> by pt vis;
> run;
>
> The output is:
>
> Obs    pt    vis    flag    ot    oc_locf    tum    inv    eco
>
>   1   101     -1      .      .    oc          20    abc     1
>   2   101      0      .      .    oc          20    abc     1
>   3   101      1      1      1    oc          20    abc     1
>   4   101      2      1      1    LOCF        20    abc     1
>   5   101      3      .      1    oc          20    abc     1
>   6   101      4      1      1    LOCF        20    abc     1
>   7   101      5      1      1    LOCF        20    abc     1
>   8   101      6      1      1    LOCF        20    abc     1
>   9   102     -1      .      .    oc          20    xyz     2
>  10   102      0      1      .    oc          20    xyz     2
>  11   102      1      .      .                 .            .
>  12   102      2      1      1    oc          20    xyz     2
>  13   102      3      1      1    LOCF        20    xyz     2
>  14   102      4      1      1    LOCF        20    xyz     2
>  15   102      5      1      1    oc          20    xyz     2
>  16   102      6      1      1    LOCF        20    xyz     2
>
> The LINK sections, ptr2Record, miss2Record and initArray, keep the
> program neat, both for understanding the flow and for re-use.
> ptr2Record fetches a ROW for a given POINTER (p). miss2Record fills a
> ROW with missing values. initArray initializes the arrays at the end of
> BY-group processing to handle the next PATIENT. Since initArray is
> called only once, it is not an ideal candidate for a LINK statement,
> but it is kept separate for ease of understanding.
>
> I would appreciate your feedback when you use this with a very large
> data set.
>
> Regards,
>
> Muthia Kachirayan

Dear,

I really appreciate all your time and effort in doing this. I will try this and let you know if I have any questions.

Regards,
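
The quoted program relies on direct access through SET with the POINT= option. For readers who have not used it, here is a minimal sketch of that one feature in isolation. The data set names demo and picked and the variables id and val are hypothetical and are not part of the original post. This is an illustration under those assumptions, not a replacement for the program above.

```sas
/* Hypothetical demo data; the names demo, id and val are illustrative only */
data demo;
   input id val $;
   cards;
1 a
2 b
3 c
;
run;

data picked;
   /* Fetch observations by row number, in any order */
   do p = 3, 1;
      set demo point = p;   /* direct access: reads observation number p */
      output;
   end;
   /* With POINT= alone, end-of-file is never detected, so STOP is needed */
   stop;
run;
```

The variable named in POINT= is temporary and is not written to the output data set, so picked contains only id and val, with the row for id 3 first and the row for id 1 second. The quoted program does not need a STOP of its own, because its sequential SET statement in the first DO loop ends the DATA step when it tries to read past the last observation.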

