LISTSERV at the University of Georgia
Date:   Thu, 11 Nov 2010 23:48:07 -0500
Reply-To:   Arthur Tabachneck <art297@ROGERS.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Arthur Tabachneck <art297@ROGERS.COM>
Subject:   Re: creating missing randomly
Comments:   To: Daniel Nordlund <NordlDJ@DSHS.WA.GOV>

Dan,

I, too, am already brain dead for the night, but is the following what you were trying to accomplish?

data want;
  set have;
  array x[7] A--G;
  do _n_ = 1 to &n_miss;
    ndx = ceil(7*uniform(123));
    do while(x[ndx] EQ .);
      ndx = ceil(7*uniform(123));
    end;
    x[ndx] = .;
  end;
  x_mean = mean(of x(*));
  do _n_ = 1 to 7;
    if missing(x[_n_]) then x[_n_] = x_mean;
  end;
run;

Art
---------
On Thu, 11 Nov 2010 18:40:38 -0800, Nordlund, Dan (DSHS/RDA) <NordlDJ@DSHS.WA.GOV> wrote:

>> -----Original Message-----
>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>> Nordlund, Dan (DSHS/RDA)
>> Sent: Thursday, November 11, 2010 5:22 PM
>> To: SAS-L@LISTSERV.UGA.EDU
>> Subject: Re: creating missing randomly
>>
>> > -----Original Message-----
>> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>> > yk2k
>> > Sent: Thursday, November 11, 2010 4:13 PM
>> > To: SAS-L@LISTSERV.UGA.EDU
>> > Subject: creating missing randomly
>> >
>> > Hi, I'm trying to create new datasets that contain missing values
>> > using existing data.
>> >
>> > If I have a data set that contains 7 continuous variables like below.
>> >
>> > A B C D E F G
>> > 1 3 4 3 2 3 4
>> > 2 4 3 4 5 6 2
>> > 2 5 3 4 5 3 1
>> > 1 2 3 3 2 1 3
>> > 2 5 4 3 4 1 3
>> >
>> > I want to make values missing randomly, but with the same number of
>> > missing values within each person, like below.
>> >
>> > data-1 (1 missing per person)
>> >
>> > A B C D E F G
>> > 1 . 4 3 2 3 4
>> > 2 4 3 . 5 6 2
>> > 2 5 . 4 5 3 1
>> > . 2 3 3 2 1 3
>> > 2 5 4 3 4 . 3
>> >
>> > data-2 (2 missing per person)
>> >
>> > A B C D E F G
>> > 1 . 4 3 . 3 4
>> > 2 4 . . 5 6 2
>> > 2 5 . 4 5 . 1
>> > . 2 3 3 . 1 3
>> > 2 5 . 3 4 . 3
>> >
>> > ...up to 6 missing values per person.
>> >
>> > Also, is there any way to replace the missing values with the mean of
>> > the rest of the values?
>> >
>> > Thanks.
>>
>> Here is one way to do it:
>>
>> data have;
>>   input A B C D E F G;
>> cards;
>> 1 3 4 3 2 3 4
>> 2 4 3 4 5 6 2
>> 2 5 3 4 5 3 1
>> 1 2 3 3 2 1 3
>> 2 5 4 3 4 1 3
>> ;
>> run;
>>
>> **----replace n_miss values with missing----**;
>> %let n_miss = 2;
>> data want1;
>>   set have;
>>   array x[7] A--G;
>>   do _n_ = 1 to &n_miss;
>>     ndx = ceil(7*uniform(123));
>>     do while(x[ndx] EQ .);
>>       ndx = ceil(7*uniform(123));
>>     end;
>>     x[ndx] = .;
>>   end;
>> run;
>> proc print;
>> run;
>>
>> **----replace n_miss values with mean of remaining values----**;
>> data want2;
>>   set have;
>>   array x[7] A--G;
>>   do _n_ = 1 to &n_miss;
>>     ndx = ceil(7*uniform(123));
>>     do while(x[ndx] EQ .);
>>       ndx = ceil(7*uniform(123));
>>     end;
>>     x[ndx] = .;
>>   end;
>>   x_mean = mean(of A--G);
>>   do _n_ = 1 to 7;
>>     if missing(x[_n_]) then x[_n_] = x_mean;
>>   end;
>> run;
>> proc print;
>> run;
>>
>> Hope this is helpful,
>>
>> Dan
>
>OK, this wasn't as helpful as I had planned. WANT1 was created as I expected, with missing randomly inserted. However, WANT2 is not correct. The wrong values are changed and the mean isn't always inserted (although the initial missings are inserted randomly). If I create WANT3, where instead of replacing by the mean, I replace with a constant (99 for example), then the data step works as I expect. I am obviously brain dead at this point. Can someone point out the error of my ways? Thanks.
>
>data have;
>  input A B C D E F G;
>cards;
>1 3 4 3 2 3 4
>2 4 3 4 5 6 2
>2 5 3 4 5 3 1
>1 2 3 3 2 1 3
>2 5 4 3 4 1 3
>;
>run;
>
>**----replace n_miss values with missing----**;
>%let n_miss = 2;
>data want1;
>  set have;
>  array x[7] A--G;
>  do _n_ = 1 to &n_miss;
>    ndx = ceil(7*uniform(123));
>    do while(x[ndx] EQ .);
>      ndx = ceil(7*uniform(123));
>    end;
>    x[ndx] = .;
>  end;
>run;
>proc print;
>run;
>
>**----replace n_miss values with mean of remaining values----**;
>data want2;
>  set have;
>  array x[7] A--G;
>  do _n_ = 1 to &n_miss;
>    ndx = ceil(7*uniform(123));
>    do while(x[ndx] EQ .);
>      ndx = ceil(7*uniform(123));
>    end;
>    x[ndx] = .;
>  end;
>  x_mean = mean(of A--G);
>  do _n_ = 1 to 7;
>    if missing(x[_n_]) then x[_n_] = x_mean;
>  end;
>run;
>proc print;
>run;
>
>**----replace n_miss values with a constant (99)----**;
>data want3;
>  set have;
>  array x[7] A--G;
>  do _n_ = 1 to &n_miss;
>    ndx = ceil(7*uniform(123));
>    do while(x[ndx] EQ .);
>      ndx = ceil(7*uniform(123));
>    end;
>    x[ndx] = .;
>  end;
>  x_mean = mean(of A--G);
>  do _n_ = 1 to 7;
>    if missing(x[_n_]) then x[_n_] = 99;
>  end;
>run;
>proc print;
>run;
>
>
>Dan
>
>Daniel J. Nordlund
>Washington State Department of Social and Health Services
>Planning, Performance, and Accountability
>Research and Data Analysis Division
>Olympia, WA 98504-5204

