Dan,
I, too, am already brain dead for the night, but is the following what you
were trying to accomplish?
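data want;
  set have;
  array x[7] A--G;
  /* blank out &n_miss distinct, randomly chosen positions */
  do _n_ = 1 to &n_miss;
    ndx = ceil(7*uniform(123));
    do while(x[ndx] EQ .);        /* already blank: draw again */
      ndx = ceil(7*uniform(123));
    end;
    x[ndx] = .;
  end;
  /* mean of the values that remain, taken through the array reference */
  x_mean = mean(of x(*));
  /* fill every blanked position with that mean */
  do _n_ = 1 to 7;
    if missing(x[_n_]) then x[_n_] = x_mean;
  end;
run;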
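
In case it is useful, here is a minimal sketch of another way to pick the
positions without the redraw loop: shuffle the indexes 1 to 7 with CALL
RANPERM and blank the first &n_miss of them. It assumes a SAS release that
provides CALL RANPERM and that HAVE has no missing values to begin with. The
names want_alt, pos, p1-p7, seed, and n_blank are my own, purely for
illustration.

```sas
%let n_miss = 2;   /* same macro variable as in the quoted code */

data want_alt (drop = i seed p1-p7);
  set have;
  array x[7]   A--G;
  array pos[7] p1-p7;        /* hypothetical helper: shuffled index list */
  retain seed 123;           /* CALL RANPERM updates the seed on each call */

  /* start from 1..7, then shuffle the indexes in place */
  do i = 1 to 7;
    pos[i] = i;
  end;
  call ranperm(seed, of p1-p7);

  /* the first &n_miss shuffled indexes are distinct by construction */
  do i = 1 to &n_miss;
    x[pos[i]] = .;
  end;

  n_blank = nmiss(of x[*]);  /* check value: should equal &n_miss */
  x_mean  = mean(of x[*]);   /* mean of the values that remain */

  do i = 1 to 7;
    if missing(x[i]) then x[i] = x_mean;
  end;
run;

proc print data=want_alt;
run;
```

n_blank is kept only as a check, and x_mean shows the value that was filled
in, so each row can be compared against HAVE by eye.
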
Art
---------
On Thu, 11 Nov 2010 18:40:38 -0800, Nordlund, Dan (DSHS/RDA)
<NordlDJ@DSHS.WA.GOV> wrote:
>> -----Original Message-----
>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>> Nordlund, Dan (DSHS/RDA)
>> Sent: Thursday, November 11, 2010 5:22 PM
>> To: SAS-L@LISTSERV.UGA.EDU
>> Subject: Re: creating missing randomly
>>
>> > -----Original Message-----
>> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>> > yk2k
>> > Sent: Thursday, November 11, 2010 4:13 PM
>> > To: SAS-L@LISTSERV.UGA.EDU
>> > Subject: creating missing randomly
>> >
>> > Hi, I'm trying to create new datasets that contain missing values,
>> > using existing data.
>> >
>> > Say I have a dataset that contains 7 continuous variables, like below.
>> >
>> > A B C D E F G
>> > 1 3 4 3 2 3 4
>> > 2 4 3 4 5 6 2
>> > 2 5 3 4 5 3 1
>> > 1 2 3 3 2 1 3
>> > 2 5 4 3 4 1 3
>> >
>> > I want to set values to missing at random, but with the same number of
>> > missing values for each person, like below.
>> >
>> > data-1 (1 missing per person)
>> >
>> > A B C D E F G
>> > 1 . 4 3 2 3 4
>> > 2 4 3 . 5 6 2
>> > 2 5 . 4 5 3 1
>> > . 2 3 3 2 1 3
>> > 2 5 4 3 4 . 3
>> >
>> > data-2 (2 missing per person)
>> >
>> > A B C D E F G
>> > 1 . 4 3 . 3 4
>> > 2 4 . . 5 6 2
>> > 2 5 . 4 5 . 1
>> > . 2 3 3 . 1 3
>> > 2 5 . 3 4 . 3
>> >
>> > ...up to 6 missing values per person.
>> >
>> > Also, is there any way to replace each missing value with the mean of the
>> > rest of the values?
>> >
>> > Thanks.
>>
>> Here is one way to do it:
>>
>> data have;
>> input A B C D E F G;
>> cards;
>> 1 3 4 3 2 3 4
>> 2 4 3 4 5 6 2
>> 2 5 3 4 5 3 1
>> 1 2 3 3 2 1 3
>> 2 5 4 3 4 1 3
>> ;
>> run;
>>
>> **----replace n_miss values with missing----**;
>> %let n_miss = 2;
>> data want1;
>> set have;
>> array x[7] A--G;
>> do _n_ = 1 to &n_miss;
>> ndx = ceil(7*uniform(123));
>> do while(x[ndx] EQ .);
>> ndx = ceil(7*uniform(123));
>> end;
>> x[ndx] = .;
>> end;
>> run;
>> proc print;
>> run;
>>
>> **----replace n_miss values with mean of remaining values----**;
>> data want2;
>> set have;
>> array x[7] A--G;
>> do _n_ = 1 to &n_miss;
>> ndx = ceil(7*uniform(123));
>> do while(x[ndx] EQ .);
>> ndx = ceil(7*uniform(123));
>> end;
>> x[ndx] = .;
>> end;
>> x_mean = mean(of A--G);
>> do _n_ = 1 to 7;
>> if missing(x[_n_]) then x[_n_] = x_mean;
>> end;
>> run;
>> proc print;
>> run;
>>
>> Hope this is helpful,
>>
>> Dan
>
>OK, this wasn't as helpful as I had planned. WANT1 was created as I
>expected, with missing values randomly inserted. However, WANT2 is not
>correct. The wrong values are changed and the mean isn't always inserted
>(although the initial missing values are inserted randomly). If I create
>WANT3, where instead of replacing with the mean I replace with a constant
>(99, for example), then the data step works as I expect. I am obviously
>brain dead at this point. Can someone point out the error of my ways? Thanks.
>
>data have;
> input A B C D E F G;
>cards;
>1 3 4 3 2 3 4
>2 4 3 4 5 6 2
>2 5 3 4 5 3 1
>1 2 3 3 2 1 3
>2 5 4 3 4 1 3
>;
>run;
>
>**----replace n_miss values with missing----**;
>%let n_miss = 2;
>data want1;
> set have;
> array x[7] A--G;
> do _n_ = 1 to &n_miss;
> ndx = ceil(7*uniform(123));
> do while(x[ndx] EQ .);
> ndx = ceil(7*uniform(123));
> end;
> x[ndx] = .;
> end;
>run;
>proc print;
>run;
>
>**----replace n_miss values with mean of remaining values----**;
>data want2;
> set have;
> array x[7] A--G;
> do _n_ = 1 to &n_miss;
> ndx = ceil(7*uniform(123));
> do while(x[ndx] EQ .);
> ndx = ceil(7*uniform(123));
> end;
> x[ndx] = .;
> end;
> x_mean = mean(of A--G);
> do _n_ = 1 to 7;
> if missing(x[_n_]) then x[_n_] = x_mean;
> end;
>run;
>proc print;
>run;
>
>**----replace n_miss values with a constant (99)----**;
>data want3;
> set have;
> array x[7] A--G;
> do _n_ = 1 to &n_miss;
> ndx = ceil(7*uniform(123));
> do while(x[ndx] EQ .);
> ndx = ceil(7*uniform(123));
> end;
> x[ndx] = .;
> end;
> x_mean = mean(of A--G);
> do _n_ = 1 to 7;
> if missing(x[_n_]) then x[_n_] = 99;
> end;
>run;
>proc print;
>run;
>
>
>Dan
>
>Daniel J. Nordlund
>Washington State Department of Social and Health Services
>Planning, Performance, and Accountability
>Research and Data Analysis Division
>Olympia, WA 98504-5204