Date: Mon, 14 Apr 2003 16:03:04 -0700
Reply-To: "Huang, Ya" <yhuang@AMYLIN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Huang, Ya" <yhuang@AMYLIN.COM>
Subject: Re: eliminated variables with missing values
This might be less efficient, but it is simple to code and
easy to understand:
data xx;
input ID date $ var1 var2 var3;
cards;
1 Jan1 . 1 5
1 Jan3 1 . 5
2 Jan1 . . .
2 Jan5 . . .
3 Dec15 2 1 4
3 Jan5 . . .
4 Feb3 . 2 3
4 Feb8 3 7 .
;
/* Reshape to long form: one row per ID, date and variable (_name_, col1) */
proc transpose data=xx out=yy;
by id date;
var var1-var3;
run;
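/* Keep an ID/variable pair only if it has values on at least two dates. */
/* Order by id, date so the BY statement in the next step sees sorted data. */
proc sql;
create table yy as
select *
from yy
group by _name_, id
having count(distinct case when col1 ne . then date else ' ' end) >=2
order by id, date, _name_
;
quit;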
/* Reshape back to wide form: one column per surviving variable */
proc transpose data=yy out=xx (drop=_name_);
by id date;
id _name_;
run;
options nocenter;
proc print;
run;
-------------
Obs    ID    date    var3    var2
 1      1    Jan1      5       .
 2      1    Jan3      5       .
 3      4    Feb3      .       2
 4      4    Feb8      .       7
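
To see why only those ID/variable pairs survive, it can help to print the quantity the HAVING clause tests. This is only a sketch and not part of the code above: the names raw, long_form and n_dates are made up for illustration, and the sample data is repeated so the steps run on their own.

```sas
/* Sample data from this thread, under a hypothetical name */
data raw;
  input ID date $ var1 var2 var3;
  cards;
1 Jan1 . 1 5
1 Jan3 1 . 5
2 Jan1 . . .
2 Jan5 . . .
3 Dec15 2 1 4
3 Jan5 . . .
4 Feb3 . 2 3
4 Feb8 3 7 .
;

/* Long form: one row per ID, date and variable */
proc transpose data=raw out=long_form;
  by id date;
  var var1-var3;
run;

/* Number of distinct dates with a non-missing value, per ID and variable. */
/* COUNT ignores the blank produced by the ELSE branch.                    */
proc sql;
  select id, _name_,
         count(distinct case when col1 ne . then date else ' ' end) as n_dates
  from long_form
  group by id, _name_
  order by id, _name_;
quit;
```

With the sample data, only ID 1/var3 and ID 4/var2 reach n_dates = 2, which matches the printed output above.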
Kind regards,
Ya Huang
-----Original Message-----
From: William Kossack [mailto:kossackw@NJC.ORG]
Sent: Monday, April 14, 2003 2:06 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: eliminated variables with missing values
Actually, the more tedious problem I'm trying to solve is the removal
of variables.
An example might look like the one below, except that my dataset has hundreds of variables:
ID  date   var1  var2  var3
1   Jan1   .     1     5
1   Jan3   1     .     5
2   Jan1   .     .     .
2   Jan5   .     .     .
3   Dec15  2     1     4
3   Jan5   .     .     .
In this example I want to keep var3 because it was collected for ID 1 on 2
different dates.
I also want to keep only ID 1.
I don't want to keep var1 or var2 or ID 2 or ID 3.
My analysis requires information on more than one date, i.e., if they were not
able to go back and get a second reading, then it is worthless to me.
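
For a wide dataset, the rule described here could also be applied without transposing, by reading each ID twice in a single DATA step. The following is a rough sketch under several assumptions: the dataset names raw and want are hypothetical, the data is already sorted by ID, and each ID has at most one row per date, so counting rows with a value is the same as counting dates. Unlike a transpose-based approach, it blanks out values rather than dropping columns, so a variable that never qualifies stays in the output as an all-missing column.

```sas
/* The example data above, under a hypothetical name (sorted by ID) */
data raw;
  input ID date $ var1 var2 var3;
  cards;
1 Jan1 . 1 5
1 Jan3 1 . 5
2 Jan1 . . .
2 Jan5 . . .
3 Dec15 2 1 4
3 Jan5 . . .
;

data want;
  array v{3} var1-var3;        /* extend the list for more variables */
  array n{3} _temporary_;      /* rows with a value, per variable    */
  do i = 1 to 3;               /* reset the counters for each ID     */
    n{i} = 0;
  end;
  keep_id = 0;

  /* Pass 1: count non-missing values per variable within the ID */
  do until (last.id);
    set raw;
    by id;
    do i = 1 to 3;
      if v{i} ne . then n{i} = n{i} + 1;
      if n{i} >= 2 then keep_id = 1;
    end;
  end;

  /* Pass 2: re-read the same ID, blank variables with fewer than */
  /* two values, and write rows only if some variable qualified   */
  do until (last.id);
    set raw;
    by id;
    do i = 1 to 3;
      if n{i} < 2 then v{i} = .;
    end;
    if keep_id then output;
  end;

  drop i keep_id;
run;
```

On the six example rows, this should leave only the two rows for ID 1, with var3 as the only variable that still holds values.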
William Kossack wrote:
> I have a dataset that is mostly missing data. The data is organized by
> ID and date.
>
> I need to remove variables from the dataset that do not have two values
> by sort order.
>
> If sorted by id and then date and I don't have two dates with values for
> an id I want to eliminate the variable from the dataset.
> Similarly, if an id does not have two dates with values I want to
> eliminate the id.
>
> Ids with only one value on one date are eliminated. Variables with only
> missing values or only one value per id have been eliminated.