| Date: | Wed, 28 Dec 2011 09:59:43 -0500 |
| Reply-To: | David Bourque <dbourque@LLBEAN.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | David Bourque <dbourque@LLBEAN.COM> |
| Subject: | Re: Removing variables with all missing values |
|---|
I haven't seen a data step reply. Apologize if I missed one. Simplified
so that all vars are numeric.
options symbolgen;
data Have;
informat a b c b d e 8.;
input @1 a 1. b c d e f ;
infile cards missover;
cards;
3 7 11 . 2 3
. 25 3 . 2 3
. 12 . . 4 2
;
run;
data want (drop=i j acum bcum ccum dcum ecum fcum mac);
length mac $200;
retain acum bcum ccum dcum ecum fcum 0
mac 'drop ';
set have end = last ;
array twodim (12) a b c d e f acum bcum ccum dcum ecum fcum;
do i = 1 to 6;
if not missing (twodim{i}) then twodim{i+6} + 1;
end;
if last then do;
array two(12) a b c d e f acum bcum ccum dcum ecum fcum;
do j = 7 to 12;
if two{j} = 0 then mac = catx(' ', mac,vname(two{j-
6}));
call symput('dropvars', mac);
end;
end;
run;
data want;
set want;
&dropvars.;
proc print;
run;
On Fri, 16 Dec 2011 05:34:12 -0500, Ben Powell <bpowell555@GMAIL.COM>
wrote:
>I really like this reply. But if you want a bit more insight into what the
>variables are, which to keep, what's missing etc, then the following might
>help. Assuming a full read of each variable in sequence is not too much
>overhead,
>
>Regards.
>
>%let key1 = month;
>/* Key(s) to merge resultant datasets of highly populated
> and unpopulated variables */
>%let key2 = %trim();
>/* This is null but could be set to second (composite key) variable */
>
>%let import_dataset=x;
>
>
>data &import_dataset;
>set libx.fx;
>run;
>
>
>%macro vars_all_obs_missing(ds=);
>proc contents data = &ds out=_tmp noprint;run;
>
>proc sql noprint;
>select count(*)
>into :varscount
>from _tmp;
>quit;
>%put NOTE: Count of all vars is %trim(&varscount);
>
>data _tmp;
>set _tmp;
>id+1;
>run;
>
>/* Initialize meta table */
>data varsmiss;
>length var $256.;
>var="";
>miss=.;
>if _N_=0;
>run;
>
>%do i = 1 %to &varscount;
> proc sql noprint;
> select name
> into :var
> from _tmp
> where id=&i;
>
> select nmiss(&var)/* Number of missing function */
> into :varmiss
> from &ds;
> quit;
>
> data add;
> var="&var";
> miss=&varmiss;
> run;
>
> proc append base=varsmiss data=add FORCE;run;
>
>%end;
>%mend;
>
>
>%vars_all_obs_missing(ds=&import_dataset);
>
>
>%macro split_missing_vars(varsds=,key=);
>%global varsmain varsextra keyvar;
>%let varsmain=.;
>%let varsextra=.;
>%let keyvar=&key;
>proc sql noprint;
>select var
>into :varsmain separated by " "
>from &varsds
>where miss^=5;/* Change condition as applicable */
>
>select var
>into :varsextra separated by " "
>from &varsds
>where miss=5;/* Change condition as applicable */
>quit;
>
>%let varsextra = &key &varsextra;/* Key is assumed to be non-missing */
>%put NOTE: Main vars: &varsmain;
>%put;
>%put NOTE: Extra vars + key: &varsextra;
>%mend;
>
>
>%split_missing_vars(varsds=varsmiss,key=&key1 &key2);
>
>
>data main;
>retain &keyvar;
>retain &varsmain;
>set &import_dataset;
>keep &varsmain;
>run;
>
>data extra;
>retain &keyvar;
>retain &varsextra;
>set &import_dataset;
>keep &varsextra;
>run;
>
>
>
>On Wed, 14 Dec 2011 15:11:51 -0500, Nat Wooding <nathani@VERIZON.NET>
>wrote:
>
>>Kevin
>>
>>This question appears fairly regularly. The solution below is based on an
>>answer that Data _Null_ posted last summer.
>>
>>Nat Wooding
>>
>>data Have;
>>informat a b c $1. b d e 8.;
>>input @1 a 1. b c d e f ;
>>infile cards missover;
>>cards;
>>a b c 1 2 3
>> b c 2 3
>>
>> ;
>>Data Want;
>>set have;
>> if missing(coalesce(of _numeric_))
>> and missing(coalesceC(of _character_)) then delete;
>>run;
>>
>>
>>
>>-----Original Message-----
>>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Kevin
>>F. Spratt
>>Sent: Wednesday, December 14, 2011 11:13 AM
>>To: SAS-L@LISTSERV.UGA.EDU
>>Subject: Removing variables with all missing values
>>
>>I am subsetting a SAS data and the nature of the sub setting is
>>likely to result in a number of variables that will have all
>>missing values. Some of these variables are numeric, some
>>character.
>>
>>I could run a proc freq on all variables, note which variables have
>>all missing values and drop them from the data set in a subsequent
>>run but wonder if there isn't a more elegant way of identifying
>>variables with all missing values (numeric or character) that would
>>save me this step and keep me from having to list all of the
>>variables I would need to drop.
>>
>>Running SAS 9.2 on 64-bit Windows XP Professional if that
>>makes any difference.
>>
>>Thanks in advance for the anticipated wisdom coming my way.
>>
>>
>>______________________________________________________________________
>>
>>Kevin F. Spratt, Ph.D.
>>Department of Orthopaedic Surgery
>>Dartmouth Medical School
>>One Medical Center Drive
>>DHMC
>>Lebanon, NH USA 03756
>>(603) 653-6012 (voice)
>>(603) 653-6013 (fax)
>>Kevin.F.Spratt@Dartmouth.Edu (e-mail)
>>
>>Data is not information;
>>Information is not knowledge;
>>Knowledge is not understanding;
>>Understanding is not wisdom.
>>
>> - Cliff Stoll and Gary Schubert
>>
>>_______________________________________________________________________
|