Date: Mon, 31 Mar 2008 04:55:13 -0400
Reply-To: Gerhard Hellriegel <gerhard.hellriegel@T-ONLINE.DE>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Gerhard Hellriegel <gerhard.hellriegel@T-ONLINE.DE>
Subject: Re: newbie: Missing values
There are two important problems with your calculation of means:
1. A sum of several variables is missing if any one of them is missing:
a=., b=1, c=2
s = (a+b+c)/3; is missing!
2. You could avoid that by using the SUM function, but it still divides by 3:
s = sum(a,b,c)/3;
That is not missing, but maybe not what you wanted! The result is 1 (= 3/3),
whereas you probably expected 1.5 (= 3/2, the mean of the two non-missing
values).
In that case you should use the MEAN function:
m = mean(a,b,c);
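
A minimal sketch of the three variants side by side, using the same example
values as above. The variable names s_plus, s_sum and n_used are only
illustrative, not taken from the original program:

```sas
data _null_;
  a = .; b = 1; c = 2;
  s_plus = (a + b + c) / 3;   /* missing: the + operator propagates missing values */
  s_sum  = sum(a, b, c) / 3;  /* 1: SUM skips the missing a, but we still divide by 3 */
  m      = mean(a, b, c);     /* 1.5: MEAN divides by the count of non-missing arguments */
  n_used = n(a, b, c);        /* 2: number of non-missing arguments */
  put s_plus= s_sum= m= n_used=;
run;
```

The N function is included only to show how many values actually went into
the mean, which can be worth keeping as its own variable.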
By the way: PROC MEANS (or SUMMARY) is good for calculating statistics
"vertically" (down a column, over the whole dataset). If you need them
"horizontally" (for each obs, across several variables) you cannot use the
PROCs.
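
A rough sketch of the difference, assuming a small made-up dataset (the
dataset name bp, the id variable and the values are hypothetical; only the
blood pressure variable names come from the original post). Since the
original post says that 0 also stands for a missing reading, the sketch
sets zeros to missing before averaging:

```sas
data bp;                        /* hypothetical sample data */
  input id hsbp1 hsbp2 sbp;
  if sbp = 0 then sbp = .;      /* 0 is used as a missing code in the original data */
  avsbp = mean(hsbp1, hsbp2, sbp);  /* "horizontal": one mean per observation */
  datalines;
1 164 164 0
2 130 128 126
3 . . .
;
run;

/* "vertical": one mean per variable, over all observations */
proc means data=bp n nmiss mean;
  var hsbp1 hsbp2 sbp;
run;
```

For an observation where all three readings are missing, MEAN returns a
missing value, so that case still has to be handled separately.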
Gerhard
On Mon, 31 Mar 2008 10:50:54 +0530, ajay ohri <ohri2007@GMAIL.COM> wrote:
>create a flag for impact of missing...
>
>if height eq . or weight eq . then flag=1; else flag=0;
>
>
>Use an IF condition in a data step to make the changes:
>
>if height eq . then delete; (this deletes the whole observation)
>
>
>
>also
>
>use proc means for finding the mean and median height, weight.
>Use proc means with and without the flag condition to check the impact of
>missing values.
>
>in addition you can try and replace the missing values with mean or
>median.
>This depends on the assumption that both these measures (height, weight )
>are distributed normally.
>
>if height eq . then height = 60; (for replacing with mean height; here 60 is
>the mean height)
>similarly for weight....
>
>HTH
>
>Ajay
>
>
>
>On Mon, Mar 31, 2008 at 6:52 AM, Sigurd Hermansen <HERMANS1@westat.com>
>wrote:
>
>> Test for a missing height value in the denominator and set the value of
>> BMI to missing (.) if so. You should not ignore the participants with
>> missing height values until you know that the missing values occur
>> rarely and largely at random.
>> S
>>
>> -----Original Message-----
>> From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
>> On Behalf Of lp
>> Sent: Sunday, March 30, 2008 8:46 PM
>> To: SAS-L@LISTSERV.UGA.EDU
>> Subject: newbie: Missing values
>>
>>
>> Hi,
>> Sorry, if this is a basic question but, I have a SAS data set and have
>> weight and height as my variables and am attempting to calculate BMI,
>> however, there are missing values on some of the cells (indicated as '.'
>> and '0'). How could I program SAS to ignore the participants with the
>> missing values and calculate the rest of the participants' BMI? Below is
>> my SAS editor and log. Thank you for your help.
>>
>> LIBNAME THESIS 'G:\THESIS\THESIS DATA';
>>
>> DATA HASAAWTHESIS;
>> SET THESIS.HSAW_HBP(WHERE = (PERIOD =1 AND DIABETES =1));
>> BMI = ((WEIGHT/(HEIGHT**2))*703);
>> AVSBP = ((HSBP1 + HSBP2 + SBP)/3);
>> AVDBP = ((HDBP1 + HDBP2 + DBP)/3);
>> IF AVSBP < 140 OR AVDBP < 90 THEN HTN = 1;
>> IF AVSBP > = 140 OR AVDBP > = 90 THEN HTN = 2;
>> RUN;
>> QUIT;
>>
>>
>>
>> 67 LIBNAME THESIS 'G:\THESIS\THESIS DATA';
>> NOTE: Libname THESIS refers to the same physical library as TMP1.
>> NOTE: Libref THESIS was successfully assigned as follows:
>> Engine: V9
>> Physical Name: G:\Thesis\Thesis data
>> 68
>> 69 DATA HASAAWTHESIS;
>> 70 SET THESIS.HSAW_HBP(WHERE = (PERIOD =1 AND DIABETES =1));
>> 71 BMI = ((WEIGHT/(HEIGHT**2))*703);
>> 72 AVSBP = ((HSBP1 + HSBP2 + SBP)/3);
>> 73 AVDBP = ((HDBP1 + HDBP2 + DBP)/3);
>> 74 IF AVSBP < 140 OR AVDBP < 90 THEN HTN = 1;
>> 75 IF AVSBP > = 140 OR AVDBP > = 90 THEN HTN = 2;
>> 76 RUN;
>>
>> NOTE: Division by zero detected at line 71 column 15. SUBJID=HASAAW197
>> AGE=55 WEIGHT=0 HEIGHT=0 TOT_FAT1=58.9 TOT_FAT2=90.5 TOT_FAT3=23
>> SMOKER=1 TRICEPTS=0 SUBSCAPULA=40 WAIST_GR1=194.7 WAIST_GR2=92
>> HIP_GR1=114 HIP_GR2=115 HIP_GR3=110 LATERAL=94 MID_THIGH=62 HSBP1=164
>> HDBP1=86 HSBP2=164 HDBP2=80 sbp1=. dbp1=. sbp2=. dbp2=. ESTR_THER=2
>> AGE_ESTR=. YRS_ESTR=20 CUR_ESTR=2 STP_ESTR=. MED_BLPR=2 MENOPAUSE=.
>> HORMONE=2 HOTFLASH=2 i=6 period=1 day_last=. age_first=11 stop_per=35
>> yr_last=75 insulin=1 diabetes=1 pace=3 marital_st=1 employed=1
>> educ_you=2 M_HBP=2 F_HBP=9 sib_hbp=0 ch_hbp=0 HIGH_BL=2 HIGHBL_YR=91
>> DIAB_YR=. birth_date=10/05/1938 SMOKE_EVER=1 SMOK_CURR=. cig_num=.
>> YEARS_SMOK=. PAST_DRINK=1 HOW_OFTEN=. STREN_EXER=1 EXER_3XWEE=1 HT_FT=.
>> HT_IN=. SBP=0 DBP=0 NORMAL=. WAIST=0 WAIST_INCH=0 HIPS=0 HIPS_INCH=0
>> BODY_FAT=0 BFAT_LBS=0 T_BODYFAT=0 FEM=2 AFRO_AMER=2 FIFTY_80=2
>> DATE_BIRTH=10/05/1938 BMI=. AVSBP=109.33333333 AVDBP=55.333333333 HTN=1
>> _ERROR_=1 _N_=135
>> NOTE: Missing values were generated as a result of performing an
>> operation on missing values.
>> Each place is given by: (Number of times) at (Line):(Column).
>> 10 at 71:15 10 at 71:23 1 at 72:17 79 at 72:25 1 at 73:17
>> 79 at 73:25
>> NOTE: Mathematical operations could not be performed at the following
>> places. The results of
>> the operations have been set to missing values.
>> Each place is given by: (Number of times) at (Line):(Column).
>> 8 at 71:15
>> NOTE: There were 352 observations read from the data set
>> THESIS.HSAW_HBP.
>> WHERE (PERIOD=1) and (DIABETES=1);
>> NOTE: The data set WORK.HASAAWTHESIS has 352 observations and 82
>> variables.
>> NOTE: DATA statement used (Total process time):
>> real time 0.12 seconds
>> cpu time 0.01 seconds
>>
>>
>> 77 QUIT;
>>