Date: Fri, 12 Jan 2007 10:58:22 -0500
Reply-To: "Wainwright, Andrea" <andrea.wainwright@CAPITALONE.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Wainwright, Andrea" <andrea.wainwright@CAPITALONE.COM>
Subject: Re: Data step question
In-Reply-To: A<200701121518.l0CBkeua015172@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"
This is what I love about SAS-L.
Three replies, all different, yet correct.
I have to admit that what I had in mind was basically what Toby posted,
but I found the other solutions interesting too.
I tend toward data step manipulation, but PROC MEANS and PROC SUMMARY
would work too.
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Venky Chakravarthy
Sent: Friday, January 12, 2007 10:18 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Data step question
On Fri, 12 Jan 2007 09:38:51 -0500, Peter Flom <Flom@NDRI.ORG> wrote:
>I have a data set that has c. 100 observations on 13 variables.
>The observations are cities, and the variables are estimates of a
>number for each of 13 years (1990 to 2003)
>
>No city has an observation for every year.
>Some cities have no observations for ANY year.
>Some cities, for some years, have MULTIPLE observations. In this case,
>there are multiple observations in the data set, with the same city
>name and different numbers for the variables
>
>Here's the beginning of output from a PRINT
>
>Obs  MSA   H1990  H1991  H1992  H1993  H1994  H1995  H1996
>
>  1  0080    .      .      .      .      .      .      .
>  2  0160    .      .      .      .      .      .     0.50
>  3  0200    .      .      .      .      .      .     0.50
>  4  0240    .      .      .      .      .      .      .
>  5  0440    .      .      .      .      .      .      .
>  6  0520    .      .      .      .      .      .      .
>  7  0640    .      .      .      .      .      .      .
>  8  0680    .      .      .      .      .      .      .
>  9  0720    .      .      .      .    29.1   13      3.68
> 10  0720    .      .      .      .    28.0    .       .
> 11  0720  20.5   17.6     .      .      .    14     29.10
>>>>
>
>I have more years and more MSAs (cities) but this illustrates the
>structure
>
>What I'd like
>
>1) If there are no observations for an MSA in any year, delete it
>2) If there are multiple observations for an MSA in a given year,
>average them
>
>and save that
>
>All help, as always, appreciated
>
>Peter
Peter,
If I understand your problem correctly, I think we can do this with a
PROC MEANS.
data test ;
/* MSA is read as numeric here, so its leading zeros are not kept */
input MSA H1990 H1991 H1992 H1993 H1994 H1995 H1996 ;
cards ;
0080 . . . . . . .
0160 . . . . . . .
0200 . . . . . . 0.50
0240 . . . . . . .
0440 . . . . . . .
0520 . . . . . . .
0640 . . . . . . .
0680 . . . . . . .
0720 . . . . 29.1 13 3.68
0720 . . . . 28.0 . .
0720 20.5 17.6 . . . 14 29.10
run ;
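/* Average duplicate MSA rows; NWAY keeps only the per-MSA rows */
proc means noprint data = test nway ;
  class msa ;
  var H1990 H1991 H1992 H1993 H1994 H1995 H1996 ;
  /* SUM() is missing only when every year is missing, so the  */
  /* WHERE= option drops MSAs that have no estimate in any year */
  output out = averaged ( where = ( sum ( H1990, H1991, H1992, H1993,
                                          H1994, H1995, H1996 ) ^= . )
                          drop  = _freq_ _type_ )
         mean = H1990 H1991 H1992 H1993 H1994 H1995 H1996 ;
run ;
options nocenter ;
proc print ;
run ;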
Obs    MSA    H1990    H1991    H1992    H1993    H1994    H1995    H1996
 1     200      .        .        .        .        .        .       0.50
 2     720    20.5     17.6       .        .      28.55    13.5     16.39
Venky
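For comparison, here is a rough data step sketch of the same two rules
(average duplicates within an MSA, drop MSAs with no estimate in any
year). It is not the code anyone in the thread posted; it assumes the
TEST data set from the example above, and the names SORTED and AVERAGED2
are made up for illustration.

```sas
/* BY-group processing needs the rows in MSA order */
proc sort data = test out = sorted ;
  by msa ;
run ;

data averaged2 ( keep = msa H1990-H1996 ) ;
  array h   {7} H1990-H1996 ;   /* yearly estimates as read       */
  array tot {7} _temporary_ ;   /* running sum per year           */
  array cnt {7} _temporary_ ;   /* nonmissing count per year      */

  /* read all rows of one MSA, accumulating as we go */
  do until ( last.msa ) ;
    set sorted ;
    by msa ;
    do i = 1 to 7 ;
      if not missing ( h{i} ) then do ;
        tot{i} = sum ( tot{i}, h{i} ) ;
        cnt{i} = sum ( cnt{i}, 1 ) ;
      end ;
    end ;
  end ;

  /* overwrite the last row's values with the per-year means, */
  /* then reset the accumulators for the next MSA             */
  do i = 1 to 7 ;
    if cnt{i} then h{i} = tot{i} / cnt{i} ;
    else h{i} = . ;
    tot{i} = . ;
    cnt{i} = . ;
  end ;

  /* keep only MSAs with at least one nonmissing year */
  if n ( of h{*} ) > 0 ;
run ;
```

Temporary arrays are retained across iterations, which is why the
accumulators are reset by hand at the end of each MSA. PROC MEANS does
all of this bookkeeping for you, so the data step route is mostly a
matter of taste.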