Date: Fri, 9 Jul 2010 09:38:19 -0400
Reply-To: msz03@albany.edu
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mike Zdeb <msz03@ALBANY.EDU>
Subject: Re: by Group trick,merge data
Content-Type: text/plain;charset=iso-8859-1
hi ...just saw this and thought I'd offer another idea ...
by the way ... if a group has only one member (as in one formed with a
CLASS or BY statement), PROC MEANS (and SUMMARY) return a MISSING value
for the MODE, but PROC UNIVARIATE returns a real value
shouldn't they be consistent ???
*****************************************
data x;
input datasource brand : $10. quarter : $6. sale;
datalines;
1 toyota 1977q1 15
1 toyota 1977q2 25
1 toyota 1977q3 35
1 ford 1977q1 45
1 ford 1977q2 55
1 ford 1977q3 65
2 toyota 1977q1 15
2 toyota 1977q2 25
2 toyota 1977q3 99
3 toyota 1977q3 99
;
run;
* compute mode and mean for each brand and quarter;
proc summary data=x nway;
class brand quarter;
var sale;
output out=modes (drop=_:) mode=mode mean=mean;
run;
* compare mode to each sale within brand/quarter;
proc sort data=x;
by brand quarter;
run;
data y;
do until (last.quarter);
merge x modes;
by brand quarter;
if mode eq . then mode = mean;
different = sum(different,(sale ne mode));
end;
conflict_flag = (different gt 0);
keep conflict_flag brand quarter mode;
run;
proc print data=y;
run;
--
Mike Zdeb
U@Albany School of Public Health
One University Place (Room 119)
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475
On Jul 7, 5:08 pm, hewei2004 <hewei2...@gmail.com> wrote:
> Hi,
>
> Suppose that I have the following file
>
> datasource brand quarter sale
> 1 toyota 1977q1 15
> 1 toyota 1977q2 25
> 1 toyota 1977q3 35
> 1 ford 1977q1 45
> 1 ford 1977q2 55
> 1 ford 1977q3 65
> 2 toyota 1977q1 15
> 2 toyota 1977q2 25
> 2 toyota 1977q3 99
> 3 toyota 1977q3 99
>
> I want to create a file that merges the information in datasource=1
> and datasource=2 and datasource=3, and create a variable called
> conflict_flag when datasource1 or datasource2 or datasource3
> disagrees.
>
> note that datasource=1 agrees with datasource=2 agrees on 1977q1 and
> 1977q2, thus conflict_flag=0
> note that datasource=1 agrees with datasource=2 disagrees on 1977q3,
> thus conflict_flag=1
>
> The desired output is:
> conflict_flag brand quarter sale
> 0 toyota 1977q1 15
> 0 toyota 1977q2 25
> 1 toyota 1977q3 [mode of (35, 99, 99 ), which is 99 of
> course]
> 0 ford 1977q1 45
> 0 ford 1977q2 55
> 0 ford 1977q3 65