Date: Sun, 17 Oct 2004 02:02:45 -0400
Reply-To: "Huang, Ya" <yhuang@AMYLIN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Huang, Ya" <yhuang@AMYLIN.COM>
Subject: Re: Cross Tabulation using SAS
On Sat, 16 Oct 2004 12:53:42 -0700, ravi <get_sravikumar@REDIFFMAIL.COM>
wrote:
>Hi,
>
>Please find below the sample data and the format of the output
>required
>
>Sample Data
>HH_ID CUST_ID ACCT_ID PROD_ID BALANCES
>101 3000 400001 DDA 4,545
>101 3000 400002 TDA 4,538
>101 3001 400003 MTG 90,034
>101 3002 400004 DDA 118,528
>101 3003 400005 HME 161,273
>102 3004 400006 TDA 204,017
>102 3005 400007 MECK 246,762
>102 3006 400008 MTG 289,506
>102 3007 400009 TDA 332,251
>103 3008 400010 XYA 374,995
>103 3008 400011 LMN 417,740
>103 3009 400012 XYA 460,484
>103 3010 400013 TDA 503,229
>103 3011 400014 MECK 545,973
>
>
>
>Objective : To find out how many HH_ID having 'MTG' PROD_ID
>Have other products. I am trying to do a cross tab with
>'MTG' against other products.
>
>Please find below the output format for the first two rows..
>
> MTG
>PROD_ID # HH_ID # CUST_ID # ACCT_ID Sum of Balances
>DDA 1 4 5 378,918
>TDA 2 8 9 1,451,453
>HME
>MECK
>XYA
>LMN
>
>Explanation:
>The value "1" in the first row second column is the number of distinct
>HH_ID which has both "MTG" and "DDA"
>Explanation : There is only one distinct "HH_ID" (101) which has both
>the PROD_ID "DDA" and "MTG"
>
>The value "4" in the first row and second column is the number of
>distinct CUST_ID which are present in the HH_ID which has both "MTG"
>and "TDA"
>Explanation : There are four distinct customers which belong to HH_ID
>"101" which has both MTG and DDA
>
>The value "5" in the first row and third column is the number of
>distinct ACCT_ID which are present in the HH_ID which has both "MTG"
>and "TDA"
>Explanation : There are five distinct ACCT_ID which belong to HH_ID
>"101" which has both "MTG" and "DDA" and next value (378,918) is the
>sum of balances
>in these accounts..
>
>
>Could you please let me know if there is a way to do this using SAS.
>
>Thanks a lot for your help
>Ravi
Ravi,
I think you have tow typo:
>The value "4" in the first row and second column is the number of
>distinct CUST_ID which are present in the HH_ID which has both "MTG"
>and "TDA"
Should it be 'DDA' ?
>Explanation : There are four distinct customers which belong to HH_ID
>"101" which has both MTG and DDA
>
>The value "5" in the first row and third column is the number of
>distinct ACCT_ID which are present in the HH_ID which has both "MTG"
>and "TDA"
Should it be "DDA' too?
Anyway, the follwing code generate the number match the number
you gave in the sample. Can you confirm the missing numbers in
your sample also match mine ?
Kind regards,
Ya Huang
------------
data xx;
input HH_ID $ CUST_ID $ ACCT_ID $ PROD_ID $ BALANCES : comma.;
cards;
101 3000 400001 DDA 4,545
101 3000 400002 TDA 4,538
101 3001 400003 MTG 90,034
101 3002 400004 DDA 118,528
101 3003 400005 HME 161,273
102 3004 400006 TDA 204,017
102 3005 400007 MECK 246,762
102 3006 400008 MTG 289,506
102 3007 400009 TDA 332,251
103 3008 400010 XYA 374,995
103 3008 400011 LMN 417,740
103 3009 400012 XYA 460,484
103 3010 400013 TDA 503,229
103 3011 400014 MECK 545,973
;
run;
proc sql;
create table xx as
select *, max(prod_id='MTG') as mtg
from xx
group by hh_id
;
create table y1 as
select a.prod_id,
count(distinct case when mtg then b.hh_id else '' end) as n1
from (select distinct prod_id from xx) a, xx b
where a.prod_id=b.prod_id and a.prod_id ne 'MTG'
group by a.prod_id
order by a.prod_id
;
create table y2 as
select distinct a.prod_id, count(distinct b.cust_id) as n2,
count(distinct b.acct_id) as n3
from xx a, xx b
where a.mtg and a.hh_id=b.hh_id and a.prod_id ne 'MTG'
group by a.prod_id
order by a.prod_id
;
create table y3 as
select distinct a.prod_id, b.acct_id, b.balances
from xx a, xx b
where a.mtg and a.hh_id=b.hh_id and a.prod_id ne 'MTG'
group by a.prod_id
order by a.prod_id
;
create table y3 as
select distinct prod_id,sum(balances) as n4
from y3
group by prod_id
order by prod_id
;
data final;
merge y1 y2 y3;
by prod_id;
run;
options missing='0';
proc print;
format n4 comma9.;
run;
Obs PROD_ID n1 n2 n3 n4
1 DDA 1 4 5 378,918
2 HME 1 4 5 378,918
3 LMN 0 0 0 0
4 MECK 1 4 4 1,072,536
5 TDA 2 8 9 1,451,454
6 XYA 0 0 0 0
|