Date: Wed, 1 Aug 2007 19:49:58 -0400
Reply-To: Kenneth Karan <posible88-sswug@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Kenneth Karan <posible88-sswug@YAHOO.COM>
Subject: nway unique counts
I already have a way to do this that uses macros, proc sort, and first. by
processing. This is horrendously inefficient. The dataset I need to use
this on is way too large. Can someone recommend an approach that would be
more efficient? Would indexes help? Data step hash objects?
I could load this into Oracle and use the "count(distinct(SSN))" functions
with the "group by cube" statement but I am trying to avoid involving IT.
Thanks in advance.
~ Ken
Given the dataset:
Obs State Gender Mon SSN
1 NY M Jan 123456789
2 NY M Feb 123456789
3 NY M Mar 123456789
4 NY M Jan 987654321
5 NY M Feb 987654321
6 FL M Mar 987654321
7 NY F Jan 234567890
8 NY F Feb 234567890
9 FL F Jan 345678901
I would like to produce the dataset:
Obs State Gender Mon _type_ People
1 0 4
2 Feb 1 3
3 Jan 1 4
4 Mar 1 2
5 F 2 2
6 M 2 2
7 F Feb 3 1
8 F Jan 3 2
9 M Feb 3 2
10 M Jan 3 2
11 M Mar 3 2
12 FL 4 2
13 NY 4 3
14 FL Jan 5 1
15 FL Mar 5 1
16 NY Feb 5 3
17 NY Jan 5 3
18 NY Mar 5 1
19 FL F 6 1
20 FL M 6 1
21 NY F 6 1
22 NY M 6 2
23 FL F Jan 7 1
24 FL M Mar 7 1
25 NY F Feb 7 1
26 NY F Jan 7 1
27 NY M Feb 7 2
28 NY M Jan 7 2
29 NY M Mar 7 1