LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2007, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 1 Aug 2007 20:12:31 -0400
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: nway unique counts
Comments: To: Kenneth Karan <posible88-sswug@YAHOO.COM>
In-Reply-To:  <200708012349.l71L1473005865@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"

Kenneth: Check out the NWAY option of PROC SUMMARY or PROC MEANS. It limits the summary to distinct sets of values of the variables. The DISTINCT keyword in a SQL SELECT statement works much the same way if you add a COUNT(*) as n and a GROUP BY clause to that lists the same class of variables as in the SELECT (minus the COUNT(*)). S

-----Original Message----- From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Kenneth Karan Sent: Wednesday, August 01, 2007 7:50 PM To: SAS-L@LISTSERV.UGA.EDU Cc: Kenneth Karan Subject: nway unique counts

I already have a way to do this that uses macros, proc sort, and first. by processing. This is horrendously inefficient. The dataset I need to use this on is way too large. Can someone recommend an approach that would be more efficient? Would indexes help? Data step hash objects?

I could load this into Oracle and use the "count(distinct(SSN))" functions with the "group by cube" statement but I am trying to avoid involving IT.

Thanks in advance.

~ Ken

Given the dataset:

Obs State Gender Mon SSN

1 NY M Jan 123456789 2 NY M Feb 123456789 3 NY M Mar 123456789 4 NY M Jan 987654321 5 NY M Feb 987654321 6 FL M Mar 987654321 7 NY F Jan 234567890 8 NY F Feb 234567890 9 FL F Jan 345678901

I would like to produce the dataset:

Obs State Gender Mon _type_ People

1 0 4 2 Feb 1 3 3 Jan 1 4 4 Mar 1 2 5 F 2 2 6 M 2 2 7 F Feb 3 1 8 F Jan 3 2 9 M Feb 3 2 10 M Jan 3 2 11 M Mar 3 2 12 FL 4 2 13 NY 4 3 14 FL Jan 5 1 15 FL Mar 5 1 16 NY Feb 5 3 17 NY Jan 5 3 18 NY Mar 5 1 19 FL F 6 1 20 FL M 6 1 21 NY F 6 1 22 NY M 6 2 23 FL F Jan 7 1 24 FL M Mar 7 1 25 NY F Feb 7 1 26 NY F Jan 7 1 27 NY M Feb 7 2 28 NY M Jan 7 2 29 NY M Mar 7 1


Back to: Top of message | Previous page | Main SAS-L page