| Date: | Wed, 26 Nov 1997 23:10:41 -0500 |
| Reply-To: | Aaron Dukes <bsd01@SPRYNET.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU> |
| From: | Aaron Dukes <bsd01@SPRYNET.COM> |
| Organization: | Breakthru Systems Development, Inc. |
| Subject: | Summary: class: N distinct ? |
|---|
I would like to do the equivalent of running a Proc Summary with about 10
CLASS variables, getting totals of my VAR variable at every dimension
combination, and also the number of distinct levels of each class
variable at each class-var combination.
proc summary data=orig;
class c b a;
var traffic;
output out=cube sum=traffic(rename=(_type_=uniqclas))
ndistinct=na nb nc;
Unfortunately there is no such NDISTINCT statistic.
One can then do the following:
proc sql;
create table na as
select *, count(distinct(a)) as na from cube(where=(uniqclas=1))
union
select *, count(distinct(a)) as na from cube(where=(uniqclas=3))
group by b
union
select *, count(distinct(a)) as na from cube(where=(uniqclas=5))
group by c
union
select *, count(distinct(a)) as na from cube(where=(uniqclas=7))
group by b,c
;
create table nb as
select *, count(distinct(b)) as nb from cube(where=(uniqclas=2))
union
select *, count(distinct(b)) as nb from cube(where=(uniqclas=3))
group by a
union
select *, count(distinct(b)) as nb from cube(where=(uniqclas=6))
group by c
union
select *, count(distinct(b)) as nb from cube(where=(uniqclas=7))
group by a,c
;
create table nc as
select *, count(distinct(c)) as nc from cube(where=(uniqclas=4))
union
select *, count(distinct(c)) as nc from cube(where=(uniqclas=5))
group by a
union
select *, count(distinct(c)) as nc from cube(where=(uniqclas=6))
group by b
union
select *, count(distinct(c)) as nc from cube(where=(uniqclas=7))
group by a,b
;
quit;
One can then merge these back to cube after appropriately revising
uniqclas. However, my problem with this aproach is that with 10 class
vars, some of which have many (say 100-200) distinct values, while I'm
able to do the Proc Summary within my resource limits, the SQL takes a
very long time.
Is there a better, faster way ?
Thanks!
/===========================================\
| Aaron L. Dukes
| Consultant Statistician and SAS Developer
|
| Breakthru Systems Development, Inc.
|
| bsd01@sprynet.com
\===========================================/
|