|
Perhaps it works for you:
data have;
input name $2. volume $2. classification $2.;
cards;
X 1 S
X 2 F
X 1 S
Y 4 F
Y 3 A
Y 2 F
Y 5 A
Y 6 F
Y 7 F
Y 8 F
Z 10 S
Z 12 S
Z 11 S
Z 9 F
Z 13 F
;;;
proc sql noprint;
select distinct cats("'",name,"'") into: name_list separated by ','
from have
where upcase(classification)='S';
quit;
%put &name_list;
proc freq data=have (where=(name in (&name_list))) noprint;
tables name*classification /out= need sparse;
run;
Daniel Fernandez.
Barcelona,Spain.
-----Mensaje original-----
De: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] En nombre de Wing Wah Tham
Enviado el: lunes, 26 de abril de 2010 15:50
Para: SAS-L@LISTSERV.UGA.EDU
Asunto: extracting groups for a large dataset
Dear all,
I am trying to extract groups of certain characteristics. For example,
name volume classification
X 1 S
X 2 F
X 1 S
Y 4 F
Y 3 A
Y 2 F
Y 5 A
Y 6 F
Y 7 F
Y 8 F
Z 10 S
Z 12 S
Z 11 S
Z 9 F
Z 13 F
I am hoping to extract the data group X and Z because they contain 'S' in
their third column. I have a rather big data set of about 600 million
observations. I am wondering how I could do this efficiently. Below is the
desired output.
name volume classification
X 1 S
X 2 F
X 1 S
Z 10 S
Z 12 A
Z 11 S
Z 9 F
Z 13 F
I have tried to do this using proc summary using class to identify the
list of names containing the classification containing at least one 'S'
and then mearging the dataset. It is inefficient and I ran out of memory.
I am wondering if it would be more efficient to do it in hash?
Thanks in advance!
Wing Wah
|