LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (April 2010, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Mon, 26 Apr 2010 16:08:30 +0200
Reply-To:   Fernández Rodríguez, Dani <DFernandez@CST.CAT>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Fernández Rodríguez, Dani <DFernandez@CST.CAT>
Subject:   Re: extracting groups for a large dataset
Comments:   To: Wing Wah Tham <wingwahtham@YAHOO.CO.UK>
In-Reply-To:   A<201004261349.o3QAlQnJ027254@malibu.cc.uga.edu>
Content-Type:   text/plain; charset="iso-8859-1"

Perhaps it works for you:

data have; input name $2. volume $2. classification $2.; cards; X 1 S X 2 F X 1 S Y 4 F Y 3 A Y 2 F Y 5 A Y 6 F Y 7 F Y 8 F Z 10 S Z 12 S Z 11 S Z 9 F Z 13 F ;;;

proc sql noprint; select distinct cats("'",name,"'") into: name_list separated by ',' from have where upcase(classification)='S'; quit; %put &name_list;

proc freq data=have (where=(name in (&name_list))) noprint; tables name*classification /out= need sparse; run;

Daniel Fernandez. Barcelona,Spain.

-----Mensaje original----- De: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] En nombre de Wing Wah Tham Enviado el: lunes, 26 de abril de 2010 15:50 Para: SAS-L@LISTSERV.UGA.EDU Asunto: extracting groups for a large dataset

Dear all,

I am trying to extract groups of certain characteristics. For example,

name volume classification X 1 S X 2 F X 1 S Y 4 F Y 3 A Y 2 F Y 5 A Y 6 F Y 7 F Y 8 F Z 10 S Z 12 S Z 11 S Z 9 F Z 13 F

I am hoping to extract the data group X and Z because they contain 'S' in their third column. I have a rather big data set of about 600 million observations. I am wondering how I could do this efficiently. Below is the desired output.

name volume classification X 1 S X 2 F X 1 S Z 10 S Z 12 A Z 11 S Z 9 F Z 13 F

I have tried to do this using proc summary using class to identify the list of names containing the classification containing at least one 'S' and then mearging the dataset. It is inefficient and I ran out of memory. I am wondering if it would be more efficient to do it in hash?

Thanks in advance!

Wing Wah


Back to: Top of message | Previous page | Main SAS-L page