LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (June 2010, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 29 Jun 2010 13:19:22 -0400
Reply-To:     "Lamias, Mark (CDC/OID/OD) (CTR)" <bnz6@CDC.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Lamias, Mark (CDC/OID/OD) (CTR)" <bnz6@CDC.GOV>
Subject:      Re: how to output partial data with bigger "n" freqs
Comments: To: Adriano Rodrigues <adriano@GPP.COM.BR>
In-Reply-To:  <004001cb177c$72974190$57c5c4b0$@com.br>
Content-Type: text/plain; charset=iso-8859-1

Maybe I am misunderstanding your example, but according to what you are requesting, Joe, Richard, and Peter should all have values of "Yes" for the top20inregion variable, since there are only three vendors in this region (any region where the number of vendors less than or equal to 20 would have a value of "Yes" as they would be in the top 20 vendors as well as the bottom 20 vendors). This contradicts your desired output below. Are you just looking for the top vendor in each of perhaps 20 (or more) regions? If so this could be handled by a first.vendorname and first.regionn after obtaining counts for each vendor in region using proc sql. For example:

data work.temp; input Region $ Vendorname $ amount carname $; datalines; North Richard 20000 Honda North Joe 30000 Ford North Peter 35000 Ford North Joe 30000 Ford North Richard 20000 Honda North Peter 31000 Ford North Joe 30000 Ford North Richard 20000 Honda North Peter 35000 Ford North Joe 25000 Ford North Richard 20000 Honda North Peter 23000 Ford North Joe 30000 Ford North Richard 20000 Honda North Zico 35000 Ford North Joe 30000 Ford North Peter 35000 Ford North Richard 20000 Honda North Peter 31000 Ford North Joe 30000 Ford North Joe 25000 Ford North Zico 20000 Honda North Richard 20000 Honda North Peter 23000 Ford South Joe 30000 Ford South Peter 35000 Ford South Richard 20000 Honda South Peter 31000 Ford South Joe 30000 Ford South Joe 25000 Ford South Zico 20000 Honda South Richard 20000 Honda South Peter 23000 Ford ; run;

proc sql; create table work.temp2(DROP=n) as select count(*) as n, * from work.temp group by region, vendorname; quit;

data work.temp2; set work.temp2; by region vendorname; if first.vendorname and first.region then top20inregion="Yes"; else top20inregion="No"; run;

Mark J. Lamias SAIC Statistical Consultant & SAIC Team Lead Office of the Director -- Office of Informatics National Center for Emerging and Zoonotic Infectious Diseases (NCEZID) US Centers for Disease Control and Prevention w: (404) 498-2646 m: (404) 543-1394 f: (404) 639-1391

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Adriano Rodrigues Sent: Tuesday, June 29, 2010 7:16 AM To: SAS-L@LISTSERV.UGA.EDU Subject: RES: how to output partial data with bigger "n" freqs

Sorry,

Maybe i not explain very well. Each vendor appear many times on my list in each region. I want export dataset with all observations, for this top 20 vendors, my data will not have 20 lines, but 500, because each one sell a lot.

Previous: The dataset has 3 regions and i want output the top 20 vendors (number of occurences, not amount total) by region, to work only with them. How to do it?

Region vendorname amount carname North Joe 30000 Ford North Richard 20000 Honda North Peter 35000 Ford North Joe 30000 Ford North Richard 20000 Honda North Peter 31000 Ford North Joe 30000 Ford North Richard 20000 Honda North Peter 35000 Ford North Joe 25000 Ford North Richard 20000 Honda North Peter 23000 Ford North Joe 30000 Ford North Richard 20000 Honda North Zico 35000 Ford North Joe 30000 Ford North Richard 20000 Honda North Peter 31000 Ford North Joe 30000 Ford North Zico 20000 Honda North Peter 35000 Ford North Joe 25000 Ford North Richard 20000 Honda North Peter 23000 Ford ....

Same for region South and Others. What i need is like this: Region vendorname amount carname top20inregion North Joe 30000 Ford Yes North Richard 20000 Honda No North Peter 35000 Ford No North Joe 30000 Ford Yes North Richard 20000 Honda No North Peter 31000 Ford No North Joe 30000 Ford Yes North Richard 20000 Honda No North Peter 35000 Ford No North Joe 25000 Ford Yes .... ....

Where this last variable "top20inregion" is based on number of sells (from proc freq, for example)

-----Mensagem original----- De: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] Em nome de Daniel Fernández Enviada em: segunda-feira, 28 de junho de 2010 15:09 Para: SAS-L@LISTSERV.UGA.EDU Assunto: Re: how to output partial data with bigger "n" freqs

hi,

why don´t you do, after proc freq, a proc sort: proc sort data= freq_output out= need (obs=200); by descending count; run;

You will get the bigest 200 groups.

HTH

Daniel Fernández.

'In God we trust. All others must bring data'.(Anónimo)

--- El lun, 28/6/10, goladin@gmail.com <goladin@GMAIL.COM> escribió:

De: goladin@gmail.com <goladin@GMAIL.COM> Asunto: Re: how to output partial data with bigger "n" freqs Para: SAS-L@LISTSERV.UGA.EDU Fecha: lunes, 28 de junio, 2010 15:18

Hi,

I think you can use proc sql to achieve your needs.

Regards, Murphy Sent from my Nokia phone -----Original Message----- From: Adriano Rodrigues Sent: 28/06/2010 9:12:24 PM Subject: how to output partial data with bigger "n" freqs

Hi,

I have one dataset with 4 variables region,vendorname,amount,carname.

The dataset has 3 regions and i want output the top 20 vendors (number of occurences, not amount total) by region, to work only with them. How to do it?

Basically i need make one new variable "top20 in region?" with Yes/no. But i really dont know how to do that in faster way.

Poor and fast solution by now is proc freq, get names and make one IF name by name...but IF i need top 200 would be very bad...

Thanks in advance, Adriano


Back to: Top of message | Previous page | Main SAS-L page