LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (February 2009, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 2 Feb 2009 17:26:24 -0600
Reply-To:     Joe Matise <snoopy369@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Joe Matise <snoopy369@GMAIL.COM>
Subject:      Re: Help with summarizing data
In-Reply-To:  <200902022314.n12LPq01023822@malibu.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1

Ah, and here I was proud of myself that I knew how to do a group summary function without looking it up ... :) Never knew that UNION dropped nonunique rows, obviously something very important to know. Thanks Howard and Akshaya!

-Joe

On Mon, Feb 2, 2009 at 5:14 PM, Howard Schreier <hs AT dc-sug DOT org> < schreier.junk.mail@gmail.com> wrote:

> On Mon, 2 Feb 2009 15:24:22 -0600, Joe Matise <snoopy369@GMAIL.COM> wrote: > > >Just for reference, if you wanted to use SQL this would do it (and matches > >PROC SUMMARY results): > > > > > >proc sql; > >create table allqs_sum_sql as > > select sum(a), sum(b), sum(c) from > > ((select * from q1) > >union > > (select * from q2) > >union > > (select * from q3) > >union > > (select * from q4)) > > group by state; > >quit; > > > >The 'group by' makes sure that each state is summed properly. It does not > >give you the counts as neatly as proc summary/means would, though you > could > >ask for count(a) I think if you want that information. > > > >-Joe > > Include STATE in the SELECT. > > There are also a couple of subtle issues which could cause incorrect > results > with no other symptoms. (1) If it happens that a state has the same values > for A, B, and C in two quarters, the UNION operator will discard one of > those rows; the ALL option can prevent this. (2) If the columns are not in > the same order in all of the tables, the sums will not properly segregate > the values; use the CORRESPONDING option to prevent that. > > > > >On Mon, Feb 2, 2009 at 3:22 PM, Kathleen Santos <flowerpotcat@gmail.com > >wrote: > > > >> Yeah, I kinda hit send before I finished typing. > >> Thanks. That's a really good suggestion. > >> > >> On Mon, Feb 2, 2009 at 3:19 PM, Joe Matise <snoopy369@gmail.com> wrote: > >> > >>> I think the PROC SQL got missed. > >>> > >>> If you set them together: > >>> > >>> data allQs; > >>> set q1 q2 q3 q4; > >>> run; > >>> > >>> perhaps with a quarter identifier if that is interesting to you > >>> > >>> and then use PROC SUMMARY or PROC MEANS, with state as the class, it > will > >>> take into account only the present records and not count the missing > records > >>> against it. > >>> > >>> For example: > >>> data q1; > >>> q=1; > >>> do state = 1 to 51; > >>> a = 100*round(ranuni(0),0.001); > >>> b = 100*round(ranuni(0),0.001); > >>> c = 100*round(ranuni(0),0.001); > >>> output; > >>> end; > >>> run; > >>> > >>> data q2; > >>> q=2; > >>> do state = 1 to 51; > >>> a = 100*round(ranuni(0),0.001); > >>> b = 100*round(ranuni(0),0.001); > >>> c = 100*round(ranuni(0),0.001); > >>> output; > >>> end; > >>> run; > >>> > >>> data q3; > >>> q=3; > >>> do state = 1 to 50; > >>> a = 100*round(ranuni(0),0.001); > >>> b = 100*round(ranuni(0),0.001); > >>> c = 100*round(ranuni(0),0.001); > >>> output; > >>> end; > >>> run; > >>> data q4; > >>> q=4; > >>> do state = 13 to 29; > >>> a = 100*round(ranuni(0),0.001); > >>> b = 100*round(ranuni(0),0.001); > >>> c = 100*round(ranuni(0),0.001); > >>> output; > >>> end; > >>> run; > >>> > >>> data allQs; > >>> set q1 q2 q3 q4; > >>> run; > >>> > >>> proc summary data=allqs noprint nway; > >>> class state; > >>> var a b c; > >>> output out=allqs_sum > >>> sum(a)=sum_a sum(b)=sum_b sum(c)=sum_c; > >>> run; > >>> > >>> > >>> and then allqs_sum has the sum for each state over each variable. If > you > >>> have more than a few variables, look into procedurally generating code > with > >>> PROC SQL select into :, or use the : operator and /autoname. > >>> > >>> -Joe > >>> > >>> On Mon, Feb 2, 2009 at 3:04 PM, Kathleen Santos < > flowerpotcat@gmail.com > >>> > wrote: > >>> > >>>> Hi, > >>>> > >>>> I need help summarizing my data. > >>>> > >>>> I have four data sets,say Q1, Q2, Q3, and Q4. These four data sets > have > >>>> the > >>>> same variables, say A B and C. The cases are states, so one record per > >>>> state. However, not all states are present in all four. In Q1 and Q2, > the > >>>> 51 > >>>> states are present. In Q3, 50 states are present, but in Q4, only 17 > >>>> states > >>>> are in the data set. > >>>> > >>>> What I would like to do is to sum A (then B, then C) across all > datasets > >>>> per > >>>> state. So I will have this kind of data set: > >>>> > >>>> State New variable > >>>> State1 A_q1+A_q2+A_q3+A_q4 > >>>> State2 A_q1+A_q2+A_q3+A_q4 > >>>> | > >>>> State17 A_q1+A_q2+A_q3+A_q4 > >>>> State18 A_q1+A_q2+A_q3 > >>>> | > >>>> State50 A_q1+A_q2+A_q3 > >>>> State51 A_q1+A_q2 > >>>> > >>>> I tried to do this using PROC SQL using this code: > >>>> > >>>> > >>>> * > >>>> > >>>> * > >>>> > >>> > >>> > >> >


Back to: Top of message | Previous page | Main SAS-L page