LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2011, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 8 Nov 2011 23:34:04 -0500
Reply-To:     R B <ryan.andrew.black@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         R B <ryan.andrew.black@GMAIL.COM>
Subject:      Re: Array Question
Comments: To: Fareeza Khurshed <fkhurshed@gmail.com>
In-Reply-To:  <CAL9q9Trw9yCsxoM=6WktRtOcc_KfJ9uZZZbiWorNrKU3DGdyJQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Works like a charm. Quite efficient. Thank you!

On Tue, Nov 8, 2011 at 10:58 PM, Fareeza Khurshed <fkhurshed@gmail.com>wrote:

> If a SQL solution will work the following does it in 'one step'. > > > *data* have; > > input id x; > > cards; > > 1 1 > > 1 0 > > 1 1 > > 2 2 > > 2 1 > > 2 1 > > 3 4 > > 3 4 > > 3 0 > > ; > > *run*; > > > *proc* *sql*; > > create table want as > > select id, x, sum(x)**0.5* as z > > from have > > group by id; > > *quit*; > > > > Cheers, > > Fareeza > > > > On Tue, Nov 8, 2011 at 7:42 PM, R B <ryan.andrew.black@gmail.com> wrote: > >> Sorry, Joe, but I must not be explaining myself well tonight. Your code: >> >> data want; >> set have; >> z=x*0.5; >> run; >> >> does not sum the x values associated with a particular id. Suppose we have >> 3 ids which repeat 3 times as follows: >> >> id x >> 1 1 >> 1 0 >> 1 1 >> 2 2 >> 2 1 >> 2 1 >> 3 4 >> 3 4 >> 3 0 >> >> If we sum the x-values associated with each id and then multiply by 0.5, >> we >> would have the following: >> >> z_id1 = (1 + 0 + 1)*.50 = 1 >> z_id2 = (2 + 1 + 1)*.50 = 2 >> z_id3 = (4 + 4+ 0)*.50 = 4 >> >> and the resulting dataset would be: >> >> id x z >> 1 1 1 >> 1 0 1 >> 1 1 1 >> 2 2 2 >> 2 1 2 >> 2 1 2 >> 3 4 4 >> 3 4 4 >> 3 0 4 >> >> Hope this clears it up. I have a 100 ids, so I'm looking for an efficient >> approach. >> >> Does this clear up my question? >> >> Ryan >> >> On Tue, Nov 8, 2011 at 10:28 PM, Joe Matise <snoopy369@gmail.com> wrote: >> >> > Does that constant vary by ID, or is it the same? If it's the same, >> > then it's just >> > >> > data want; >> > set have; >> > z=x*0.5; >> > run; >> > >> > If it varies by ID, there are a number of ways to do it. Most easily >> > I think is to have a separate dataset of factors: >> > >> > >> > data factors; >> > input id factor; >> > datalines; >> > 1 0.5 >> > 2 0.3 >> > 3 0.4 >> > ;;;; >> > run; >> > >> > data want; >> > merge have(in=a) factors(in=b); >> > by id; >> > if a and b; >> > z = x*factor; >> > run; >> > >> > That would do it (requires HAVE to be sorted by ID). Also could use a >> > custom format, a hash table, or a number of other solutions. If you >> > must put it in code, which is not generally recommended, you could use >> > a SELECT statement to organize it more clearly. >> > >> > -Joe >> > >> > On Tue, Nov 8, 2011 at 9:22 PM, R B <ryan.andrew.black@gmail.com> >> wrote: >> > > Thank you for responding. Let me try to clarify. >> > > >> > > I'd like the x values associated with each id to be summed and then >> > > multiplied by a constant (e.g., .50). In the resulting dataset, the z >> > value >> > > should be the same for cases associated with id=1, the z value should >> be >> > > the same for cases associated with id=2, and so forth. >> > > >> > > To be clear, I want all cases to remain in the dataset. >> > > >> > > Thank you. >> > > >> > > Ryan >> > > >> > > >> > > On Tue, Nov 8, 2011 at 10:14 PM, Joe Matise <snoopy369@gmail.com> >> wrote: >> > > >> > >> What do you actually want (in english, not equations)? Final >> dataset, >> > >> what should it actually contain? Are you trying to sum the X value >> > >> for each id and get a dataset with 1 row per ID with the sum of the X >> > >> values (or 1/2 of the x values or whatever)? >> > >> >> > >> If that's what you are trying to do, then this should work: >> > >> >> > >> proc sort data=have; >> > >> by id; >> > >> run; >> > >> >> > >> data want; >> > >> set have; >> > >> by id; >> > >> if first.id then z=0; >> > >> z+(0.5*x); >> > >> if last.id then output; >> > >> keep id z; >> > >> run; >> > >> >> > >> If something else you will need to explain. You could also use a >> proc >> > >> (PROC TABULATE, PROC MEANS/SUMMARY, etc.) to do this, although you'd >> > >> have to do the 1/2 bit separately. >> > >> >> > >> -Joe >> > >> >> > >> On Tue, Nov 8, 2011 at 8:48 PM, R B <ryan.andrew.black@gmail.com> >> > wrote: >> > >> > Dear SAS-L, >> > >> > >> > >> > Here's an illustration of my dataset: >> > >> > >> > >> > id x >> > >> > 1 3 >> > >> > 2 4 >> > >> > 3 5 >> > >> > 1 2 >> > >> > 2 4 >> > >> > 3 9 >> > >> > 1 3 >> > >> > 2 1 >> > >> > 3 2 >> > >> > >> > >> > I'd like to apply the following equation in a data step: >> > >> > >> > >> > data test; >> > >> > set test; >> > >> > if id = 1 then z = sum(x)*.50; >> > >> > if id = 2 then z = sum(x)*.50; >> > >> > if id = 3 then z = sum(x)*.50; >> > >> > run; >> > >> > >> > >> > I actually have 100 ids, and I would like to create an array (or >> > >> whatever's >> > >> > most efficient) to apply the formula above in a data step for each >> id. >> > >> > >> > >> > Any help would be appreciated. >> > >> > >> > >> > Thanks, >> > >> > >> > >> > Ryan >> > >> > >> > >> >> > > >> > >> > >


Back to: Top of message | Previous page | Main SAS-L page