Date: Wed, 9 Nov 2011 07:50:46 -0500
Reply-To: "Bian, Haikuo" <HBian@FLQIO.SDPS.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Bian, Haikuo" <HBian@FLQIO.SDPS.ORG>
Subject: Re: Array Question
In-Reply-To: <CAMMWLD2yumyLr0LDDsgZ3ZxWBt3A+LBt=SsNWip060nhVqrELg@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"
Datastep can also does it in 'one step':)
data have;
infile cards;
input id x;
cards;
1 1
1 0
1 1
2 2
2 1
2 1
3 4
3 4
3 0
;
data want;
do until (last.id);
set have;
by id;
if first.id then z=0;
z+x*.5;
end;
do until (last.id);
set have;
by id;
output;
end;
drop x;
run;
proc print;run;
Regards,
Haikuo
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of R B
Sent: Tuesday, November 08, 2011 11:34 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Array Question
Works like a charm. Quite efficient. Thank you!
On Tue, Nov 8, 2011 at 10:58 PM, Fareeza Khurshed <fkhurshed@gmail.com>wrote:
> If a SQL solution will work the following does it in 'one step'.
>
>
> *data* have;
>
> input id x;
>
> cards;
>
> 1 1
>
> 1 0
>
> 1 1
>
> 2 2
>
> 2 1
>
> 2 1
>
> 3 4
>
> 3 4
>
> 3 0
>
> ;
>
> *run*;
>
>
> *proc* *sql*;
>
> create table want as
>
> select id, x, sum(x)**0.5* as z
>
> from have
>
> group by id;
>
> *quit*;
>
>
>
> Cheers,
>
> Fareeza
>
>
>
> On Tue, Nov 8, 2011 at 7:42 PM, R B <ryan.andrew.black@gmail.com> wrote:
>
>> Sorry, Joe, but I must not be explaining myself well tonight. Your code:
>>
>> data want;
>> set have;
>> z=x*0.5;
>> run;
>>
>> does not sum the x values associated with a particular id. Suppose we have
>> 3 ids which repeat 3 times as follows:
>>
>> id x
>> 1 1
>> 1 0
>> 1 1
>> 2 2
>> 2 1
>> 2 1
>> 3 4
>> 3 4
>> 3 0
>>
>> If we sum the x-values associated with each id and then multiply by 0.5,
>> we
>> would have the following:
>>
>> z_id1 = (1 + 0 + 1)*.50 = 1
>> z_id2 = (2 + 1 + 1)*.50 = 2
>> z_id3 = (4 + 4+ 0)*.50 = 4
>>
>> and the resulting dataset would be:
>>
>> id x z
>> 1 1 1
>> 1 0 1
>> 1 1 1
>> 2 2 2
>> 2 1 2
>> 2 1 2
>> 3 4 4
>> 3 4 4
>> 3 0 4
>>
>> Hope this clears it up. I have a 100 ids, so I'm looking for an efficient
>> approach.
>>
>> Does this clear up my question?
>>
>> Ryan
>>
>> On Tue, Nov 8, 2011 at 10:28 PM, Joe Matise <snoopy369@gmail.com> wrote:
>>
>> > Does that constant vary by ID, or is it the same? If it's the same,
>> > then it's just
>> >
>> > data want;
>> > set have;
>> > z=x*0.5;
>> > run;
>> >
>> > If it varies by ID, there are a number of ways to do it. Most easily
>> > I think is to have a separate dataset of factors:
>> >
>> >
>> > data factors;
>> > input id factor;
>> > datalines;
>> > 1 0.5
>> > 2 0.3
>> > 3 0.4
>> > ;;;;
>> > run;
>> >
>> > data want;
>> > merge have(in=a) factors(in=b);
>> > by id;
>> > if a and b;
>> > z = x*factor;
>> > run;
>> >
>> > That would do it (requires HAVE to be sorted by ID). Also could use a
>> > custom format, a hash table, or a number of other solutions. If you
>> > must put it in code, which is not generally recommended, you could use
>> > a SELECT statement to organize it more clearly.
>> >
>> > -Joe
>> >
>> > On Tue, Nov 8, 2011 at 9:22 PM, R B <ryan.andrew.black@gmail.com>
>> wrote:
>> > > Thank you for responding. Let me try to clarify.
>> > >
>> > > I'd like the x values associated with each id to be summed and then
>> > > multiplied by a constant (e.g., .50). In the resulting dataset, the z
>> > value
>> > > should be the same for cases associated with id=1, the z value should
>> be
>> > > the same for cases associated with id=2, and so forth.
>> > >
>> > > To be clear, I want all cases to remain in the dataset.
>> > >
>> > > Thank you.
>> > >
>> > > Ryan
>> > >
>> > >
>> > > On Tue, Nov 8, 2011 at 10:14 PM, Joe Matise <snoopy369@gmail.com>
>> wrote:
>> > >
>> > >> What do you actually want (in english, not equations)? Final
>> dataset,
>> > >> what should it actually contain? Are you trying to sum the X value
>> > >> for each id and get a dataset with 1 row per ID with the sum of the X
>> > >> values (or 1/2 of the x values or whatever)?
>> > >>
>> > >> If that's what you are trying to do, then this should work:
>> > >>
>> > >> proc sort data=have;
>> > >> by id;
>> > >> run;
>> > >>
>> > >> data want;
>> > >> set have;
>> > >> by id;
>> > >> if first.id then z=0;
>> > >> z+(0.5*x);
>> > >> if last.id then output;
>> > >> keep id z;
>> > >> run;
>> > >>
>> > >> If something else you will need to explain. You could also use a
>> proc
>> > >> (PROC TABULATE, PROC MEANS/SUMMARY, etc.) to do this, although you'd
>> > >> have to do the 1/2 bit separately.
>> > >>
>> > >> -Joe
>> > >>
>> > >> On Tue, Nov 8, 2011 at 8:48 PM, R B <ryan.andrew.black@gmail.com>
>> > wrote:
>> > >> > Dear SAS-L,
>> > >> >
>> > >> > Here's an illustration of my dataset:
>> > >> >
>> > >> > id x
>> > >> > 1 3
>> > >> > 2 4
>> > >> > 3 5
>> > >> > 1 2
>> > >> > 2 4
>> > >> > 3 9
>> > >> > 1 3
>> > >> > 2 1
>> > >> > 3 2
>> > >> >
>> > >> > I'd like to apply the following equation in a data step:
>> > >> >
>> > >> > data test;
>> > >> > set test;
>> > >> > if id = 1 then z = sum(x)*.50;
>> > >> > if id = 2 then z = sum(x)*.50;
>> > >> > if id = 3 then z = sum(x)*.50;
>> > >> > run;
>> > >> >
>> > >> > I actually have 100 ids, and I would like to create an array (or
>> > >> whatever's
>> > >> > most efficient) to apply the formula above in a data step for each
>> id.
>> > >> >
>> > >> > Any help would be appreciated.
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Ryan
>> > >> >
>> > >>
>> > >
>> >
>>
>
>
-----------------------------------------
Email messages cannot be guaranteed to be secure or error-free as
transmitted information can be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
Centers for Medicare & Medicaid Services therefore does not accept
liability for any error or omissions in the contents of this
message, which arise as a result of email transmission.
CONFIDENTIALITY NOTICE: This communication, including any
attachments, may contain confidential information and is intended
only for the individual or entity to which it is addressed. Any
review, dissemination, or copying of this communication by anyone
other than the intended recipient is strictly prohibited. If you
are not the intended recipient, please contact the sender by reply
email and delete and destroy all copies of the original message.
|