Date: Mon, 22 Nov 1999 17:51:19 -0800
Reply-To: "Berryhill, Tim" <TWB2@PGE.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Berryhill, Tim" <TWB2@PGE.COM>
Subject: Re: multiple observation records
Content-Type: text/plain
"best" is not well defined, especially on SAS-L.
If the data is small (fewer than a million records, fewer than 100
variables) and CPU is free I would:
a) read it into SAS,
b) sort by user_id and session,
c) sort NODUPKEY by user_id and session into a new version (keep only
user_id and session),
d) set the new version by user_id, explicitly writing out a record if
first.user_id and not last.user_id (that is, the user has more than one
session),
e) merge the original with the output of d by user_id, keeping only the
user_ids that appear in the output of d.
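
For what it is worth, steps a through e might look something like the
sketch below. The data set names sorted, sessions, multi and want, and the
flag in_multi, are my own placeholders, not anything from the original
question. I keep only user_id in step d so that the merge in step e cannot
touch session.

```sas
/* a) read it into SAS (data copied from the question) */
data test;
  input user_id session sequence pagename $;
  cards;
1 11 29 a
1 11 34 b
1 11 45 c
1 11 50 d
2 22 13 a
2 22 16 b
2 221 55 c
2 221 67 b
2 221 80 a
2 221 90 c
3 331 12 a
3 331 15 b
3 332 1 c
3 332 3 b
3 332 5 a
3 332 9 b
3 333 4 b
3 333 5 c
;
run;

/* b) sort by user_id and session */
proc sort data=test out=sorted;
  by user_id session;
run;

/* c) one record per user_id/session pair, keeping only the keys */
proc sort data=sorted(keep=user_id session) out=sessions nodupkey;
  by user_id session;
run;

/* d) a user whose first session record is not also the last */
/*    has at least two distinct sessions; write one record    */
data multi(keep=user_id);
  set sessions;
  by user_id;
  if first.user_id and not last.user_id then output;
run;

/* e) keep only the original records whose user_id is in multi */
data want;
  merge sorted multi(in=in_multi);
  by user_id;
  if in_multi;
run;
```

On the posted data I would expect want to hold the 14 records for
user_id 2 and 3 and none for user_id 1.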
If CPU were expensive, I would make steps a and d into views. That requires
extra testing, which requires the only resource I attempt to conserve (me).
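
An untested sketch of the view variant, assuming the raw records sit in an
external file rather than in-stream. The fileref rawdata, the file name
sessions.txt, and the names test_v, multi_v, sorted, sessions and want are
all hypothetical.

```sas
/* Hypothetical fileref; point it at wherever the raw records live */
filename rawdata 'sessions.txt';

/* a) as a view: nothing is read until the sort asks for it */
data test_v / view=test_v;
  infile rawdata;
  input user_id session sequence pagename $;
run;

/* b) and c) are unchanged, except that b reads from the view */
proc sort data=test_v out=sorted;
  by user_id session;
run;

proc sort data=sorted(keep=user_id session) out=sessions nodupkey;
  by user_id session;
run;

/* d) as a view: evaluated only when the merge reads it */
data multi_v / view=multi_v;
  set sessions;
  by user_id;
  if first.user_id and not last.user_id;
  keep user_id;
run;

/* e) merge the sorted original with the view from d */
data want;
  merge sorted multi_v(in=in_multi);
  by user_id;
  if in_multi;
run;
```

The two sorts still write real data sets; only the read and the
multiple-session filter are deferred.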
> ----------
> From: diltilia@MY-DEJA.COM[SMTP:diltilia@MY-DEJA.COM]
> Reply To: diltilia@MY-DEJA.COM
> Sent: Monday, November 22, 1999 5:34 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: multiple observation records
>
> Hi,
>
> In the following data set, what's the best way to pull out the records
> that have multiple sessions (in this case, user_id 2 and 3)? I used
> several BY steps to accomplish this, and figured that there's got to be
> an easier way?
> Please help. Many thanks
> -Diltilia
>
> data test;
> input user_id session sequence pagename $;
> cards;
> 1 11 29 a
> 1 11 34 b
> 1 11 45 c
> 1 11 50 d
> 2 22 13 a
> 2 22 16 b
> 2 221 55 c
> 2 221 67 b
> 2 221 80 a
> 2 221 90 c
> 3 331 12 a
> 3 331 15 b
> 3 332 1 c
> 3 332 3 b
> 3 332 5 a
> 3 332 9 b
> 3 333 4 b
> 3 333 5 c
> ;
>