LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (June 2007, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 27 Jun 2007 17:32:45 -0400
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:      Re: Creating a unique dataset

On Mon, 25 Jun 2007 17:46:13 +0530, N, Raghupathi <raghupathi.n@GENPACT.COM> wrote:

> > >Hi all, > > > >I have a data set which consists of Patient_id, quarter and >corresponding events. > > > >Patient_id > >Quarter > >Event_1 > >Event_2 > >Event_3 > >Event_4 > >Event_5 > >21 > >1 > >1

[snip]

>0 > > > > > >My problem is I have to recreate a dataset which need patient per >quarter per events ie. All the events come under one row. > > > >Patient_id > >Quarter > >Event_1 > >Event_2 > >Event_3 > >Event_4 > >Event_5 > >21 > >1 > >1 > >1.2 > >1.6 or 2.8 > >2.4 > >1.5 or 3 >

[snip]

> >3 > >2.5 or 1.1 > >1 > > > >Here or implies any one of the event not (both or sum or division or >aggregate or subtraction) > > > >Thanks in advance, > > > >Raghupathi

First, take a look at your post http://tinyurl.com/36lb3u and notice that the example is a mess, with all headings and values folded into a single column. I suspect that is why the question has received no attention.

A better presentation of the given data set:

data have; input

ID Q E_1 E_2 E_3 E_4 E_5 ; cards; 21 1 1 0 2.8 2.4 3 21 1 0 1.2 1.6 0 1.5 21 2 0 2.5 0 0 1.6 32 1 1 0 3 2.5 1 32 1 0 2.4 0 1.1 0

;

What is needed:

ID Q E_1 E_2 E_3 E_4 E_5

21 1 1 1.2 1.6 or 2.8 2.4 1.5 or 3 21 2 0 2.5 0 0 1.6 32 1 1 2.4 3 2.5 or 1.1 1

The task can be broken into steps. First, generate line numbers so that the observations can be referenced individually:

data numbered; num + 1; set have; run;

Then, normalize:

proc transpose data=numbered out=long; by num id q; var e_ : ; run;

Now sort:

proc sort data=long out=ordered(drop = num); by id q _name_ num; run;

The sort order presumes that the values connected by the word "or" should appesr in the order of their original appearance. Another possibility is to have them appear in order of magnitude by using COL! as the most minor sort key. The example has instances of each, so I'm not sure.

Now combine values which must share a cell:

data rollup(drop = col1); do until (last._name_); set ordered; by id q _name_; length rollup $ 12; if col1 then rollup = catx ( ' or ' , rollup , col1 ); end; rollup = coalescec ( rollup , '0' ); run;

Finally, de-normalize back to a wide structure>

proc transpose data=rollup out=need(drop = _name_); by id q; var rollup; run;

Output:

ID Q E_1 E_2 E_3 E_4 E_5

21 1 1 1.2 2.8 or 1.6 2.4 3 or 1.5 21 2 0 2.5 0 0 1.6 32 1 1 2.4 3 2.5 or 1.1 1


Back to: Top of message | Previous page | Main SAS-L page