LISTSERV at the University of Georgia
Date:         Thu, 17 Jan 2008 15:55:46 -0600
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: Splitting up a variable
Comments: To: "J. Andrel" <jocelyn.andrel@MAIL.JCI.TJU.EDU>
Content-Type: text/plain; charset="iso-8859-1"

One idea I have is to use two files as output, with the first file containing your id and illness, and the second file containing your id and illness date, like this:

file 'c:\file1.txt'; put id ';' illness;

file 'c:\file2.txt'; put id ';' illnessdate;

Then have a program that reads them in separately as delimited files (using the semicolon, as you indicated, as the delimiter):

data set1; /* delimited file info */

input id illness1-illness20 $;

and in a different data set:

data set2; /* delimited file info */

input id illnessdate1-illnessdate20;

Then have a third data set that merges them back together by id:

data finalset; merge set1 set2; by id;
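
Put together, the steps above might look roughly like the sketch below. Several things here are assumptions rather than anything from the original data: the input data set name `visits`, the limit of 20 visits per id, the 40-character length for each illness, and the `mmddyy8.` informat for the dates. Adjust them to fit the real file.

```sas
/* Step 1: write each string to its own semicolon-delimited file.   */
/* VISITS is a hypothetical name for the original data set.         */
data _null_;
    set visits;
    file 'c:\file1.txt' lrecl=32767;
    put id +(-1) ';' illness;        /* +(-1) removes the blank after id */
    file 'c:\file2.txt' lrecl=32767;
    put id +(-1) ';' illnessdate;
run;

/* Step 2: read the illnesses back in, one variable per visit.      */
data set1;
    infile 'c:\file1.txt' dlm=';' dsd missover lrecl=32767;
    length illness1-illness20 $ 40;  /* assumed maximum length */
    input id illness1-illness20;
run;

/* Step 3: read the dates back in the same way.                     */
data set2;
    infile 'c:\file2.txt' dlm=';' dsd missover lrecl=32767;
    informat illnessdate1-illnessdate20 mmddyy8.;
    format illnessdate1-illnessdate20 mmddyy8.;
    input id illnessdate1-illnessdate20;
run;

/* Step 4: merge the two pieces back together by id.                */
proc sort data=set1; by id; run;
proc sort data=set2; by id; run;

data finalset;
    merge set1 set2;
    by id;
run;
```

MISSOVER leaves the unused variables missing for ids with fewer than 20 visits (or none at all), and each file is read in a single pass rather than once per id.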

-Mary

----- Original Message -----
From: J. Andrel
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, January 17, 2008 3:24 PM
Subject: Splitting up a variable

Hi,

I have a dataset with 3 fields - an ID, a string containing every nurse office visit for the year, and a string containing all the dates for each visit. It looks something like this:

ID  ILLNESS              ILLNESSDATE
1   SCRAPE;BRUISE;OTHER  01/01/08;01/02/08;01/15/08
2
3   ASTHMA               12/31/08

etc.

I need to split those ILLNESS and ILLNESSDATE strings into ILLNESS1, ILLNESS2, etc. Some records might not have any illnesses, some have only one, and some have a whole mess of them. I already have code that successfully accomplishes this task, which I'll post below. However, it's currently taking approximately forever to run. I set it to run yesterday at noon, and it's still going, looping through 190,000 records.

My question is this: Is there a faster way to do this? I have at least 3 other files with the same number of records that have variables that need to be split apart in the same way, and I was hoping to have it done within this decade. Any suggestions would be appreciated.

Thanks, Jocelyn

%let totalrec = 190000;

%macro readsemi();
  /* Split &target on semicolons into macro variables target1, target2, ... */
  %let _itarget_ = 1;
  %do %while(%quote(%scan(%quote(&target),&_itarget_,%str(;))) ne);
    %let target&_itarget_ = %quote(%scan(%quote(&target),&_itarget_,%str(;)));
    %let target&_itarget_ = %left(&&target&_itarget_);
    %let _itarget_ = %eval(&_itarget_ + 1);
  %end;
  %let _itarget_ = %eval(&_itarget_ - 1);

  /* Build a one-record data set holding the pieces for the current id */
  data y;
    file print;
    %do i = 1 %to &_itarget_;
      id = &id;
      target = "&target";
      itarget = &_itarget_;
      target&i = "&&target&i";
    %end;
  run;
%mend readsemi;

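%macro illness(datain=, dataout=, var=);
  %do id = 1 %to &totalrec;
    %let target = %quote();

    /* Pull the string for the current id into the macro variable target */
    data x;
      set &datain;
      if id = &id;
      call symput('target', &var);
    run;

    %readsemi;

    /* Append this id's record to the output data set */
    data &dataout;
      set &dataout y;
    run;
  %end;
%mend illness;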

