LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2011, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 31 Mar 2011 07:06:34 -0400
Reply-To:     "Kirby, Ted" <ted.kirby@LEWIN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Kirby, Ted" <ted.kirby@LEWIN.COM>
Subject:      Re: Replacing Double Tabs with <tab> <space. <tab>
Comments: To: Nat Wooding <nathani@VERIZON.NET>
In-Reply-To:  A<4CE1D9E650D14720A60FE036A98929EF@D1871RB1>
Content-Type: text/plain; charset="us-ascii"

I am using 9.1.3

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat Wooding Sent: Thursday, March 31, 2011 6:20 AM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: Replacing Double Tabs with <tab> <space. <tab>

Ted

Are you using SAS 9.2 or 9.1.3 or, horrors, V8?

If you have 9.2, you could read the file directly using the Excel Libname engine and avoid these problems. The Excel sheet would appear to SAS as if it were a SAS data set.

Nat Wooding

-----Original Message-----

From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Kirby, Ted

Sent: Thursday, March 31, 2011 6:11 AM

To: SAS-L@LISTSERV.UGA.EDU

Subject: Replacing Double Tabs with <tab> <space. <tab>

I have a text file that is generated by creating a tab-delimited file from an Excel(r) worksheet. The cells of this worksheet can contain paragraphs of text, complete with periods, commas, and even some special characters such as asterisks. To create a SAS dataset from this

Excel(r) worksheet, I use the _INFILE_ variable and the SCAN function as

follows:

data UploadChk;

infile InData length=RLength lrecl=32767 truncover firstobs=2;

input ;

Line = _infile_;

Award_Number=scan(line,1,'09'x);

Award_Description=scan(line,2,'09'x);

Project_Name_or_Program_Title=scan(line,3,'09'x);

Project_Description=scan(line,4,'09'x);

Time_Stamp_of_Last_Update=scan(line,5,'09'x);

run;

However, since the scan function treats consecutive delimiters as a single delimiter and there may be empty columns, I thought I could use CALL SCAN and Perl(r) Regular Expressions to change occurrences of consecutive tab characters to a tab character, a space, and a tab character as needed as follows:

%let NumCols=5;

data UploadChk;

if _n_=1 then do;

retain dbltab_re;

dbltab_re = prxparse('s/\t\t/\t \t/');

end;

infile InData length=RLength lrecl=32767 truncover firstobs=2;

input ;

Line = _infile_;

/* When there are less than 5 variables in the observation, then CALL SCAN

will return zeros for the position and length parameters. This is the

"signal" that we need to correct the double-delimiter problem. */

call scan(Line,&NumCols.,position,length,'09'x);

if position=0 and length=0 then do;

call prxchange(dbltab_re, -1, line);

end;

Award_Number=scan(line,1,'09'x);

Award_Description=scan(line,2,'09'x);

Project_Name_or_Program_Title=scan(line,3,'09'x);

Project_Description=scan(line,4,'09'x);

Time_Stamp_of_Last_Update=scan(line,5,'09'x);

run;

The problem I have is that the code above replaces the double tabs with "\t \t" string literal, not <tab> <space> <tab> as I thought it would.

Thus it appears that "\t" can *find* tabs, but is not the correct code to *write* tabs.

This is my first foray into Perl(r) and CALL PRXCHANGE. The basis for my code was the code sample in Example 2 of the "Pattern Matching Using SAS Regular Expressions (RX) and Perl Regular Expressions (PRX)" section of the SAS documentation accessed from the Help Menu in SAS 9.1.3.

Obviously, I have much to learn.

Kwperl kwprxparse kwprxchange kwscan kw_infile_

************* IMPORTANT - PLEASE READ ********************

This e-mail, including attachments, may include confidential and/or proprietary information, and may be used only by the person or entity to which it is addressed. If the reader of this e-mail is not the intended recipient or his or her authorized agent, the reader is hereby notified that any dissemination, distribution or copying of this e-mail is prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and delete this e-mail immediately.

************* IMPORTANT - PLEASE READ ********************

This e-mail, including attachments, may include confidential and/or proprietary information,

and may be used only by the person or entity to which it is addressed. If the reader of this

e-mail is not the intended recipient or his or her authorized agent, the reader is hereby

notified that any dissemination, distribution or copying of this e-mail is prohibited. If you

have received this e-mail in error, please notify the sender by replying to this message

and delete this e-mail immediately.


Back to: Top of message | Previous page | Main SAS-L page