LISTSERV at the University of Georgia
Date:         Tue, 11 Mar 2008 06:33:39 -0700
Reply-To:     Duh_OZ <ozzy.kopec@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Duh_OZ <ozzy.kopec@GMAIL.COM>
Organization: http://groups.google.com
Subject:      Noobie here - EOF (hex 1A) Q.  (long post)
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset=ISO-8859-1

Had the Programming I & II courses, but their data and real-world data are two different things :0)

I wrote a simple program to read in multiple files and output a FREQ. The gist is that a file may have a hex 1A (EOF) appended to it (if the file has been edited). These are simple ASCII flat files; each line ends with hex 0D0A (CR/LF), which presents no problem. The problem is the hex 1A, which is read in as a record that SAS then tries to process.

I ended up using the missover option and not releasing the record if unit = missing (true when it reads in the EOF).
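One possible alternative to filtering on a missing unit, as a rough sketch only: hold each record with a trailing @, look at the raw buffer, and drop the record if it starts with hex 1A. This assumes the stray EOF byte always arrives as its own short record (the log reports a minimum record length of 1 for the affected files) and that no real data line begins with that byte. TRUNCOVER is swapped in here for MISSOVER as a matter of preference, not because the original code requires it.

```sas
filename in_put 'UNIT?.DAT';   /* same fileref as in the code below */

data work.out_file;
   infile in_put lrecl=15 truncover;
   input @;                            /* load the next record and hold it     */
   if _infile_ =: '1A'x then delete;   /* assumption: stray Ctrl-Z record only */
   input @5 unit 4.;                   /* read unit from the held record       */
run;
```

With this approach a genuinely missing unit on a real data line would still reach the output data set, whereas the "if unit NE ." filter drops it along with the EOF records.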

Code, output and log below. Am I missing something? I thought that when SAS hit the 1A it stopped processing the file.

============

Code:

options nocenter;

filename in_put 'UNIT?.DAT';

* This data step will read in the hex 1A and part of the first line for the second file (UNITB.DAT);
data work.out_file;
  infile in_put lrecl=15;
  input @5 unit 4.;
run;

Title 'Counts by unit 1';

proc freq data = work.out_file;
  tables unit /missing;
run;

* Adding missover solves the 'overread' problem but the EOF's (hex 1A) are read in and counted as missing;
data work.out_file;
  infile in_put lrecl=15 missover;
  input @5 unit 4.;
run;

Title 'Counts by unit 2';

proc freq data = work.out_file;
  tables unit /missing;
run;

* Even though unit can never be missing, it seems to be set when the EOF is read in;
data work.out_file;
  infile in_put lrecl=15 missover;
  input @5 unit 4.;
  if unit NE .;
run;

Title 'Counts by unit 3';

proc freq data = work.out_file;
  tables unit /missing;
run;

==============

Output:

Counts by unit 1

The FREQ Procedure

                                Cumulative    Cumulative
unit    Frequency     Percent     Frequency      Percent
---------------------------------------------------------
 111           1        8.33             1         8.33
3333           3       25.00             4        33.33
4444           3       25.00             7        58.33
5555           5       41.67            12       100.00

Counts by unit 2

The FREQ Procedure

                                Cumulative    Cumulative
unit    Frequency     Percent     Frequency      Percent
---------------------------------------------------------
   .           2       14.29             2        14.29
3333           3       21.43             5        35.71
4444           4       28.57             9        64.29
5555           5       35.71            14       100.00

Counts by unit 3

The FREQ Procedure

                                Cumulative    Cumulative
unit    Frequency     Percent     Frequency      Percent
---------------------------------------------------------
3333           3       25.00             3        25.00
4444           4       33.33             7        58.33
5555           5       41.67            12       100.00

==========

log:

NOTE: This session is executing on the Linux 2.6.9-67.ELsmp platform.

NOTE: SAS 9.1.3 Service Pack 3

You are running SAS 9. Some SAS 8 files will be automatically converted by the V9 engine; others are incompatible. Please see http://support.sas.com/rnd/migration/planning/platform/64bit.html

PROC MIGRATE will preserve current SAS file attributes and is recommended for converting all your SAS libraries from any SAS 8 release to SAS 9. For details and examples, please see http://support.sas.com/rnd/migration/index.html

This message is contained in the SAS news file, and is presented upon initialization. Edit the file "news" in the "misc/base" directory to display site-specific news and information in the program log. The command line option "-nonews" will prevent this display.

NOTE: SAS initialization used:
      real time           0.01 seconds
      cpu time            0.02 seconds

1    options nocenter;
2
3    filename in_put 'UNIT?.DAT';
4
5    * This data step will read in the hex 1A and part of the first line for the second file (UNITB.DAT);
6    data work.out_file;
7    infile in_put lrecl=15;
8    input @5 unit 4.;
9
10   run;

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITA.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=49

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITB.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=64

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITC.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=81

2                          The SAS System          08:12 Tuesday, March 11, 2008

NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 15.
      The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.OUT_FILE has 12 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

11
12   Title 'Counts by unit 1';
13
14   proc freq data = work.out_file;
15   tables unit /missing;
16
17   run;

NOTE: There were 12 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

18
19   * Adding missover solves the 'overread' problem but the EOF's (hex 1A) are read in and counted as missing;
20   data work.out_file;
21   infile in_put lrecl=15 missover;
22   input @5 unit 4.;
23
24   run;

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITA.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=49

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITB.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=64

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITC.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=81

3                          The SAS System          08:12 Tuesday, March 11, 2008

NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 15.
      The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: The data set WORK.OUT_FILE has 14 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

25
26   Title 'Counts by unit 2';
27
28   proc freq data = work.out_file;
29   tables unit /missing;
30
31   run;

NOTE: There were 14 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

32
33   * Even though unit can never be missing, it seems to be set when the EOF is read in;
34   data work.out_file;
35   infile in_put lrecl=15 missover;
36   input @5 unit 4.;
37   if unit NE .;
38
39   run;

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITA.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=49

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITB.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=64

NOTE: The infile IN_PUT is:
      File Name=/home/ozzy/UNITC.DAT,
      File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
      Group Name=users,Access Permission=rw-rw-r--,
      File Size (bytes)=81

4                          The SAS System          08:12 Tuesday, March 11, 2008

NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
      The minimum record length was 15.
      The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
      The minimum record length was 1.
      The maximum record length was 15.
NOTE: The data set WORK.OUT_FILE has 12 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

40
41   Title 'Counts by unit 3';
42
43   proc freq data = work.out_file;
44   tables unit /missing;
45
46   run;

NOTE: There were 12 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 3.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

47

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           0.04 seconds
      cpu time            0.04 seconds

