I've had the Programming I & II courses, but their data and real-world data
are two different things :0)
I wrote a simple program to read in multiple files and output a
FREQ. The gist is that a file may have a hex 1A (EOF) appended to it
(if the file has been edited). These are simple ASCII flat files; each
line ends with a hex 0D0A (CR/LF), which presents no problem. The problem
is the hex 1A, which is read in as a record that SAS then tries to process.
I ended up using the MISSOVER option and not outputting the record if
unit is missing (true when it reads in the EOF).
Code, output and log below. Am I missing something? I thought that when
SAS hit the 1A it stopped processing the file.
============
Code:
options nocenter;
filename in_put 'UNIT?.DAT';
* This data step will read in the hex 1A and part of the first line
for the second file (UNITB.DAT);
data work.out_file;
infile in_put lrecl=15;
input @5 unit 4.;
run;
Title 'Counts by unit 1';
proc freq data = work.out_file;
tables unit /missing;
run;
* Adding missover solves the 'overread' problem but the EOF's (hex 1A)
are read in and counted as missing;
data work.out_file;
infile in_put lrecl=15 missover;
input @5 unit 4.;
run;
Title 'Counts by unit 2';
proc freq data = work.out_file;
tables unit /missing;
run;
* Even though unit can never be missing, it seems to be set when the
EOF is read in;
data work.out_file;
infile in_put lrecl=15 missover;
input @5 unit 4.;
if unit NE .;
run;
Title 'Counts by unit 3';
proc freq data = work.out_file;
tables unit /missing;
run;
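One possible variation on the third step, as a rough sketch only: hold the
record with a trailing @, look at the automatic _INFILE_ variable, and delete
the record when it begins with hex 1A, so the filter keys on the stray byte
itself rather than on unit being missing. This assumes the 1A always arrives
as its own short final record (the minimum record length of 1 in the log
suggests it does); the title text is just a placeholder.

```sas
/* Sketch: drop the record that holds only the hex 1A byte */
data work.out_file;
infile in_put lrecl=15 missover;
input @;                           /* load the record and hold it */
if _infile_ =: '1A'x then delete;  /* skip a record that starts with hex 1A */
input @5 unit 4.;                  /* read unit from the held record */
run;
Title 'Counts by unit 4';
proc freq data = work.out_file;
tables unit /missing;
run;
```

Unlike the "if unit NE .;" test, this would leave a genuinely blank or
non-numeric unit in the counts, so a bad data line would still show up as
missing.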
==============
Output:
Counts by unit 1
The FREQ Procedure
                                 Cumulative    Cumulative
unit    Frequency     Percent     Frequency       Percent
---------------------------------------------------------
 111            1        8.33             1          8.33
3333            3       25.00             4         33.33
4444            3       25.00             7         58.33
5555            5       41.67            12        100.00
Counts by unit 2
The FREQ Procedure
                                 Cumulative    Cumulative
unit    Frequency     Percent     Frequency       Percent
---------------------------------------------------------
   .            2       14.29             2         14.29
3333            3       21.43             5         35.71
4444            4       28.57             9         64.29
5555            5       35.71            14        100.00
Counts by unit 3
The FREQ Procedure
                                 Cumulative    Cumulative
unit    Frequency     Percent     Frequency       Percent
---------------------------------------------------------
3333            3       25.00             3         25.00
4444            4       33.33             7         58.33
5555            5       41.67            12        100.00
==========
log:
NOTE: This session is executing on the Linux 2.6.9-67.ELsmp platform.
NOTE: SAS 9.1.3 Service Pack 3
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
NOTE: SAS initialization used:
real time 0.01 seconds
cpu time 0.02 seconds
1 options nocenter;
2
3 filename in_put 'UNIT?.DAT';
4
5 * This data step will read in the hex 1A and part of the
first line for the second file (UNITB.DAT);
6 data work.out_file;
7 infile in_put lrecl=15;
8 input @5 unit 4.;
9
10 run;
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITA.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=49
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITB.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=64
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITC.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=81
2 The SAS System 08:12 Tuesday, March 11, 2008
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 15.
The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: SAS went to a new line when INPUT statement reached past the end
of a line.
NOTE: The data set WORK.OUT_FILE has 12 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
11
12 Title 'Counts by unit 1';
13
14 proc freq data = work.out_file;
15 tables unit /missing;
16
17 run;
NOTE: There were 12 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 1.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
18
19 * Adding missover solves the 'overread' problem but the
EOF's (hex 1A) are read in and counted as missing;
20 data work.out_file;
21 infile in_put lrecl=15 missover;
22 input @5 unit 4.;
23
24 run;
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITA.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=49
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITB.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=64
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITC.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=81
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 15.
The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: The data set WORK.OUT_FILE has 14 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
25
26 Title 'Counts by unit 2';
27
28 proc freq data = work.out_file;
29 tables unit /missing;
30
31 run;
NOTE: There were 14 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 2.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
32
33 * Even though unit can never be missing, it seems to be set
when the EOF is read in;
34 data work.out_file;
35 infile in_put lrecl=15 missover;
36 input @5 unit 4.;
37 if unit NE .;
38
39 run;
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITA.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=49
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITB.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=64
NOTE: The infile IN_PUT is:
File Name=/home/ozzy/UNITC.DAT,
File List=/home/ozzy/UNIT?.DAT,Owner Name=ozzy,
Group Name=users,Access Permission=rw-rw-r--,
File Size (bytes)=81
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: 4 records were read from the infile IN_PUT.
The minimum record length was 15.
The maximum record length was 15.
NOTE: 6 records were read from the infile IN_PUT.
The minimum record length was 1.
The maximum record length was 15.
NOTE: The data set WORK.OUT_FILE has 12 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
40
41 Title 'Counts by unit 3';
42
43 proc freq data = work.out_file;
44 tables unit /missing;
45
46 run;
NOTE: There were 12 observations read from the data set WORK.OUT_FILE.
NOTE: The PROCEDURE FREQ printed page 3.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
47
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
real time 0.04 seconds
cpu time 0.04 seconds