| Date: | Mon, 23 Jun 2008 14:20:50 -0700 |
| Reply-To: | "Duell, Bob" <BD9439@ATT.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "Duell, Bob" <BD9439@ATT.COM> |
| Subject: | Re: Reading custom flat file Problem |
| In-Reply-To: | A<fd44bf65-ab30-4e2a-895d-bfef52f31c4c@f63g2000hsf.googlegroups.com> |
| Content-Type: | text/plain; charset="US-ASCII" |
Although there are lots of wizards and warlocks out there that try to
figure out how to turn external files into SAS datasets, I think it's
best to learn how to do it using plain old SAS code. The discipline
required to understand the source data from the beginning will be well
worth it.
Here's a sample SAS program to read your file using LIST input:
data b;
   /* Declare each variable once: informat (which also sets the length
      of character variables) and label */
   attrib PATIENT_ID informat=$8.  label='Patient ID';
   attrib NAME       informat=$20. label='Patient Name';
   attrib SEX        informat=$1.  label='Gender';
   attrib AGE        informat=3.   label='Patient Age';
   attrib STATE      informat=$2.  label='State Name';
   attrib COUNTRY    informat=$3.  label='Country';

   /* FIRSTOBS=2 skips the header line; END= names the last-record flag */
   infile 'c:\temp\patients.csv' firstobs=2 delimiter=','
          end=eof truncover;

   input @;   /* Load the next record and hold it so EOF can be tested */
   if not eof then
      input PATIENT_ID NAME SEX AGE STATE COUNTRY;
   else do;
      /* Last line is the trailer (REC,4): log it, do not output it */
      input rec_label $ record_count;
      put rec_label= record_count=;
      delete;
   end;
   drop rec_label record_count;
run;
Notice it is very much customized to what you describe (data starts on
record two and the last line is really not data). The END= option sets
the named variable (EOF here) to 1 when the current record is the last
line in the file. The "INPUT @;" doesn't assign any variables; it just
loads the next record and holds it, so EOF is already set when the IF
statement decides how to read the line.
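Since the last line carries a record count, the same END= test could
also be used to check that count against the number of rows actually
read. What follows is only a sketch under that assumption, not something
tested against your real file; the counter name N_READ is made up for
illustration, and the file path is the one from the program above.

```sas
data b;
   attrib PATIENT_ID informat=$8.  label='Patient ID';
   attrib NAME       informat=$20. label='Patient Name';
   attrib SEX        informat=$1.  label='Gender';
   attrib AGE        informat=3.   label='Patient Age';
   attrib STATE      informat=$2.  label='State Name';
   attrib COUNTRY    informat=$3.  label='Country';

   infile 'c:\temp\patients.csv' firstobs=2 delimiter=','
          end=eof truncover;

   input @;                     /* load and hold the record, sets EOF */
   if not eof then do;
      input PATIENT_ID NAME SEX AGE STATE COUNTRY;
      n_read + 1;               /* sum statement: retained, starts at 0
                                   (N_READ is a hypothetical name) */
   end;
   else do;
      input rec_label $ record_count;   /* trailer line, e.g. REC,4 */
      if record_count ne n_read then
         put 'WARNING: trailer count ' record_count
             'does not match rows read ' n_read;
      delete;                   /* the trailer is not a patient row */
   end;
   drop rec_label record_count n_read;
run;
```

If the real file can contain empty fields, the DSD option on INFILE is
probably worth considering as well, because without it consecutive
commas are treated as a single delimiter.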
I like to use the ATTRIB statement in all programs like this. It's a
bit more typing, but it self-documents the program.
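One way to see that payoff, assuming the DATA step above has already
run and created data set B in the WORK library, is to list the
variable attributes. This is just a quick check, not part of the
reading logic:

```sas
/* List each variable in WORK.B with its type, length, informat and label */
proc contents data=b;
run;
```

The labels and informats given in the ATTRIB statements should show up
in that listing next to each variable name.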
Good luck,
Bob
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Gobjuka
Sent: Monday, June 23, 2008 12:01 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Reading custom flat file Problem
I have a csv delimited flat file as follows
The first row contains the header information and the last row
contains the record count.
PATIENT_ID,NAME,SEX,AGE,STATE,COUNTRY
A2934990,Sam J,M,45,MI,US
A2934991,Mark Fisher,M,29,CA,US
A2934992,Robert M,M,64,OR,US
A2934993,Linda K,F,21,MN,US
REC,4
The header defines the order of data.
1) I have looked at examples of how to read flat files but was not sure
as how do you define the input variables (header dependent).
2) How to track the last record on the input file ??
Any pointers or directions will be helpful.
Thank you in advance,
Gobjuka