| Date: | Tue, 21 Dec 1999 14:01:46 +0100 |
| Reply-To: | Jim Groeneveld <J.Groeneveld@ITGROUPS.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Jim Groeneveld <J.Groeneveld@ITGROUPS.COM> |
| Subject: | Re: reading in blocks of records |
|
| Content-Type: | text/plain; charset="iso-8859-1" |
|---|
Yanming X. Xu,
I have the impression that the data, as contained in your email (see
attached below), was garbled by putting it in an email message that way. Even
displaying all of it in a non-proportional (monospace) font doesn't
produce neat columns where every item is right below the previous one. That
makes it difficult for me to check whether your INPUT statement is correct.
However, I can make some general remarks that apply anyway.
You use the # line pointer controls to specify which line to read variables
from when one case spans multiple records. Is that correct? If this is not
what you intended, then you should not use line pointer controls. If it is,
you might add N=... to the INFILE statement to indicate the (maximum) number
of records per case. But I think this isn't at all what you intend. You now
only read two values from line 5, two from line 14, again two from line 19
(the second case) and two from line 28, skipping all the rest. You probably
don't want to skip all the rest, do you?
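To illustrate how the # line pointer groups records into cases, here is a minimal sketch with made-up inline data. The dataset name demo, the variables id and code, and the data lines are hypothetical and only serve the illustration. Every observation consumes three lines, and only lines 1 and 3 of each group are read, so six data lines should give two observations.

```sas
/* Hypothetical demo: each case spans 3 lines; line 2 of every case is skipped. */
DATA demo;
  INFILE DATALINES N=3 TRUNCOVER;   /* N=3: three records available per case */
  INPUT #1 @1 id   $3.              /* read id from line 1 of the case       */
        #3 @1 code $2.;             /* read code from line 3 of the case     */
  DATALINES;
A01
skipped
x1
A02
skipped
x2
;
RUN;

PROC PRINT DATA=demo;
RUN;
```

In your own step the highest line pointer is #14, so every observation consumes 14 lines of the file in the same way.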
Your data, as presented in your email, is a mixture of different tables and
additional irrelevant information, certainly not a rectangular data file
with equally formatted records (per line or lines), which is what you need
to read it easily. I would like to divide your data into the following 4
parts, called Part1, Part2, Part3 and Part4, and add line numbers (for
clarity only, they are not really in the files). These parts actually have to
be separate files; they cannot be a single file all together. So, assuming
the following four files:
Part1
1 accout_no cd_store pemium ins_start ins_stop
2 101234567899 123456789 1,234,560 1994-01-23 1999-12-31
3 212345678910 234567891 0.13 1995-02-23 9999-12-31
4 345678912345 345678912 56.70 1999-11-21 9999-12-31
5 144567891234 456789123 7.28 1999-02-23 9999-12-31
Part2
1 curr_mm status
2 02 t
3 02 f
4 02 f
5 02 f
Part3
1 accout_no cd_store pemium ins_start ins_stop
2 001234567899 123456789 5.88 1994-01-23 1999-12-31
3 002345678910 234567891 0.27 1995-02-23 9999-12-31
4 003456789123 345678912 56.43 1999-11-21 9999-12-31
5 144567891234 456789123 8.27 1999-02-23 9999-12-31
Part4
1 curr_mm status
2 03 t
3 03 f
4 04 f
5 05 f
Now you have two times four data records (plus a header line in each file)
for which you want to read data, is that correct? If so, your code might be,
for example (untested!):
FILENAME Part1 'a:\Part1.txt';
DATA Part1;
INFILE Part1 DLM='09'x FIRSTOBS=2 TRUNCOVER; /* Are you sure the delimiters
are tabs? You don't need delimiters at all if you specify exact
positions using @ column pointers and exact informats. FIRSTOBS=2 skips the
first line with field names. I actually don't think you need TRUNCOVER
in this instance, but it does no harm. */
INPUT @1 account $12. @14 store $9.;
RUN;
FILENAME Part2 'a:\Part2.txt';
DATA Part2;
INFILE Part2 DLM='09'x FIRSTOBS=2 TRUNCOVER; * See former comments;
INPUT @1 month $2. @17 status $1.; * Status is in position 17, I think;
RUN;
FILENAME Part3 'a:\Part3.txt';
DATA Part3;
INFILE Part3 DLM='09'x FIRSTOBS=2 TRUNCOVER; * See former comments;
INPUT @1 account $12. @14 store $9.;
RUN;
FILENAME Part4 'a:\Part4.txt';
DATA Part4;
INFILE Part4 DLM='09'x FIRSTOBS=2 TRUNCOVER; * See former comments;
INPUT @1 month $2. @9 status $1.;
RUN;
DATA Part1_3;
SET Part1 Part3; * combine Part1 and Part3, add Part3 below Part1;
RUN;
DATA Part2_4;
SET Part2 Part4; * combine Part2 and Part4, add Part4 below Part2;
RUN;
DATA AllParts;
MERGE Part1_3 Part2_4; * Combine both halves, adding the second one to
the right of the first one. No use of the BY statement, thus combined in
sequence (as they are);
RUN;
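A small self-contained sketch of such a MERGE without BY, using made-up inline data: the dataset names demoA, demoB and demoAB are hypothetical, and the values are borrowed from the listings above. The observations are paired purely by their position, first with first, second with second, so both datasets must be in the same order and should have the same number of observations.

```sas
/* Hypothetical demo of a one-to-one MERGE (no BY statement). */
DATA demoA;
  INPUT account $12.;
  DATALINES;
101234567899
212345678910
;
RUN;

DATA demoB;
  INPUT month $2. +1 status $1.;
  DATALINES;
02 t
02 f
;
RUN;

DATA demoAB;
  MERGE demoA demoB;   /* observation 1 with 1, 2 with 2, and so on */
RUN;

PROC PRINT DATA=demoAB;
RUN;
```

If one dataset is shorter than the other, the remaining observations get missing values for its variables, so it is worth checking the observation counts in the LOG first.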
The workfile AllParts now contains your desired result, at least that is how
I interpret your problem. Please correct me if this is not what you mean. I
hope this helps. Other solutions would be possible. For example, using a
rather complicated data step (with conditionals) around the INPUT statement,
it might be possible to keep all raw data in one file, but I would not
recommend that.
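For completeness, a rough sketch of what such a single-file data step could look like. It rests on assumptions I cannot verify from the garbled email: that account rows start with a 12-digit number, that month rows start with a 2-digit number, and that fields are separated by blanks or tabs. The dataset names accounts, months and combined and the helper variable word1 are hypothetical. Because it splits each line into words instead of relying on column positions, it does not depend on the exact layout.

```sas
FILENAME hrsi 'a:\try hrsi.txt';   /* file name as in the original post */

DATA accounts (KEEP=account store)
     months   (KEEP=month status);
  INFILE hrsi TRUNCOVER;
  LENGTH word1 $20 account $12 store $9 month $2 status $1;
  INPUT;                                  /* load one record into _INFILE_   */
  word1 = SCAN(_INFILE_, 1, '0920'x);     /* first word; tab or blank splits */
  /* Only rows whose first word is all digits are data rows;                */
  /* titles, dashed lines, header lines and blank lines fall through.       */
  IF VERIFY(TRIM(word1), '0123456789') = 0 THEN DO;
    IF LENGTH(word1) = 12 THEN DO;        /* assumed: account row            */
      account = word1;
      store   = SCAN(_INFILE_, 2, '0920'x);
      OUTPUT accounts;
    END;
    ELSE IF LENGTH(word1) = 2 THEN DO;    /* assumed: month/status row       */
      month  = word1;
      status = SCAN(_INFILE_, 2, '0920'x);
      OUTPUT months;
    END;
  END;
RUN;

DATA combined;
  MERGE accounts months;                  /* paired by position, no BY       */
RUN;
```

The conditions that recognise the row types are exactly the fragile part: if a title line ever starts with a 12-digit or 2-digit number, it would be taken for data.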
If all this is not what you expected or intended, then please restate your
question in more detail, including what you actually want the result to be
and the result (PRINT and LOG) as produced, so that we can see what is
actually going wrong.
Regards - Jim.
--
Y. (Jim) Groeneveld, MSc IMRO TRAMARKO tel. +31 412 407 070
senior statistician, P.O. Box 1 fax. +31 412 407 080
head IT department 5350 AA BERGHEM IMRO TRAMARKO: a CRO
J.Groeneveld@ITGroups.com the Netherlands in clinical research
I wish you a merry Christmas and a happy, compatible y²°°°
"My job is to keep my computer working." - Jim Groeneveld
> -----Original Message-----
> From: Yanming X. Xu [SMTP:yxxu@HOUSEHOLD.COM]
> Sent: Monday, December 20, 1999 10:14 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: reading in blocks of records
>
> Dear SAS-L,
> Sorry to take you out of busy schedule. The following is the result I ran
> from
> query.
> 112/17 14:30 platinum report facility
> ----------------------------------------------------------------
> -----------
> accout_no cd_store pemium ins_start ins_stop
> 101234567899 123456789 1,234,560 1994-01-23 1999-12-31
> 212345678910 234567891 0.13 1995-02-23 9999-12-31
> 345678912345 345678912 56.70 1999-11-21 9999-12-31
> 144567891234 456789123 7.28 1999-02-23 9999-12-31
> 11:20:43pm platinum report facility
> -------------------------------------------------------------
> -----------
> curr_mm status
> 02 t
> 02 f
> 02 f
> 02 f
> 112/17 14:30 platinum report facility
> --------------------------------------------------------
> -----------
> accout_no cd_store pemium ins_start ins_stop
> 001234567899 123456789 5.88 1994-01-23 1999-12-31
> 002345678910 234567891 0.27 1995-02-23 9999-12-31
> 003456789123 345678912 56.43 1999-11-21 9999-12-31
> 144567891234 456789123 8.27 1999-02-23 9999-12-31
> 11:20:43pm platinum report facility
> -----------------------------------------------------------------
> -----------
> curr_mm status
> 03 t
> 03 f
> 04 f
> 05 f
> Then I tried to read into SAS. Here is my unsuccessful code:
> filename hrsi 'a:\try hrsi.txt';
> data hrsi;
> infile hrsi dlm='09'x truncover;
> input #5 @1 account $12. @14 store $9.
> #14 @1 month $2. @9 status $1.;
> /************************************************;***********
> ** I also tried: input ////@1 account $12. @14...... **
> ** ////////////@1 ...... ;
> **
> ** Didn't work too!
> **
> ************************************************************/
> run;
>
> LOG: NOTE: The infile HRSI is:
> FILENAME=a:\try hrsi.txt,
> RECFM=V,LRECL=256
>
> NOTE: 32 records were read from the infile HRSI.
> The minimum record length was 9.
> The maximum record length was 56.
> NOTE: The data set WORK.HRSI has 2 observations and 4 variables.
> NOTE: The DATA statement used 0.13 seconds.
>
> Why I'm only grting one correct obs.? Any help would be appreciated!
|