Date: Tue, 13 Mar 2012 17:45:52 -0400
Reply-To: Tom Abernathy <tom.abernathy@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Tom Abernathy <tom.abernathy@GMAIL.COM>
Subject: Re: Infile FORTRAN Generated Data Files...
The columns looked fixed in the sample you posted.
Why not just tell SAS to read only the columns you need? Then it can ignore
the junk on the end of the last line.
input y1 2. y2 7. y3 8. (y4-y10) (7.);
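
A fuller sketch of that idea, in case it helps. The path and FIRSTOBS=2 are taken from your post. The field widths are my reading of the ZONE/NUMR lines in your log (y1 in columns 1-2, y2 in 3-9, y3 in 10-17, then seven 7-column fields ending at column 66), so treat them as an assumption and check them against a few records. The second step is only a guess at how the ~1000 files might be read in one pass with a wildcard: it assumes every file starts with exactly one header line, and the names fname, prev and source are made up for illustration.

```sas
/* One file: formatted input stops at column 66, so whatever follows   */
/* the last value on the final record is never scanned.                */
data read;
  infile "C:\temp\A1106301.DAT" firstobs=2 truncover;
  input y1 2. y2 7. y3 8. (y4-y10) (7.);   /* widths are an assumption */
run;

/* Many files: a wildcard INFILE, with FILENAME= tracking the source.  */
/* As far as I know, FIRSTOBS= would skip only the first line of the   */
/* first file, so each header is dropped by watching fname change.     */
data all;
  length fname prev source $260;
  retain prev;
  drop prev;
  infile "C:\temp\*.DAT" filename=fname truncover;
  input @;                        /* load the record; fname is now current */
  if fname ne prev then do;       /* first record of a file = header line  */
    prev = fname;
    delete;
  end;
  input y1 2. y2 7. y3 8. (y4-y10) (7.);
  source = fname;                 /* FILENAME= variables are not written out */
run;
```

The record-length notes in the log (minimum 66 in your run) are a quick check that the widths add up for a given file.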
On Tue, 13 Mar 2012 14:38:36 -0700, Jordan, Lewis
<Lewis.Jordan@WEYERHAEUSER.COM> wrote:
>Great. The format "fix" didn't work. I think you all can see the logical
>flow of the data. I have 21 rows of numeric data (y1 corresponds to row
>number) and then the oddity. For some reason the two rows are getting put
>onto the same row. (y1=10 and y1=20). Simply put your cursor in front of
>"10" and hit return. Do the same for "20". Should put the data in the
>correct format...
>
>Sorry for the trouble
>
>
>
>I have to read in ~ 1000 Fortran generated text files. At the end of each
>file is a string of characters (who knows why?).
>
>The first set of code below does what I want, and when it hits the
>character string, it goes to a new line so this isn't a problem.
>
>Data in;
>infile datalines firstobs=2;
>input y1-y10;
>datalines;
>
> 1 0.0900 0.04036 0.4484 0.6977 0.2509 0.2977 0.6368 64.822 44.444
> 2 0.1350 0.06206 0.4597 0.7602 0.2633 0.3194 0.6352 63.027 44.444
> 3 0.1550 0.07821 0.5046 0.7385 0.2496 0.2955 0.6556 76.871 58.065
> 4 0.1950 0.10221 0.5241 0.7567 0.3490 0.3932 0.6485 64.752 51.282
> 5 0.1700 0.08343 0.4908 0.7964 0.1507 0.3083 0.7514 65.304 41.176
> 6 0.2500 0.14174 0.5669 0.7672 0.3123 0.3538 0.6868 78.788 64.000
> 7 0.2200 0.12166 0.5530 0.7686 0.2494 0.3027 0.6960 81.435 63.636
> 8 0.1650 0.09376 0.5683 0.7497 0.2859 0.3292 0.6722 83.507 69.697
> 9 0.0850 0.05249 0.6175 0.8023 0.3107 0.3905 0.7120 82.458 70.588
>10 0.0850 0.04646 0.5466 0.8021 0.2780 0.3406 0.7297 72.429 52.941
>11 0.0800 0.03924 0.4906 0.8179 0.2809 0.3266 0.7013 64.463 43.750
>12 0.1200 0.06710 0.5592 0.7839 0.2773 0.3326 0.7210 76.779 58.333
>13 0.2050 0.10839 0.5287 0.7421 0.2633 0.3308 0.6836 74.048 56.098
>14 0.2500 0.12503 0.5001 0.7490 0.2557 0.3006 0.6996 71.758 50.000
>15 0.1800 0.08129 0.4516 0.8193 0.2530 0.2947 0.7294 60.628 36.111
>16 0.2500 0.11150 0.4460 0.7910 0.2682 0.3278 0.6973 52.062 32.000
>17 0.2100 0.10153 0.4835 0.7992 0.2634 0.3169 0.7285 63.127 40.476
>18 0.2800 0.12426 0.4438 0.7877 0.2714 0.3286 0.6510 54.155 35.714
>19 0.2550 0.11995 0.4704 0.7877 0.2682 0.3092 0.6828 64.520 43.137
>20 0.4500 0.17355 0.3857 0.6451 0.2738 0.3436 0.5804 27.778 17.778
>21 0.4650 0.17588 0.3782 0.5904 0.3080 0.3603 0.5458 14.458 9.677
>->->->->->->->->->->->->->->->->
>;
>run;
>
>
>However, I've got to read in ~ 1000 of these files and I need to automate
>it. The problem is that when I specify the file location in an "infile"
>statement, I'm losing the very last value in the last row. So in the data
>above, I'm losing the value (y10 = 9.677). I've tried everything to get
>around this issue!! Please Help!!!
>
>
>data read;
>infile "C:\temp\A1106301.DAT" firstobs=2;
>input y1-y10;
>run;
>
>
>2478 data read;
>2479 infile " C:\temp\A1106301.DAT " firstobs=2;
>2480 input y1-y10;
>2481 run;
>
>NOTE: The infile " C:\temp\A1106301.DAT " is:
>
> Filename= C:\temp\A1106301.DAT,
> RECFM=V,LRECL=256,File Size (bytes)=1470,
> Last Modified=13Mar2012:16:04:08,
> Create Time=13Mar2012:16:03:24
>
>NOTE: Invalid data for y10 in line 22 62-109.
>RULE:    ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
>
>22  CHAR 21 0.4650 0.17588 0.3782 0.5904 0.3080 0.3603 0.5458 14.458  9.677.->->->->->->->->->->-
>    ZONE 3323233332323333323233332323333232333323233332323333233233322323330232323232323232323232
>    NUMR 2100E465000E1758800E378200E590400E308000E360300E5458014E458009E677DDEDEDEDEDEDEDEDEDEDED
> 89 >->->->->->->->->->-> 109
>y1=21 y2=0.465 y3=0.17588 y4=0.3782 y5=0.5904 y6=0.308 y7=0.3603 y8=0.5458
>y9=14.458 y10=.
>_ERROR_=1 _N_=21
>NOTE: 21 records were read from the infile "Y:\Data from Greg Leaf\WQ Data Base
>      Information\_Cavenham core X-ray\DAT\A1106301 - Copy.DAT".
> The minimum record length was 66.
> The maximum record length was 109.
>NOTE: The data set WORK.READ has 21 observations and 10 variables.
>NOTE: DATA statement used (Total process time):
> real time 4.66 seconds
> cpu time 0.03 seconds
>
>
>
>
>
>
>*****************************
>Lewis Jordan
>Weyerhaeuser:
>Southern Timberlands R&D
>Cell (Primary): 662-889-4514
>Office: 662-245-5227
>lewis.jordan@weyerhaeuser.com
>*****************************
>
>
>-----Original Message-----
>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul Dorfman
>Sent: Tuesday, March 13, 2012 12:48 PM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: conditioning on value of 114 consecutive variables
>
>I understand per your follow-up post that you want a WHERE, rather than IF,
>clause, so that it could be used to filter input to a procedure. As Mike
>said, the OF list construct cannot be used in a WHERE. However, you can
>create a view and store it in a permanent library without impacting your
>disk footprint, then reference the view in any ensuing proc instead, for
>example
>
>data xv / view = xv ;
> set x ;
> array AE ae_other_1 - ae_other_114 ;
> have_data = cmiss (of ae[*]) ne dim (ae) ;
>run ;
>
>proc print data = xv ;
> where have_data ;
>run ;
>
>You would need to create the view only once, since as X gets updated, it
>will be auto-reflected in the data the view will surface.
>
>Kind regards
>------------
>Paul Dorfman
>Jax, FL
>------------
>
>
>
>On Tue, 13 Mar 2012 03:57:55 -0400, CP Jen <plessthanpointohfive@GMAIL.COM>
>wrote:
>
>>Hi SAS-L;
>>
>>I have a data set that is one-to-one for >4000 individuals. There are
>>variables called ae_other_1 to ae_other_114.
>>
>>I would like to select observations that have data in at least one of the
>>114 ae_other variables. Some folks will have zero data in any of the 114
>>variables. In fact, many of them will not have that data. I want to
>>exclude those.
>>
>>Of course, I could use syntax like " if ae_other_1 ne . | ae_other_2 ne . |
>>etc.", but that will take forever.
>>
>>Is there a way to do this without reffing each variable individually, the
>>way we can use "ae_other_1-ae_other_114" for a proc print, means, or freq?
>>
>>Thanks,
>>
>>Jen