LISTSERV at the University of Georgia
Date:   Tue, 25 May 1999 16:14:34 +0100
Reply-To:   Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Sender:   "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:   Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Subject:   Re: How to write to many different data sets? (A clarification)
In-Reply-To:   <19990525132628.35250.qmail@hotmail.com>

Avie Lester <avie_lester@HOTMAIL.COM> writes
>>OK, before I saw Ian Whitlock's solution I was expecting to create
>>something similar.
>>
>>But you have further needs and so have improved requirements description
>>a little..... well enough for this gremlin of an idea
>>
>>Why have those separate datasets anyway?
>>
>>Just have one.
>>When you want to use it just put the year into a where clause instead of
>>the name !
>>
>>It sounds like simpler admin.
>>     trade-off
>>        set year&year ....
>>     against
>>        set allthat( where year="&year" ) ......
>>
>>The second form will "bite back", only seldom (e.g. where with obs=)
>>
>>Otherwise,
>>you will need to know what years will arise before reading the input !
>>The names on the DATA statement are required at compile time .... before
>>any data in that step is input.
>
>Hi Peter, here is why there are separate data sets. I know there
>is a size limit on data sets. A week's worth of data consumes ~150,000KB. If
>the same data set is continually appended to,
>then eventually that limit will be reached. For this reason,
>it is safer to split the data by day. (I used year instead of
>day in the example.)
>
>I would prefer one huge data set, but this could potentially
>cause problems down the road... Avie
>

This helps. I still suspect that the number of separate data sets to create may become a significant overhead. As I understand it, the output is expected to append to the day/week/whatever data set, and the original data is external. A data step recreates every data set named on its DATA statement (unless you use the MODIFY statement to append and update in place), so it is not the natural tool for adding rows to existing daily files. Appending to daily files is less of a challenge than writing directly to a new file for each day while reading the external data.

I'd suggest that when converting the original external data, you create one work data set that includes the segregating variable (year/day). Then, with logic similar to Ian's demo, append to the daily files as far as necessary, applying a where clause to select the relevant day, e.g.

proc append base=day&day data=orig_all(where=(day=&day)) force; run;
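
To repeat that append for every day present in the work data set, something along these lines might do. It is only a sketch, not Ian's code: the macro name append_days and the macro variable days are made up for illustration, and it assumes day is a numeric variable in work.orig_all whose values are valid as data set name suffixes.

```sas
/* Sketch only: append_days and &days are hypothetical names.      */
/* Assumes WORK.ORIG_ALL exists and DAY is numeric.                 */

/* Collect the distinct day values into one macro variable. */
proc sql noprint;
  select distinct day into :days separated by ' '
    from work.orig_all;
quit;

%macro append_days(days);
  %local i day;
  %let i = 1;
  %let day = %scan(&days, &i, %str( ));
  %do %while (%length(&day) > 0);
    /* One append per day; the WHERE= option picks out that day's rows. */
    proc append base=day&day
                data=work.orig_all(where=(day=&day))
                force;
    run;
    %let i = %eval(&i + 1);
    %let day = %scan(&days, &i, %str( ));
  %end;
%mend append_days;

%append_days(&days)
```

PROC APPEND creates the BASE= data set if it does not exist yet, so the first load of a new day needs no special handling.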

If data volumes permit, creating an index on day while reading the original external data should enable more rapid where clause subsetting for those PROC APPEND steps.
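
A minimal sketch of building that index in the same step that reads the external data. The fileref rawin, its path, and the INPUT layout are placeholders, since the real file layout was not posted.

```sas
/* Sketch only: the fileref, path, and INPUT layout are hypothetical. */
filename rawin 'path-to-external-file';

/* INDEX= builds a simple index on DAY as the data set is written. */
data work.orig_all(index=(day));
  infile rawin;
  input day value;
run;
```

Whether the index pays for itself depends on how many days the file holds; with only a handful of distinct days, a sequential pass per append may be just as quick.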

-- Peter Crawford (_knowledge_ is a poor substitute for *real* experience, but they make a great team)

