Date: Wed, 8 Sep 1999 19:09:01 +0100
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter.crawford@DB.COM>
Subject: Re: Very large Number of observations- 480,000,000
Content-type: text/plain; charset=us-ascii
*****hoping this isn't a duplicate***********
I haven't a clue about the performance of alternatives to SAS
(like Oracle) for data management. I wouldn't wish to re-start
that recent discussion - see the archives...
On a specific point regarding your data volume, be aware that
SAS will be subject to a limit on data set size for your platform
= 2GB on FAT drives (and 4GB on NTFS, I believe - please confirm if relevant).
That limits each transaction to roughly 4.5 bytes of data at best (2GB / 480,000,000)!
So I think you will need to organise subsets and joins.
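For a quick sanity check on that arithmetic, here's a throwaway data step (2GB is the FAT file-size limit mentioned above; the observation count is from your post):

```sas
/* Back-of-envelope check: bytes available per observation
   if all 480,000,000 observations had to fit in one 2GB file. */
data _null_;
  limit = 2 * 1024**3;       /* 2GB file-size limit, in bytes   */
  nobs  = 480000000;         /* observation count from the post */
  bpo   = limit / nobs;      /* works out to around 4.5 bytes   */
  put 'Bytes per observation: ' bpo 8.2;
run;
```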
You're going to need some fast I/O for *any* platform.
/half-plug for SAS
I think you don't need a general-purpose data manager
like Oracle, but a high-performance disk channel.
Keep that data on separate drive(s) from the op sys and SAS system
/full-plug for SAS
you'll need SAS
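As a minimal sketch of the subset-and-join approach - all table and variable names below are invented for illustration; your six variables will differ:

```sas
/* Hypothetical example: join one physical chunk of the
   transactions to a smaller reference table with PROC SQL.
   Names (trans.part1, ref.accounts, etc.) are made up.  */
proc sql;
  create table work.chunk1 as
  select t.acct_id,
         t.tran_date,
         t.amount,
         a.region
  from   trans.part1 as t          /* one ~100,000,000-row subset */
         left join ref.accounts as a
         on t.acct_id = a.acct_id;
quit;
```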
good luck and let us know how it goes
Date: 08.09.99 18:23
Reply to: email@example.com
Subject: Very large Number of observations - 480,000,000
I have a dataset consisting of 6 variables and (potentially) 480,000,000
observations (basically transactions). This has been normalised.

Aside from the obvious point of cutting the data set into chunks, say of
100,000,000 observations each, I just thought I would ask if there are any
show stoppers that I ought to be aware of. So I would be most happy if
someone were to say - ah no no no, you can't do that - before I do the
data munging.
In the longer term it would be sensible to transfer the data to a 'real'
database such as Oracle. In the short run I will be using SAS as the data
storage mechanism and using proc sql to extract subsets and join with
other smaller tables (the biggest of which is about 300,000 rows).
SAS is version 6.12, on WinNT 4, Pentium 450, with 256MB RAM.