Date: Fri, 13 Feb 2009 21:01:56 -0500
Reply-To: Lou <lpogoda@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Lou <lpogoda@HOTMAIL.COM>
Organization: A noiseless patient Spider
Subject: Re: Is there no other way?
Just some nits to pick, interspersed below:
> On Thu, 12 Feb 2009 18:15:57 -0500, Robbie Shan <ravi.m.shanbhag@GMAIL.COM> wrote:
>
> >Hi,
> >
> > I am a newbie to SAS and though I have had considerable experience in
> >programming with other OO languages,
SAS isn't an OO language, though some parts do make use of objects.
> >I am having some trouble trying to
> >understand how SAS works fundamentally..
> >
> > I have read quite some literature about it but am still baffled...
> >
> > From what I know, if SAS is to work on data, it needs to pull that data
> >into its native data set else it can't work on it. Is my understanding
> >correct?
No. Or maybe better, partly.
SAS needs to read a file in order to process the data in it. If your
application is such that you need to read in a line from an external file
(say, a text file), perform some operations on it, and write the result back
out to some external file (again, say, a text file), repeating until the
input file is exhausted, no SAS dataset need be created.
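A minimal sketch of that read, manipulate, write pattern, assuming hypothetical file names input.txt and output.txt and an arbitrary operation (upper-casing each line):

```sas
/* DATA _NULL_ runs a DATA step without creating a SAS dataset */
data _null_;
    infile 'input.txt' truncover;   /* hypothetical input text file  */
    file 'output.txt';              /* hypothetical output text file */
    input line $char200.;           /* read one line (up to 200 chars, an assumed limit) */
    line = upcase(line);            /* stand-in for whatever processing you need */
    put line;                       /* write the result to the output file */
run;
```

The step loops over the input file one line at a time and stops when the file is exhausted; nothing is stored in a SAS library along the way.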
If you want to use the built-in procedures, called PROCs, your data must be
in a SAS dataset. A PROC generally operates on multiple records (called
observations in SAS). For instance, PROC SORT will sort a file, but both
the input to and the output from that procedure must be in the form of a SAS
dataset.
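As a rough illustration, assuming a hypothetical space-delimited text file sales.txt holding a region code and an amount on each line, you would first read it into a SAS dataset and then sort that:

```sas
/* Read the (hypothetical) text file into a SAS dataset */
data work.sales;
    infile 'sales.txt' truncover;
    input region $ amount;          /* assumed layout: character region, numeric amount */
run;

/* PROC SORT reads one SAS dataset and writes another */
proc sort data=work.sales out=work.sales_sorted;
    by region;
run;
```

The dataset and variable names here are made up for the example; the point is only that both DATA= and OUT= refer to SAS datasets, not to the raw text file.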
So usually, it's more convenient to convert your data to SAS dataset form,
even for straightforward applications that read in, manipulate, and write
out. But it's not always necessary.
> > The reason I am asking this is because, if I were to work on millions of
> >records, I cannot think of importing them into my data set while I run
> >some analysis on it. This for some reason seems very inefficient to me.
Indeed you can't - you import them **before** you run your analyses, not
while.
> > Ideally, I should be able to run some logic against the data (say a
> >warehouse) and then post it back into the warehouse without having to
> >store it on my computer!
If your data are in a data warehouse, you've apparently imported them to
your warehouse without seeing anything inefficient about that. If the
warehouse is, say, based on ORACLE, you could do your processing using
ORACLE, and bypass the step of converting the data to a form SAS
understands. Conversely, if your processing is going to be done in SAS,
maybe your warehouse should be in SAS to start with.
From a pure efficiency standpoint, whether you're temporarily storing
millions of records on your desktop or not is almost beside the point. If
you're processing millions of records that are residing in some DBMS,
pulling all that data to your machine over the network and pushing it back
out to the DBMS, again over the network, can incur significant
overhead. You'd possibly be better off pushing the code (what, a few dozen
or even a few hundred lines) over the network to the DBMS and letting the
processing take place there. If you want to use SAS to do that, take a look
at the documentation for the SQL Pass-Through Facility.
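A sketch of what that might look like, assuming an ORACLE warehouse, a licensed SAS/ACCESS Interface to ORACLE, and placeholder values for the connection details, table, and column names (all hypothetical):

```sas
proc sql;
    /* Connection values are placeholders, not real credentials */
    connect to oracle (user=myuser password=mypass path=mydb);

    /* The statement inside EXECUTE is sent to ORACLE as-is and runs there;
       no rows travel over the network to the SAS session */
    execute (
        update sales
           set amount = amount * 1.1
         where region = 'EAST'
    ) by oracle;

    disconnect from oracle;
quit;
```

Only the SQL text crosses the network; the data stay in the DBMS. The statement inside EXECUTE has to be valid in the DBMS's own SQL dialect, since SAS passes it through without interpreting it.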