Date: Sun, 24 Jan 2010 22:23:51 -0800
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: Is Regression Using Proc IML Faster?
In-Reply-To: <FEE685C811A7E44AAD17E47B2A966E29BC40F6F290@KITE.wharton.upenn.edu>
Mark,
As you were explaining your general analytic framework, I
was thinking, too, about holding the necessary data in memory
via SASFILE. That is an underutilized feature of SAS, IMO.
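As a minimal sketch of that idea (the dataset WORK.MYDATA and the
variables y, x1 and x2 are hypothetical names, not taken from the
original question), something like the following should load the
data into memory once and then run several PROCs against it:

```sas
/* Hypothetical dataset and variable names, for illustration only */
sasfile work.mydata load;      /* read the whole dataset into memory once */

proc reg data=work.mydata;     /* this PROC reads the in-memory copy */
   model y = x1 x2;
run;
quit;

proc corr data=work.mydata;    /* so does any later PROC */
   var y x1 x2;
run;

sasfile work.mydata close;     /* release the memory when done */
```

This only helps if the dataset fits in available memory.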
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
--- On Sun, 1/24/10, Keintz, H. Mark <mkeintz@WHARTON.UPENN.EDU> wrote:
> From: Keintz, H. Mark <mkeintz@WHARTON.UPENN.EDU>
> Subject: Re: Is Regression Using Proc IML Faster?
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Sunday, January 24, 2010, 4:11 PM
> Dale:
>
> Yes, I did exclude the time it takes to read the data
> into IML, but not the I/O time for a PROC REG, in my
> answer to this particular question.
>
> And if one is just running a single regression, and nothing
> more, then I would actually expect the PROC to be faster,
> since it wouldn't require the overhead of getting the
> entire dataset into memory.
>
> I should have explained that I was viewing IML as an
> interactive environment well-suited for data exploration,
> in which the matrix is brought into memory, and then
> subjected to a number of analyses, one (or many) of which
> would be one particular regression. In that case, the
> marginal cost of the IML regression doesn't involve
> (re-)reading the dataset from disk, but running a PROC
> would, which is why I infer that IML would be faster. And
> no, I don't imagine that IML would be any quicker in
> inverting an SSCP matrix.
>
> Once one has done the exploratory work, then yes by all
> means run the stat PROCs which probably have a bit more
> accuracy, some additional test statistics, and a better
> report format for the results.
>
> Of course, now that I think more carefully about it, one
> would presumably get the same disk input savings by holding
> the dataset in memory via the SASFILE statement, and
> running a suite of PROCs against it.
>
> Thanks for your comments. I was likely reading more into
> the OP's question than was intended.
>
> Regards,
> Mark
>
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Dale McLerran
> > Sent: Sunday, January 24, 2010 3:35 PM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Re: Is Regression Using Proc IML Faster?
> >
> > Mark,
> >
> > You would exclude the time that it takes to read the data from
> > disk into an IML matrix as part of the time that is required
> > for performing the regression using IML? Why? Certainly,
> > when you look at the CPU and total time summaries that are
> > produced by IML, those times would include the time that it
> > takes to read the data.
> >
> > I really doubt that fitting regression models employing IML
> > would save much time. I imagine that SAS has optimized many
> > aspects of fitting a regression model. Some of those
> > optimizations improve efficiency; others improve the
> > accuracy of the regression results.
> >
> > From my perspective, I would not fit simple regression models
> > using IML in an effort to shave time from fitting the
> > regression. If one wants to study the equations which are
> > used to fit a regression, then using IML has value. But
> > for production work in fitting simple regression models,
> > I would not use IML.
> >
> > Dale
> >
> > --- On Sun, 1/24/10, Keintz, H. Mark <mkeintz@WHARTON.UPENN.EDU> wrote:
> >
> > > From: Keintz, H. Mark <mkeintz@WHARTON.UPENN.EDU>
> > > Subject: Re: Is Regression Using Proc IML Faster?
> > > To: SAS-L@LISTSERV.UGA.EDU
> > > Date: Sunday, January 24, 2010, 12:00 PM
> > > Ceteris paribus, IML regression SHOULD be a bit faster,
> > > since the data are already in memory. But I doubt this
> > > advantage would hold up with a large dataset, or when
> > > being run on a server on which your program is competing
> > > for memory and other resources.
> > >
> > > Regards,
> > > Mark
> > >
>