Date: Fri, 29 Jan 2010 08:48:48 -0700
Reply-To: Jon K Peck <peck@us.ibm.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Jon K Peck <peck@us.ibm.com>
Subject: Re: Using a Ram Disk
In-Reply-To: <901231.90078.qm@web53904.mail.re2.yahoo.com>
The first thing to consider is that Windows itself caches a lot of things
in memory already. It tends to use unallocated memory for extra i/o
buffers and to keep loaded modules around in memory as long as the memory
isn't needed for something else. So I doubt that there are many
situations where a RAM disk would help. A RAM disk would partition off
some of the memory specifically for file contents, but the tradeoff is
that that memory would not be available for other purposes, so it might
induce more paging to disk in other areas.
I haven't used a RAM disk myself in quite a few years, but my guess is that it
is no help in most situations.
There are two things that might help. First, a 64-bit OS with 64-bit SPSS
would allow addressing more memory and, with more physical memory, could
speed up processing.
Second, a rewrite of the job using Python programmability could probably
eliminate most of the data passes and build the hierarchical relationships
much more efficiently. 40,000 cases is not a lot of data, so it's likely
that all the hierarchy traversals could be built in memory.
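To give a rough idea of what building the hierarchy in memory might look like, here is a minimal sketch in plain Python. It is not the original job and it does not touch SPSS at all: the (employee, supervisor) tuples, the sample IDs, and the function names build_reports and all_subordinates are made up for illustration. In a real job the pairs would come from the active dataset in a single data pass.

```python
from collections import defaultdict


def build_reports(records):
    """Map each supervisor ID to the list of their direct reports.

    `records` is an iterable of (employee_id, supervisor_id) pairs;
    a supervisor_id of None marks the top of the organization.
    (Hypothetical layout, assumed for this sketch.)
    """
    direct = defaultdict(list)
    for emp_id, sup_id in records:
        if sup_id is not None:
            direct[sup_id].append(emp_id)
    return direct


def all_subordinates(direct, supervisor):
    """Return everyone below `supervisor`, at any depth."""
    found = []
    seen = {supervisor}
    stack = list(direct.get(supervisor, []))
    while stack:
        emp = stack.pop()
        if emp in seen:
            continue  # guards against cycles caused by bad data
        seen.add(emp)
        found.append(emp)
        # Queue this employee's own reports, if any.
        stack.extend(direct.get(emp, []))
    return found


# Made-up sample data: one pass over the records builds the whole map.
records = [
    ("ceo", None),
    ("vp1", "ceo"),
    ("vp2", "ceo"),
    ("mgr1", "vp1"),
    ("emp1", "mgr1"),
    ("emp2", "vp2"),
]
direct = build_reports(records)
hierarchy = {sup: sorted(all_subordinates(direct, sup)) for sup in list(direct)}
```

Each record is read once, and every traversal after that works on the in-memory dictionary, so no interim files are needed. Restricting the output to supervisors above a certain level would be a filter on the keys of hierarchy.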
So the tradeoff is between throwing more hardware at the problem and
throwing more programming resources at it.
HTH,
Jon Peck
SPSS, an IBM Company
peck@us.ibm.com
312-651-3435
From: David Futrell <dfutrell62@yahoo.com>
To: SPSSX-L@LISTSERV.UGA.EDU
Date: 01/29/2010 07:44 AM
Subject: [SPSSX-L] Using a Ram Disk
Sent by: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
Fellow Listers:
I have a syntax file that I run frequently that reads an employee database
(40,000 records), looks at the employee to supervisor relationship, and
ultimately creates a hierarchy such that the resulting file contains
every supervisor above a certain level in the organization and everyone
who reports to him and anyone else below him in the organization.
The program works fine, but it's a brute-force method that requires
re-reading the same file hundreds of times and creating many hundreds of
interim files along the way. It takes about 25 minutes to run.
I thought that I could speed this up significantly by using a RAM disk, so
I purchased a software package to do this and modified the syntax so that
all of the files being read and written were on the RAM drive.
Unfortunately, although the RAM disk appears to work fine, it doesn't
really speed up the processing much and it looks like SPSS is still
accessing the hard drive frequently during the program execution.
I'd like some advice from someone who understands the inner workings of
SPSS and how I might solve this problem and get the program to do all this work
without accessing the hard drive.
Thanks,
David Futrell
Workforce Research Consultant
Eli Lilly and Company