LISTSERV at the University of Georgia
Date:         Fri, 29 Jan 2010 08:48:48 -0700
Reply-To:     Jon K Peck <peck@us.ibm.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Jon K Peck <peck@us.ibm.com>
Subject:      Re: Using a Ram Disk
Comments: To: David Futrell <dfutrell62@yahoo.com>
In-Reply-To:  <901231.90078.qm@web53904.mail.re2.yahoo.com>

The first thing to consider is that Windows itself already caches a lot in memory. It tends to use unallocated memory for extra I/O buffers and to keep loaded modules in memory as long as that memory isn't needed for something else. So I doubt that there are many situations where a RAM disk would help. A RAM disk partitions off some of the memory specifically for file contents, but the tradeoff is that this memory is no longer available for other purposes, so it might induce more paging to disk in other areas.

I haven't used a RAM disk myself in quite a few years, but my guess is that it is no help in most situations.

There are two things that might help. First, a 64-bit OS with 64-bit SPSS would allow addressing more memory and, with more physical memory, could speed up processing.

Second, a rewrite of the job using Python programmability could probably eliminate most of the data passes and build the hierarchical relationships much more efficiently. 40,000 cases is not a lot of data, so it's likely that all the hierarchy traversals could be built in memory.
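
As a rough illustration of the in-memory approach, here is a minimal sketch in plain Python rather than the SPSS programmability API. The function name `build_reports`, the (employee, supervisor) pair layout, and the sample IDs are all assumptions made for the example, not the actual employee file.

```python
from collections import defaultdict

def build_reports(pairs):
    """Map each supervisor to the set of everyone below them.

    pairs: iterable of (employee_id, supervisor_id) tuples;
    supervisor_id is None for the person at the top.
    (Hypothetical layout, chosen for illustration.)
    """
    # One pass over the data: supervisor -> direct reports.
    children = defaultdict(list)
    for emp, sup in pairs:
        if sup is not None:
            children[sup].append(emp)

    # For each supervisor, walk the tree below them entirely in memory.
    reports = {}
    for sup in list(children):
        seen = set()
        stack = list(children[sup])
        while stack:
            emp = stack.pop()
            if emp in seen or emp == sup:
                continue  # guard against cycles in dirty data
            seen.add(emp)
            stack.extend(children.get(emp, ()))
        reports[sup] = seen
    return reports

# Made-up sample data: one (employee, supervisor) pair per record.
sample = [
    ("ceo", None),
    ("vp1", "ceo"), ("vp2", "ceo"),
    ("mgr1", "vp1"),
    ("emp1", "mgr1"), ("emp2", "mgr1"),
    ("emp3", "vp2"),
]
hierarchy = build_reports(sample)
print(sorted(hierarchy["vp1"]))
```

The input is read once and no interim files are written. In a real job the pairs would come from the active dataset, and restricting the output to supervisors above a certain level would be an extra filter on the resulting dictionary.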

So the choice is between throwing more hardware at the problem and throwing more programming resources at it.

HTH,
Jon Peck
SPSS, an IBM Company
peck@us.ibm.com
312-651-3435

From: David Futrell <dfutrell62@yahoo.com>
To: SPSSX-L@LISTSERV.UGA.EDU
Date: 01/29/2010 07:44 AM
Subject: [SPSSX-L] Using a Ram Disk
Sent by: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>

Fellow Listers:

I have a syntax file that I run frequently that reads an employee database (40,000 records), looks at the employee-to-supervisor relationship, and ultimately creates a hierarchy such that the resulting file contains every supervisor above a certain level in the organization, everyone who reports to him, and anyone else below him in the organization.

The program works fine, but it's a brute-force method that requires re-reading the same file hundreds of times and creating many hundreds of interim files along the way. It takes about 25 minutes to run.

I thought that I could speed this up significantly by using a RAM disk, so I purchased a software package to do this and modified the syntax so that all of the files being read and written were on the RAM drive.

Unfortunately, although the RAM disk appears to work fine, it doesn't really speed up the processing much and it looks like SPSS is still accessing the hard drive frequently during the program execution.

I'd like some advice from someone who understands the inner workings of SPSS on how I might solve this problem and get the program to do all this work without accessing the hard drive.

Thanks,

David Futrell Workforce Research Consultant Eli Lilly and Company

