LISTSERV at the University of Georgia
Date:         Sat, 26 Jul 1997 17:36:48 +1000
Reply-To:     Tim CHURCHES <TCHUR@DOH.HEALTH.NSW.GOV.AU>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Tim CHURCHES <TCHUR@DOH.HEALTH.NSW.GOV.AU>
Subject:      Re: PC vs. Mainframe (Performance)
Content-Type: text/plain

Michelle L Oyen wrote:
>
> I've searched with a variety of Search Engines, DejaNews, and SAS sites but
> couldn't find any references to how a PC performs compared to a Mainframe.
> I realize the answer may be very dependent upon the types of processes that
> are being run and the types of hardware, etc. but are their any general rules
> (such as the PC will never perform as well as a mainframe for large sets of
> data (define large), etc? or that the PC can perform as well as the mainframe,
> etc....)
>
> Assuming you could set up an intel PC with any Windows OS and whatever
> hardware is required would it be possible to obtain similar* performance?
>
> 'Similar' in that the programs take no more than 10 times as long (i.e. a
> 3 minute mainframe time vs. a max of 30 min PC time).
>
> FYI: The datasets the SAS program works on range in size from a few MB to
> ~600 MB (and maybe very rarely over a GB).

Historically mainframes have had superior performance because of much faster disc input/output (I/O), lots of memory (so that data could be cached or manipulated without having to access any disc drives) and faster CPUs (albeit shared with many users).

Whether these differences still apply depends on what sort of mainframe you have available. If it is a modern one (no more than a few years old) and has been upgraded, then the CPU may be a few times faster than a Pentium Pro PC and it may have many gigabytes of memory. However, the big difference will be in the disc drives, of which there will be many, each with its own disc controller and I/O path. Most mainframes also come with a full-time administrator who spends quite a lot of time working out how to distribute data and workspaces across the disc drives to maximise I/O performance.

However, PCs are beginning to rival these features. The CPU speed is not far behind most mainframes and you can have two or four CPUs in your PC if you wish. Server-class PCs can be fitted with lots of memory (1, 2 or even 4 gigabytes is possible, and 256 or 512 megabytes is now quite affordable even for a single-user workstation) and operating systems such as Windows NT or SCO UnixWare will use all that memory to good advantage. However, care needs to be taken in configuring the PC to ensure that its disc I/O performance is as good as possible, because this is where it will still trail the mainframe. The secret is to use Fast-Wide SCSI disc drives which run at 7200 or 10,000 rpm, with each disc attached to its own SCSI channel. Arrange your SAS data libraries so that you are always reading data from one disc and writing it to another. Avoid using the RAID-5 arrays available on most PCs if you want maximum performance, since they tend to attach all the discs in the array to a single SCSI channel (there are some high-end exceptions which use multiple SCSI channels for each RAID array, but they cost a lot more).

As an example, we have a dual Pentium Pro server with 512 megabytes of memory configured as I have described. Creating a 20% subset of a 700 megabyte SAS dataset (with 2 million observations) takes no more than 60 seconds. Performing summarisation or other analysis on the data as it is subsetted hardly slows things down, since the CPU is still spending a lot of time waiting for data from the disc. Smaller datasets of about 25 megabytes and 100,000 observations take about 10 seconds to subset or summarise the first time and typically 2 or 3 seconds thereafter because the entire dataset is cached in memory. Of course, it may be slower if the machine is shared with other users, but since a machine like this can be built for about $20,000, you don't need to share it with too many other people to justify the cost (unlike a mainframe).
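As a rough check on those figures, the timings can be turned into effective data rates. The following is only back-of-envelope arithmetic in Python: it treats "no more than 60 seconds" as exactly 60 seconds, takes 2.5 seconds as a stand-in for "2 or 3 seconds", and the function name is made up for illustration.

```python
def throughput_mb_per_s(megabytes: float, seconds: float) -> float:
    """Effective data rate in megabytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return megabytes / seconds

# Large dataset: 700 MB read and a 20% subset (140 MB) written in 60 s (assumed exact).
large_read = throughput_mb_per_s(700, 60)
large_write = throughput_mb_per_s(700 * 0.20, 60)

# Small dataset: 25 MB in about 10 s on the first pass,
# then about 2.5 s (assumed midpoint of "2 or 3") once cached in memory.
small_first = throughput_mb_per_s(25, 10)
small_cached = throughput_mb_per_s(25, 2.5)

for label, rate in [
    ("700 MB subset, read side", large_read),
    ("700 MB subset, write side", large_write),
    ("25 MB dataset, first pass", small_first),
    ("25 MB dataset, cached", small_cached),
]:
    print(f"{label}: {rate:.1f} MB/s")
```

On these assumptions the large subset reads at roughly 11 to 12 megabytes per second, and the cached pass over the small dataset is about four times faster than its first pass.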

Note that the same subsetting task takes about 2.5 to 3 times longer when the data is being both read from and written to the same disc drive. This is the bottleneck in most PCs, since they are configured with a single disc drive. Note that adding a second IDE disc to a PC does not help much - you really need multiple Ultra-Wide SCSI discs on independent SCSI channels (although multiple Fast ATA interface discs might be nearly as fast and a bit cheaper). The speed of the disc is important too: most PC disc drives operate at 4500 or 5400 rpm and can't deliver sequentially read data (as is typical when using SAS) as fast as the disc interface can handle. The more expensive 7200 or 10,000 rpm discs are definitely worth it.
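To put the same-disc penalty in concrete terms, the sketch below applies the 2.5 to 3 times slowdown to the 60-second subsetting example, then works out how fast a mainframe would have to be before the PC failed the "no more than 10 times as long" test from the original question. It is plain Python arithmetic; the 60-second baseline is treated as an exact figure, which is an assumption, and the function names are made up for illustration. No mainframe timing was measured here.

```python
def same_disc_seconds(separate_disc_seconds: float, slowdown: float) -> float:
    """Estimated run time when reading and writing on the same disc."""
    return separate_disc_seconds * slowdown

def fastest_tolerable_mainframe_seconds(pc_seconds: float, max_ratio: float = 10.0) -> float:
    """Shortest mainframe run time for which the PC is still within max_ratio."""
    return pc_seconds / max_ratio

baseline = 60.0  # seconds for the 700 MB subset with separate read and write discs (assumed exact)

scenarios = {
    "separate discs": baseline,
    "same disc, 2.5x slower": same_disc_seconds(baseline, 2.5),
    "same disc, 3x slower": same_disc_seconds(baseline, 3.0),
}

for label, pc_seconds in scenarios.items():
    limit = fastest_tolerable_mainframe_seconds(pc_seconds)
    print(f"{label}: PC takes {pc_seconds:.0f} s, "
          f"within 10x if the mainframe takes at least {limit:.0f} s")
```

On these assumptions the single-disc PC takes 150 to 180 seconds for the task, and would only miss the 10x target if the mainframe did the same job in under 15 to 18 seconds.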

An alternative to a fast PC would be a Unix workstation, but the consensus seems to be that you have to pay about twice as much to get the same level of performance as a PC, provided that you configure the PC correctly for use with SAS (as described above). The cost of SAS licenses is also a consideration: a single-user license for SAS running under Windows NT Workstation on a machine with no more than two CPUs is almost affordable, whereas the cost of SAS for most Unix workstations and mainframes always makes me cringe...

Don't forget to consider the cost of setting up and maintaining a PC. With a mainframe, someone else does this for you and all these costs are hidden. It can take quite a lot of time to administer a PC server if there are more than a few people using it. Security (both logical and physical) needs to be considered - Windows NT can be made very secure but it takes effort. The nice thing about a mainframe is that if something goes wrong it is someone else's problem and you can just grumble, whereas with a PC, if something goes wrong it is usually your problem...

Hope this helps,

Tim Churches

