Date: Mon, 24 Nov 1997 11:45:53 -0500
Reply-To: Mark DeHaan <msd@INEL.GOV>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Mark DeHaan <msd@INEL.GOV>
Subject: SAS and parallel processing
Content-Type: text/plain; charset="us-ascii"
On Wednesday, November 19, 1997 10:47 AM, Bill Shannon
[SMTP:shannon@OSLER.WUSTL.EDU] wrote:
>> Is there a way to partition sas processes to run on two cpu's in a sun
>> enterprise 3000 with 2 cpu's? Is it called multithreading? How do I make
>> this happen?
>>
>> Thanks,
>> Bill
>>
>>
>> --
>> William D. Shannon, Ph.D.
>> Assistant Professor of Biostatistics in Medicine
>> Washington University School of Medicine
>> Division of General Medical Sciences
>> Campus Box 8005, 660 S. Euclid
>> St. Louis, MO 63110
>>
>> Phone: 314-454-8356
>> Fax: 314-454-5113
>> e-mail: shannon@osler.wustl.edu
>> web page: http://osler.wustl.edu/~shannon
>Your Sun may handle some load balancing of and by itself. Current
>versions of SAS do not support multi-threaded applications, though it is
>conceivable to create a job which spawns off additional processes to run
>concurrently, and waits for their results. Coordination and control of
>this is difficult.
>
>I believe the new SMMP product from SAS is designed to address this issue,
>though I'm not familiar with the details.
>
>Karsten M. Self (kmself@ix.netcom.com)
-------------------------------------------
Bill (and others interested),
I have spent considerable effort looking into this issue myself. It seems
that parallel processing is still in its infancy for off-the-shelf
applications. Understandably, getting something as complex as SAS to run
everything it does in parallel would take extensive recoding. In talking
with SAS, I found they are working quite actively toward parallel versions
of some components, but a product is still quite a ways off (probably
years).
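
In the meantime, the workaround Karsten describes above (one job that
spawns several processes and waits for their results) can be sketched
roughly as follows. This is only an illustration in Python, not SAS code:
the child commands are hypothetical stand-ins that each sum a range of
numbers, where in practice each would launch one independent batch SAS
program with whatever command line your site uses.

```python
import subprocess
import sys


def make_command(lo, hi):
    # Hypothetical stand-in for one independent batch job: the child
    # process just prints the sum of the integers in [lo, hi).
    code = f"print(sum(range({lo}, {hi})))"
    return [sys.executable, "-c", code]


def run_concurrently(commands):
    # Start every child before waiting on any of them, so they run at
    # the same time and the OS is free to place them on different CPUs.
    children = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]
    results = []
    for child in children:
        out, _ = child.communicate()  # block until this child finishes
        if child.returncode != 0:
            raise RuntimeError(f"job failed with code {child.returncode}")
        results.append(out.strip())
    return results


# Two jobs, one per CPU on a two-processor machine.
commands = [make_command(0, 1000), make_command(1000, 2000)]
partial_sums = [int(s) for s in run_concurrently(commands)]
total = sum(partial_sums)  # the parent combines the children's results
```

The hard part Karsten mentions, coordination and control, shows up even
here: the parent has to check each return code and decide what to do when
one job fails while the others succeed.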
If you are only doing queries, then SAS has a product (still in beta, I
think) called "Scalable Performance Data Server" (SPDS) that, as I
understand it, can take your query and spawn an equivalent query thread on
each of your available processors, where each thread works on a different
partition of your data set. Some back-end combining of the separate thread
results is then performed so that the outcome looks like that of a single
query. But SPDS only works on query-type tasks (including sorts) and is
most helpful for very large databases and related I/O bottlenecks.
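
The partition, query, and combine pattern described above can be sketched
like this. It is a toy illustration in Python under my own assumptions, not
how SPDS is actually implemented: the function names, the "amount" column,
and the round-robin partitioning are all made up for the example. Note also
that threads in standard Python do not speed up CPU-bound work, so this
shows only the structure of the idea, not a performance gain.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    # Deal the rows round-robin into n_parts roughly equal partitions.
    return [rows[i::n_parts] for i in range(n_parts)]


def query_partition(rows, min_amount):
    # The same query runs on every partition: filter, then sort locally.
    hits = [r for r in rows if r["amount"] >= min_amount]
    return sorted(hits, key=lambda r: r["amount"])


def parallel_query(rows, min_amount, n_workers=2):
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # One equivalent query per partition, one worker per partition.
        partials = list(
            pool.map(query_partition, parts, [min_amount] * n_workers)
        )
    # Back-end combine: merge the sorted partial results so the caller
    # sees what looks like the result of a single query.
    return list(heapq.merge(*partials, key=lambda r: r["amount"]))


# Made-up data set with unique amounts.
rows = [{"id": i, "amount": (i * 37) % 101} for i in range(100)]
result = parallel_query(rows, min_amount=50, n_workers=2)
```

The merge step is what makes this work for sorts as well as plain
selections: each partition is sorted independently and the combine step
only has to interleave the already-sorted pieces.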
There are a couple of third-party products that SAS has looked into, but
they have not seemed too promising for my purposes (intensive statistical
analyses of very large databases). Instead I have had to move more toward
the data mining approach, and I am looking more into SAS's Enterprise Miner
(now in alpha!). But still my extra processors go begging.
Hope this helps,
Mark
PS: I have some info on this from SAS that I can fax to anyone interested.
-----------------------------------------------------------------------------
Mark DeHaan Idaho National Engineering and Environmental Laboratory
Lockheed-Martin
email: msd@inel.gov
phone: 208-526-2983 fax: 208-526-5647