Date: Fri, 26 May 2000 18:40:11 +0200
Reply-To: "Hellriegel, Gerhard" <Gerhard.Hellriegel@TELEKOM.DE>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Hellriegel, Gerhard" <Gerhard.Hellriegel@TELEKOM.DE>
Subject: Re: problem sorting array
Content-Type: text/plain; charset="iso-8859-1"
Hi,
you can't get the run time by just multiplying by 100! What you are using is a bubble sort, which is fine for a small number of records but not efficient for many. You should choose something like quicksort or merge sort.
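As an illustration of the merge-sort idea, here is a minimal bottom-up sketch in a data step. It is only a sketch under assumptions: the scratch array b and the names width, lo, mid, hi, l, r and take_left are mine, and I have not run it on V8 under MVS. It merges neighbouring runs of length 1, 2, 4, ... until one run covers the whole array.

```sas
data _null_;
  array a(1000) _temporary_;
  array b(1000) _temporary_;            /* scratch buffer, same size as a */
  n = dim(a);
  do i = 1 to n;
    a(i) = ranuni(12345);
  end;

  width = 1;                            /* current length of sorted runs */
  do while (width < n);
    do lo = 1 to n by 2*width;
      mid = min(lo + width, n + 1);     /* first element of right run */
      hi  = min(lo + 2*width, n + 1);   /* one past end of right run */
      l = lo;
      r = mid;
      do k = lo to hi - 1;
        /* nested tests so a(r) is never read once the right run is used up */
        take_left = 0;
        if l < mid then do;
          if r >= hi then take_left = 1;
          else if a(l) <= a(r) then take_left = 1;
        end;
        if take_left then do;
          b(k) = a(l);
          l = l + 1;
        end;
        else do;
          b(k) = a(r);
          r = r + 1;
        end;
      end;
    end;
    do i = 1 to n;                      /* copy merged runs back into a */
      a(i) = b(i);
    end;
    width = 2*width;
  end;

  sorted = 1;                           /* simple check of the result */
  do i = 2 to n;
    if a(i-1) > a(i) then sorted = 0;
  end;
  put sorted=;
run;
```

For 100000 elements both array dimensions have to be changed. The run length doubles on every pass, so that is about 17 passes over the array instead of 100000.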
If you have a look at the loops in your code, you'll see that with 10 records you have roughly 10*10 = 100 iterations.
With 1000 you have 1000*1000 = 1,000,000. With 100000 you'll have 100000*100000 = 10,000,000,000.
If you need 0.7 sec for 1,000,000 iterations, you will need about 0.7 * 10,000 = 7000 sec for 10,000,000,000.
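The same estimate can be written as a small data step. This is just the arithmetic above, assuming the run time grows with the square of the array size; the variable names are mine.

```sas
data _null_;
  n_small = 1000;                     /* size of the test array */
  t_small = 0.71;                     /* measured cpu sec for the test array */
  n_big   = 100000;                   /* size of the real array */
  ratio   = (n_big / n_small) ** 2;   /* iterations grow with n*n, not n */
  t_big   = t_small * ratio;          /* estimated cpu sec for the real array */
  put ratio= t_big=;
run;
```

So the job was probably not looping at all. It was cancelled after about a tenth of the time it needed.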
If your values are already close to sorted order, you may be able to improve on this by testing the sort order after each pass. In many cases you can then stop iterating well before the loops run out. A good test is: if one iteration of the outer loop (one pass through all the data) swaps no elements, the data is already in sorted order.
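A minimal sketch of that test, based on the code from the original question. The variables swapped and passes are my additions; passes is only there to show how many passes were really needed.

```sas
data _null_;
  array a(1000) _temporary_;
  do i = 1 to dim(a);
    a(i) = ranuni(12345);
  end;

  swapped = 1;                 /* force the first pass */
  passes  = 0;
  do while (swapped);
    swapped = 0;               /* assume sorted until a swap proves otherwise */
    do i = 2 to dim(a);
      if a(i-1) > a(i) then do;
        t = a(i-1);
        a(i-1) = a(i);
        a(i) = t;
        swapped = 1;           /* something moved, so one more pass is needed */
      end;
    end;
    passes = passes + 1;
  end;
  put passes=;
run;
```

With random data this saves little, because the number of passes stays close to dim(a). It only pays off when the data is nearly sorted to begin with.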
Alternatively, you could use PROC TRANSPOSE to turn your variables into records, let SAS sort them with PROC SORT, and then use PROC TRANSPOSE again to turn them back into variables.
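Roughly like this, as an untested sketch. The data set names wide, long and sorted and the variables x1-x10 are made up for the example. I drop _NAME_ after the first transpose because, as far as I know, PROC TRANSPOSE would otherwise use it to name the new variables, which would put every value back under its old name.

```sas
/* hypothetical input: one record holding the values in x1-x10 */
data wide;
  array x(10);
  do i = 1 to dim(x);
    x(i) = ranuni(12345);
  end;
  drop i;
run;

/* variables -> records: one row per value, stored in COL1 */
proc transpose data=wide out=long(keep=col1);
  var x1-x10;
run;

/* let SAS do the sorting */
proc sort data=long;
  by col1;
run;

/* records -> variables: x1 now holds the smallest value */
proc transpose data=long out=sorted(drop=_name_) prefix=x;
  var col1;
run;
```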
----------------------------------------------
Gerhard Hellriegel
DeTeCSM / SLM
Im Leuschnerpark 4
64347 Griesheim
Tel.: +49 6151 818 9806
Fax: +49 6151 818 9611
Mobil: 0171 2263497
e-mail: gerhard.hellriegel@telekom.de
privat: ghellrieg@t-online.de
----------------------------------------------
> -----Original Message-----
> From: bolvandubina@NETSCAPE.NET [SMTP:bolvandubina@NETSCAPE.NET]
> Sent: Friday, 26 May 2000 17:36
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: problem sorting array
>
> Hello sas-l:
>
> I run SAS V8 on MVS. I needed to sort an array in a data step and got code from our lead SAS expert programmer, which she said was supposed to be the most efficient. And indeed, when I tried it on an array with 1000 buckets it ran very fast (.71 cpu sec). But when I try it on the real array (100000 buckets) the program seems to be looping. It seems that theoretically it should have finished in 71 cpu sec, right? But I had to cancel the job after it had run for over 710 cpu sec. If the code doesn't loop with 1000 buckets, how can it loop with 100000? And besides, mustn't both DO-loops below stop when I and J both reach dim(a)?
>
> data _null_;
>   array a(1000) _temporary_;
>   do i=1 to dim(a);
>     a(i) = ranuni(12345);
>   end;
>   do j=1 to dim(a);
>     do i=2 to dim(a);
>       if a(i-1) > a(i) then
>       do;
>         t = a(i-1);
>         a(i-1) = a(i);
>         a(i) = t;
>       end;
>     end;
>   end;
> run;
>
> I appreciate any clarification of what I must be missing here.
>
> TIA, Bolvan