LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2000, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 28 Apr 2000 17:56:40 GMT
Reply-To:     sashole@mediaone.net
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject:      Re: selecting values into a set of macro variables where the
              number of distinct values is unknown
Comments: To: Richard.M.Walker@BRITISHAIRWAYS.COM
Content-Type: text/plain; format=flowed

Richard,

As Bernard has already pointed out, SQL only enables one to create a concatenated list of distinct values separated by a specified character (Bernard likes the comma, but another character might be used if the column values contain commas themselves; I would say, the zero hex '00'x would be a safer bet). Although SQL seems to be the perfect tool for the task - for it makes both the unduplication and concatenation seamless - there are a couple of drawbacks. First, the method will apparently fail if the length of the concatenated string exceeds the limit imposed on the length of a macro variable (say 32767). Second, a macro is needed to parse the SQL output. The DATA step code below solves the problem in a single step assuming that both the table the column are called A, the column is character and LE 8 bytes long. If it is GT 8, increase the declared array item length accordingly.

%let mvprefix = m;

data _null_; array h (0:35023) $8 _temporary_; retain h null '00'x; set a end=eof; do x=mod(input(a,pib6.),35023) by 1 until (h(x)=null or h(x)=a); if x > 35023 then x = 0; end; if h(x) = null then h(x) = a; if eof then do x=0 to 35023; if h(x) = null then continue; n ++ 1; call symput ("&mvprefix"||left(put(n,6.)),h(x)); end; run;

This will create a set of macro variables m1, m2,... populated with the distinct values of A. As you can see, I have already cannimalized on the discovery made yesterday by Jim Groeneveld to initialize the temporary array to low values without looping through it in the same RETAIN statement where the comparison null variable is valued. If nothing else, it saves full 3 lines of code! The size of the array H (which, as everyone has undoubtedly already guessed, is nothing else but the most primitive, i.e. linearly probed, hash table) is determined _not_ by the size of the file A, but by the number of distinct values of the variable A. Since I assume that in practice, nobody would venture to create more than 30,000 such macro variables for whatever purpose, I have chosen the prime upper bound to be slightly larger. If my assumption is wrong, then memory permitting, you can make the array as large as you wish; the only requirement being that the upper bound is prime. Determining the lowest prime greater than a specified number by hand may be tedious ;-), and the corresponding prime table may not be readily available, so it is best to let SAS do the job, too:

*** finding first prime GT smth ***; %let smth = 35000; data _null_; prime = &smth; do until (j = u + 1); prime ++ 1; u = ceil(sqrt(prime)); do j=2 to u until (mod(prime,j) eq 0); end; end; put prime=; run;

Of course, PRIME can be stored in a macro variable to size the array dynamically. The only remaining question is, what if the variable A is numeric? In this case, the code should be slightly modified, yet in actuality it becomes simpler and faster:

data _null_; array h (0:35023) _temporary_; set a end=eof; do x=mod(a,35023) by 1 until (h(x)=. or h(x)=a); if x > 35023 then x = 0; end; if h(x) = . then h(x) = a; if eof then do x=0 to 35023; if h(x) = . then continue; n ++ 1; call symput ("&mvprefix"||left(put(n,6.)),put(h(x),best.)); end; run;

Kind regards, ===================== Paul M. Dorfman Jacksonville, Fl =====================

>From: "Walker, Richard M" <Richard.M.Walker@BRITISHAIRWAYS.COM> >Reply-To: "Walker, Richard M" <Richard.M.Walker@BRITISHAIRWAYS.COM> >To: SAS-L@LISTSERV.UGA.EDU >Subject: selecting values into a set of macro variables where the number > of distinct values is unknown >Date: Fri, 28 Apr 2000 13:11:15 Z > >---------------------- Forwarded by Richard M Walker/HEATHROW/BRITISH >AIRWAYS/GB on 28/04/2000 14:08 --------------------------- > > Richard M Walker > 28/04/2000 12:25:42 > > >Tel: 0208-56-28248 >Mobile: 0385-934634 >OceanWave SAS Developer >TBC 10 South Wing Bay 4B - S120 > >To: sas-l >cc: >bcc: >Subject: selecting values into a set of macro variables where the number of >distinct values is unknown > >Help, > >I recall seeing an e-mail sometime ago that indicated that using proc sql >it is possible to select all the distinct values of a column into a set of >macro variables. > >Can anyone show me an example of this technique? > >Regards >Richard

________________________________________________________________________ Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com


Back to: Top of message | Previous page | Main SAS-L page