Date: Mon, 28 Oct 2002 13:42:55 -0500
Reply-To: Francis Harvey <HARVEYF1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Francis Harvey <HARVEYF1@WESTAT.COM>
Subject: Re: Bit arrays
Content-Type: text/plain; charset="iso-8859-1"
Greetings Paul,
Taking advantage of the shadow bytes in 8.2, I arrived at the code below.
Aside from the initial 104-byte price tag that every array pays, I seem to
lose at most 7 bytes with any particular array. For a really large bit array,
this would appear to have an advantage over losing 8 to 11 bits for every
member of a numeric array. I will also try using one large character variable
as the array, and recode this example with your numeric arrays to compare
speed, but I wondered whether you had any initial impressions of this method.
%let arrBits = 833;
%let arrBytes = %eval((&arrBits - 1) / 8 + 1);
%let mult = %sysfunc(ceil(%sysfunc(sign(&arrBytes - 104)) / 2));
%let arrElements = %eval(((((&arrBytes - 105) / 8) + 1) * 8 * &mult) + 1);
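/* Worked values for arrBits = 833, done by hand (my arithmetic, not a log): */
/*   arrBytes    = (833 - 1) / 8 + 1             = 104 + 1   = 105           */
/*   mult        = ceil(sign(105 - 104) / 2)     = ceil(0.5) = 1             */
/*   arrElements = ((105 - 105) / 8 + 1) * 8 * 1 + 1         = 9             */
/* %eval divides as integers, so (833 - 1) / 8 is exactly 104. If arrBytes   */
/* were 104 or less, mult should come out 0 and arrElements collapse to 1.   */
%put arrBytes=&arrBytes mult=&mult arrElements=&arrElements;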
data _null_;
array bitArr{&arrElements} $ 1 _temporary_;
/* Create pointer to array */
arrAddr = addr(bitArr{1});
/* Initialize all bits to 0 (&arrBytes bytes cover every bit in use) */
call poke(repeat("00"x,&arrBytes - 1),arrAddr,&arrBytes);
/* Set bit 1 */
bitSet = 1;
/* Find out what byte this is in */
byteOffset = int((bitSet - 1) / 8);
bitPattern = peek(arrAddr + byteOffset,1);
bitPattern = bor(bitPattern,2 ** mod(bitSet - 1,8));
call poke(put(bitPattern,ib8.),arrAddr + byteOffset,1);
/* Set bit 833 */
bitSet = 833;
/* Find out what byte this is in */
byteOffset = int((bitSet - 1) / 8);
bitPattern = peek(arrAddr + byteOffset,1);
bitPattern = bor(bitPattern,2 ** mod(bitSet - 1,8));
call poke(put(bitPattern,ib8.),arrAddr + byteOffset,1);
/* Find bits that have been set */
do i = 1 to &arrBits;
  bitGet = i;
  /* Find out what byte this is in */
  byteOffset = int((bitGet - 1) / 8);
  bitPattern = peek(arrAddr + byteOffset,1);
  test = 1 and band(bitPattern ,2 ** mod(bitGet - 1,8));
  if test = 1 then do;
    put bitGet;
  end;
end;
run;
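
The two set-bit blocks above are identical apart from the bit number, so they
could probably be folded into a macro. What follows is only a sketch: the name
%setBit and its parameter are hypothetical, the statements are the same ones
used in the step above, and it assumes arrAddr already holds the address of
the first array element. The macro has to be compiled before the data step
that calls it.

%macro setBit(bit);
  /* Byte holding the bit, counted from 0 */
  byteOffset = int((&bit - 1) / 8);
  /* Read the byte, switch the bit on, write the byte back */
  bitPattern = peek(arrAddr + byteOffset,1);
  bitPattern = bor(bitPattern,2 ** mod(&bit - 1,8));
  call poke(put(bitPattern,ib8.),arrAddr + byteOffset,1);
%mend setBit;

Inside the data step, %setBit(1) and %setBit(833) would then stand in for the
two blocks, and the argument can be a variable name just as well as a constant.
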
Francis R. Harvey III
WB303, x3952
harveyf1@westat.com
VB programmers know the wisdom of Nothing
> -----Original Message-----
> From: Paul Dorfman [mailto:paul_dorfman@hotmail.com]
> Sent: Monday, June 03, 2002 10:57 PM
> To: Francis Harvey; SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Bit arrays
>
>
> Francis,
>
> Unfortunately, in 8.2 the situation got fixed only partially. As you know,
> before 8.2, if a temporary character array were declared as
>
> array a (1000000) $1. _temporary_ ;
>
> it would allocate 8 bytes per item, leaving the "undeclared" bytes as
> "shadow bytes". That meant that the array elements would have the expression
> length of 1, yet the memory length would be 8, proven by the fact that the
> addresses of the array items would be spaced 8 bytes apart.
>
> You of course understand my enthusiasm when I learned that the Institute had
> addressed my concern and made the items of the array A adjacent in memory,
> spaced just 1 byte apart, as they should be. However, the joy was short-lived.
> Yes, the Data step now spaces the addresses of array elements exactly as far
> apart in memory as indicated by the declared length. And yes, just as it was
> before, the expression length corresponds to the declared one. However, in a
> really bizarre twist, it still allocates 8 bytes of real memory per item. If
> you find it hard to believe, here is a proof (SAS V9.0, Windows XP,
> irrelevant notes killed):
>
> 13 data _null_;
> 14 run ;
>
> NOTE: DATA statement used (Total process time):
> Memory 89k
> 15
> 16 data _null_;
> 17 array a (0 : 999999) $1. _temporary_ ;
> 18 addr0 = addr(a(0)) ;
> 19 addr1 = addr(a(1)) ;
> 20 addr2 = addr(a(2)) ;
> 21 put addr0--addr2 ;
> 22 run ;
>
> 74978304 74978305 74978306
> NOTE: DATA statement used (Total process time):
> Memory 8868k
>
> I am sorry, but if you are planning on bitmapping a reasonable range without
> running out of memory, you may want to "adopt the paper's approach", no
> matter how hideous it may appear.
>
> Kind regrets,
> ==================
> Paul M. Dorfman
> Jacksonville, Fl
> ==================
>
>
>
>
>
> ----Original Message Follows----
> From: Francis Harvey <HARVEYF1@WESTAT.COM>
>
> Greetings Ken;
>
> Unfortunately, this still leaves me with the same quandary that the
> paper mentioned: just because 8.2 allows me to have an array of
> one-byte characters does not mean it takes advantage of the
> resulting reduced space, and I have no mechanism for evaluating
> it. I see some improvements to my code that I could use, but I
> need to know whether my mechanism is unsound or inefficient before I
> adopt the paper's approach. I wonder if there is an update?
>
> Francis R. Harvey III
> WB303, x3952
> harveyf1@westat.com
>
> VB programmers know the wisdom of Nothing
>
> > -----Original Message-----
> > From: Kenneth Moody [mailto:KennethMoody@FIRSTHEALTH.COM]
> > Sent: Monday, June 03, 2002 1:37 PM
> > To: SAS-L@LISTSERV.VT.EDU
> > Subject: Re: Bit arrays
> >
> >
> > An excellent reference is Paul Dorfman's SUGI 26 paper, Table Look-Up by
> > Direct Addressing: Key-Indexing -- Bitmapping -- Hashing.
> >
> > You can find it at:
> >
> > http://www2.sas.com/proceedings/sugi26/p008-26.pdf
> >
> >
> > Ken Moody
> > First Health, Metrics Department
> > Voice: 916-374-3924
> > EMail: KennethMoody@firsthealth.com
> <snip>