Date: Fri, 5 Nov 2010 13:20:31 +0000
Reply-To: "Fehd, Ronald J. (CDC/OCOO/ITSO)" <rjf2@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Fehd, Ronald J. (CDC/OCOO/ITSO)" <rjf2@CDC.GOV>
Subject: Re: Bit type
In-Reply-To: <AANLkTi=H687GVWL9d=p48SVkTVNzrOgD=XmdD9Cu9i5H@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"

The paper

Efficient Storage and Processing of Sequential Indicators Using SAS(r) Bitwise Functions
http://www.lexjansen.com/pharmasug/2007/pr/pr01.pdf

compares storing and processing indicators as bits versus as characters.

keyword: bitmask
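
For readers who want the bitmask idea in a few lines, here is a minimal sketch. It is written in Python purely as an illustration: it is not SAS and it is not taken from the paper. The names pack_flags, get_flag, unpack_flags, and storage_bytes are hypothetical, and the sample answer string is made up. In SAS the same operations would go through the bitwise functions (BAND, BOR, and so on).

```python
# Illustration only: pack 12 Y/N indicators into one integer bitmask.
# All names here are hypothetical; nothing below comes from the paper.

N_FLAGS = 12            # Y/N columns per record, as in the thread
N_RECORDS = 43_000_000  # record count quoted in the thread


def pack_flags(flags: str) -> int:
    """Pack a string of 'Y'/'N' into an integer; the first flag is the lowest bit."""
    mask = 0
    for position, value in enumerate(flags):
        if value == "Y":
            mask |= 1 << position  # set the bit for this flag
    return mask


def get_flag(mask: int, position: int) -> str:
    """Return 'Y' or 'N' for the zero-based flag position."""
    return "Y" if mask & (1 << position) else "N"


def unpack_flags(mask: int, n_flags: int = N_FLAGS) -> str:
    """Rebuild the 'Y'/'N' string from the bitmask."""
    return "".join(get_flag(mask, i) for i in range(n_flags))


def storage_bytes(n_records: int, bytes_per_record: int) -> int:
    """Raw storage for the flag columns, ignoring any per-record overhead."""
    return n_records * bytes_per_record


if __name__ == "__main__":
    answers = "YNNYYNNNNYNY"  # made-up example record
    mask = pack_flags(answers)
    print(mask, unpack_flags(mask))
    print(storage_bytes(N_RECORDS, 12))  # 12 one-byte characters per record
    print(storage_bytes(N_RECORDS, 2))   # 12 bits rounded up to 2 bytes
```

Under these assumptions, 12 one-byte characters per record come to about 0.5 GB for 43 million records, while 12 bits rounded up to 2 bytes come to roughly 0.09 GB. The trade-off is that every read of a single flag then needs a mask operation.
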
> -----Original Message-----
> From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Sterling Paramore
> Sent: Thursday, November 04, 2010 5:10 PM
> To: SAS-L@listserv.uga.edu
> Subject: Re: Bit type
>
> They are character - "Y" or "N" - and require 8 bits to store. If there was
> a bit field, I would only need 1 bit.
>
> Here's my math:
> 43x10^6 records * 12 fields/record * 8 bits/field = 4x10^9 bits
> 43x10^6 records * 12 fields/record * 1 bit/field = 0.5x10^9 bits
>
>
> On Thu, Nov 4, 2010 at 2:03 PM, Nordlund, Dan (DSHS/RDA) <NordlDJ@dshs.wa.gov> wrote:
>
> > Sounds like the best solution is to convert from 8-byte numeric to 3-byte
> > numeric representation, or to single character variables. The 3-byte
> > representation would reduce storage requirements to approx 1.5 GB. A
> > single character per question would reduce the storage to the 0.5 GB that
> > you were looking for. Is there some reason character variables would not
> > work?
> >
> > Dan
> >
> > Daniel J. Nordlund
> > Washington State Department of Social and Health Services
> > Planning, Performance, and Accountability
> > Research and Data Analysis Division
> > Olympia, WA 98504-5204
> >
> > > -----Original Message-----
> > > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sterling Paramore
> > > Sent: Thursday, November 04, 2010 1:29 PM
> > > To: SAS-L@LISTSERV.UGA.EDU
> > > Subject: Re: Bit type
> > >
> > > > It would be worth the disk space savings but it would not be worth the
> > > > overhead of dealing with bitwise functions or explaining binary to my
> > > > end users.
> > >
> > > > On Thu, Nov 4, 2010 at 1:22 PM, Keintz, H. Mark <mkeintz@wharton.upenn.edu> wrote:
> > >
> > > > > So, in the absence of a bit variable type, would it be worth the disk
> > > > > space savings to build a character variable with 8 bits of information
> > > > > per byte, using the bit "wise" functions?
> > > >
> > > > Regards,
> > > > Mark
> > > >
> > > > PS: Actually, I crave integer types far more than bit types.
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sterling Paramore
> > > > > Sent: Thursday, November 04, 2010 3:31 PM
> > > > > To: SAS-L@LISTSERV.UGA.EDU
> > > > > Subject: Re: Bit type
> > > > >
> > > > > > I want a bit variable mostly to reduce the size of datasets (thereby
> > > > > > speeding up sorts). I've got 12 Y/N columns in a dataset that's 43
> > > > > > million lines long requiring (uncompressed) over 4 GB of space just for
> > > > > > those columns. If I could use a bit variable, it would only require
> > > > > > 0.5 GB.
> > > > >
> > > > > > On Thu, Nov 4, 2010 at 11:11 AM, Joe Matise <snoopy369@gmail.com> wrote:
> > > > >
> > > > > > > Don't know the answer specifically, but I'd say generally that it
> > > > > > > wouldn't be useful - SAS is primarily a database software, after all,
> > > > > > > and you aren't going to be storing individual bits in datasets (I'd
> > > > > > > think). Even if it had a bit/boolean/etc. data type that allowed you
> > > > > > > to store a single 1/0, how would you place that in data storage?
> > > > > > > Probably in a whole byte, unless you were using a COMPRESS option, in
> > > > > > > which case SAS as currently implemented will store that in just a bit
> > > > > > > (I think).
> > > > > >
> > > > > > > SAS isn't going to run out of memory for most operations in the data
> > > > > > > step itself, so having a bit vs. a byte used to store 1 vs 0 in a data
> > > > > > > step variable isn't really a big deal, I'd expect. More overhead would
> > > > > > > be lost due to having to deal with potential bits and how you store
> > > > > > > them than just assigning the variable a byte and leaving it at that.
> > > > > > > And if you really want to, you still can use individual bits of course
> > > > > > > with bit-shift operators and whatnot.
> > > > > >
> > > > > > -Joe
> > > > > >
> > > > > >
> > > > > > > On Thu, Nov 4, 2010 at 1:02 PM, Sterling Paramore <gnilrets@gmail.com> wrote:
> > > > > >
> > > > > >> Dear SAS-L,
> > > > > >>
> > > > > > >> Just out of curiosity, does anyone know why SAS doesn't have a bit
> > > > > > >> variable type?
> > > > > >>
> > > > > >> -Sterling
> > > > > >>
> > > > > >
> > > > > >
> > > >
> >