LISTSERV at the University of Georgia
Date:   Wed, 16 Jul 2008 13:01:12 -0700
Reply-To:   "Nordlund, Dan (DSHS/RDA)" <NordlDJ@DSHS.WA.GOV>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Nordlund, Dan (DSHS/RDA)" <NordlDJ@DSHS.WA.GOV>
Subject:   Re: Informat, Format, & Length Statements
In-Reply-To:   <200807161824.m6GHCKC0013416@malibu.cc.uga.edu>
Content-Type:   text/plain; charset=iso-8859-1

> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul Dorfman
> Sent: Wednesday, July 16, 2008 11:24 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Informat, Format, & Length Statements
>
> Dan,
>
> Whether 3 numeric bytes under W/U can store an integer exactly depends not
> only on its absolute value but also on its other properties. For example,
> all *even* integers can be stored in 3 bytes exactly up to n=16,384; after
> that, some even integers can and some cannot. As to the odd integers, the
> greatest one stored in 3 bytes precisely is 8191.
>
> The smallest integer not stored correctly in 3 bytes is thus 8193, so
> generally speaking, in this respect SAS9.2 documentation does hold water.
>
> On a different note, I find it utterly useless to use numeric length less
> than full 8 bytes. Since its only purpose can be saving disk space (and
> wasting CPU time), other methods exist that allow for much greater
> savings. To wit, just 2 character bytes can store an ~8 times greater
> integer than 3 numeric bytes, namely, up to 256**2-1=65,535 via simple
>
> put (n, pib2.) ;
>
> while 3 bytes can store the whopping 256**3-1=16,777,215 via
>
> put (n, pib3.) ;
>
> The added value of this method is the firm knowledge and predictability of
> the exact integer precision being thusly rendered, without any need to
> consult with the manual (or SAS-L, for that matter). To the objection that
> the rb-pib-rb conversion takes time I'd retort that the unavoidable
> implicit conversion from 8 to fewer numeric bytes and vice versa takes
> time, too.
>
> Kind regards
> ------------
> Paul Dorfman
> Jax, FL
> ------------

Paul,

I agree with most of what you write above (as usual), especially that it is not particularly useful to define numerics as less than 8 bytes. I did a couple of quick tests, which suggest to me that you are correct: rb-pib-rb is not much different in terms of total processing time, with the added benefit, as you point out, of storing results correctly. (I am going to have to think about whether I might be able to use this profitably in my work.)

264  run;

NOTE: The data set WORK.TEST2 has 100000000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           31.71 seconds
      cpu time            24.82 seconds

265  data test3;
266  do i=1 to 1e8;
267  j=put (i, pib3.) ;
268  output;
269  end;
270  run;

NOTE: The data set WORK.TEST3 has 100000000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           44.31 seconds
      cpu time            36.28 seconds

271  data test4;
272  set test2;
273  k=j;
274  run;

NOTE: There were 100000000 observations read from the data set WORK.TEST2.
NOTE: The data set WORK.TEST4 has 100000000 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           2:33.06
      cpu time            58.62 seconds

275  data test5;
276  set test3;
277  k=input(j,pib3.);
278  run;

NOTE: There were 100000000 observations read from the data set WORK.TEST3.
NOTE: The data set WORK.TEST5 has 100000000 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           2:18.45
      cpu time            1:00.43
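
For anyone who wants to check the round trip itself rather than the timing, here is a minimal sketch of the same put/input pair on a handful of values. The test values and the variable names (n, c, back) are my own choices for illustration and are not taken from the runs above.

```sas
/* Sketch: round-trip a few integers through a 3-byte character     */
/* field with the PIB3. format and informat, the same conversion    */
/* used in TEST3 and TEST5. The values listed are illustrative.     */
data _null_;
  length c $3;
  do n = 0, 8191, 8193, 8195, 16777215;
    c    = put(n, pib3.);     /* 8-byte numeric -> 3-byte binary    */
    back = input(c, pib3.);   /* 3-byte binary  -> 8-byte numeric   */
    put n= back=;
  end;
run;
```

Every value in that list lies in the range 0 to 256**3-1, so BACK should equal N on each line of the log.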

The one small quibble I have is with your statement that "all *even* integers can be stored in 3 bytes exactly up to n=16,384". It is true that if you store the value 8194 in a length 3 numeric, you will get 8194 when you read it back. However, 8195 stored in a length 3 numeric also comes back as 8194. So when you read a numeric variable defined as length 3 from a file and the value you get is 8194, you don't know whether the original value was 8194 or 8195. I would therefore argue that the value is not stored "exactly": you have lost one bit of precision.
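
A quick way to see the ambiguity without writing a data set is the TRUNC function, which truncates a numeric value to a given number of bytes. This is only a sketch, assuming the Windows/Unix floating-point layout discussed in this thread; the variable names are illustrative.

```sas
/* Sketch: emulate storing integers in a length 3 numeric.          */
/* TRUNC(n, 3) keeps only the first 3 bytes of the 8-byte value,    */
/* which is what happens when a length 3 variable is written out.   */
/* Assumes the Windows/Unix (IEEE) representation.                  */
data _null_;
  do n = 8191 to 8196;
    stored = trunc(n, 3);
    put n= stored=;
  end;
run;
```

Under that assumption, both n=8194 and n=8195 should show stored=8194, and n=8193 should show stored=8192, which is the lost bit described above.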

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204

