| Date: | Fri, 2 Mar 2001 20:40:56 GMT |
| Reply-To: | DJNordlund <djnordlund@AOL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | DJNordlund <djnordlund@AOL.COM> |
| Organization: | AOL http://www.aol.com |
| Subject: | Re: Save Space... |
|---|
Ed Heaton wrote:
> I think it is worth mentioning that SAS numbers are ALWAYS floating-point
>numbers (never type integer). So, SAS uses 1 bit of its allocation for the
>sign flag (1 for negative numbers) and 10 bits for the exponent. So, the
>largest integer number that can be stored in 3 bytes, without losing
>precision, is 2**13 = 8192 rather than 2**23 -1 = 8,388,607.
> If a larger number is stored, then the value of exponent is increased,
>but the number of significant digits (size of the mantissa) remains the
>same. This reduces overflow problems; it takes a very big number indeed to
>max-out the 8-byte floating point number (somewhere approaching 2 *
>10**308). But that number will be able to hold only about 16 decimal digits
>of precision.
>
>Hope this helps a little,
>Ed
>
>Edward Heaton, SAS Senior Statistical Systems Analyst,
>Westat (An Employee-Owned Research Corporation),
>1550 Research Boulevard, Room 2018, Rockville, MD 20850-3195
>Voice: (301) 610-4818 Fax: (301) 294-3992
>mailto:EdwardHeaton@westat.com http://www.westat.com
>
Ed makes a good point about the floating-point representation of SAS
numeric variables. One minor point of clarification: on MS Windows and Unix
systems (I believe), SAS uses an IEEE 8-byte format with 11 bits (not 10)
for the exponent and one bit for the sign. The remaining 52 bits of the 8-byte
representation are used for the mantissa. When LENGTH is used to reduce the
size of EXTERNAL storage, the 12 bits for exponent and sign are always
maintained; it is the mantissa that is truncated.
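
The 1/11/52 split can be inspected directly. This is a minimal sketch in Python rather than SAS, on the assumption that a Python float uses the same IEEE 8-byte layout described above; the function name ieee_fields is made up for illustration.

```python
import struct

def ieee_fields(x: float) -> tuple:
    """Split an IEEE 754 8-byte double into (sign, biased exponent, mantissa)."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    mantissa = bits & ((1 << 52) - 1)    # 52 stored bits (leading 1 is implied)
    return sign, exponent, mantissa

sign, exponent, mantissa = ieee_fields(-8192.0)
print(sign, exponent - 1023, mantissa)   # prints: 1 13 0
```

Here -8192 is stored as sign 1, unbiased exponent 13, and an all-zero mantissa, since 8192 is exactly 2**13.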
With length=3 storage, there are 12 bits available for the mantissa. However,
SAS adjusts the exponent so that the most significant bit of the mantissa is 1.
Since it is always a 1, it does not need to be represented in the remaining 12
bits; you get one bit for free. So, as Ed stated above, you end up with 13 bits
of precision for the stored representation.
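
To see the 13-bit limit in action, here is a rough Python simulation (not SAS code). It assumes that storing with a shorter length simply keeps the leading bytes of the IEEE double and drops the rest, as described above; the helper name store_with_length is hypothetical, and I have not checked it against what SAS actually writes to disk.

```python
import struct

def store_with_length(x: float, length: int) -> float:
    """Simulate storing a double in `length` bytes and reading it back.

    Assumption: the low-order mantissa bytes are truncated (zeroed),
    not rounded.
    """
    if not 3 <= length <= 8:
        raise ValueError("length must be between 3 and 8")
    raw = struct.pack(">d", x)                          # big-endian: sign and exponent first
    truncated = raw[:length] + b"\x00" * (8 - length)   # drop the trailing mantissa bytes
    return struct.unpack(">d", truncated)[0]

for value in (8191.0, 8192.0, 8193.0):
    print(value, "->", store_with_length(value, 3))
# 8191.0 -> 8191.0
# 8192.0 -> 8192.0
# 8193.0 -> 8192.0
```

The in-memory value is untouched; only the copy that goes through the simulated store and read-back comes out changed, and only once the integer needs more than 13 significant bits.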
Two final points:
1. The reduction in precision only occurs when the variables are saved to
external storage, as SAS uses 8 bytes internally for all numeric variables,
regardless of storage length. (So you will only see the reduced precision if
the data is stored, then read back in.)
2. The number of bits used for exponent and mantissa varies across platforms.
On big iron, fewer bits are used for the exponent and more for the mantissa
than on MS Windows and Unix systems (check the SAS companion manual for your
operating system). This is important for those who work across platforms.
I haven't worked with mainframe data recently, but I was bitten once when I
downloaded a mainframe dataset containing a numeric variable stored as length 2,
which gave adequate precision on the mainframe. The minimum length you can
create on Windows is length 3, but the variable was imported without complaint
as length 2, so I lost precision and was left with bad data. I had to resize the
variable on the mainframe and then download it again. This was on SAS 6.11. I
haven't checked whether this gotcha has been resolved in later versions. So let
the analyst beware.
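
The platform difference can be put in numbers with a little arithmetic, sketched here in Python. The IEEE column follows from the layout above (12 bits for sign and exponent, one implied mantissa bit). The mainframe column is an assumption on my part: it supposes the IBM hexadecimal floating-point layout of 1 sign bit, 7 exponent bits, and no implied bit, so check the companion manual for your system before relying on it. Both function names are made up for illustration.

```python
def max_exact_integer_ieee(length: int) -> int:
    """Largest integer below which every integer is stored exactly (IEEE layout)."""
    # 8*length bits in total, minus 12 for sign and exponent, plus 1 implied bit.
    return 2 ** (8 * length - 11)

def max_exact_integer_ibm_hex(length: int) -> int:
    """Same limit under the ASSUMED IBM hexadecimal layout (1 sign + 7 exponent bits)."""
    # Everything after the first byte is mantissa; there is no implied bit.
    return 2 ** (8 * (length - 1))

print("length  IEEE            IBM hex (assumed)")
for length in range(2, 9):
    ieee = max_exact_integer_ieee(length)
    ibm = max_exact_integer_ibm_hex(length)
    print(f"{length:>6}  {ieee:<14,}  {ibm:,}")
```

Under these assumptions, two bytes hold every integer up to 256 in the mainframe layout, while the same two bytes read as an IEEE prefix would leave only 4 stored mantissa bits. That is consistent with a length 2 variable being fine on the mainframe and unusable after download.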
Happy SASing,
Dan Nordlund