|
On Jan 5, 11:23 pm, Nord...@DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)")
wrote:
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SA...@LISTSERV.UGA.EDU] On Behalf Of
> > droide
> > Sent: Tuesday, December 29, 2009 10:07 PM
> > To: SA...@LISTSERV.UGA.EDU
> > Subject: Re: missing numerical values = - infinity?
>
> > Hi, all,
>
> > Thought this code might be of interest, especially with regard to Dan
> > Nordlund's post with a reference about floating point reps:
>
> > ------------------------------- Code -------------------------------
>
> > options noovp noerrorabend nocenter mprint mprintnest ls = 80 ps = 54
> > compress = yes spool nosymbolgen nomlogic source source2;
> > title "SAS v&sysvlong on &sysscpl";
>
> > data missings ( keep = x xhex xaddrl );
> > do ade = rank( '_' ), rank( ' ' ), rank( 'A' ) to rank( 'Z' );
> > x = input( '.' || byte( ade ), ??best. );
> > link output;
> > end;
> > do a = -constant( 'BIG' ), -1, -constant( 'SMALL' ),
> > 0,
> > constant( 'SMALL' ), .1, .2, .3, .4, .5, .6, .7, .8, .9, 1,
> > 2, 5, 9, 10, constant( 'BIG' );
> > x = a;
> > link output;
> > end;
> > stop;
>
> > output:
> > xaddrl = addrlong( x );
> > format xaddrl hex16.;
> > xhex = put( peekclong( xaddrl ), hex16. );
> > output;
> > return;
> > run;
>
> > proc print width = min data = missings;
> > var xhex x;
> > format x best32.;
> > run;
>
> > ------------------------------ Output ------------------------------
>
> > SAS v9.01.01M3P020206 on XP_PRO 01:04 Wednesday, December 30, 2009 1
>
> > Obs xhex x
>
> > 1 0000000000D2FFFF _
> > 2 0000000000D1FFFF .
> > 3 0000000000BEFFFF A
> > 4 0000000000BDFFFF B
> > 5 0000000000BCFFFF C
> > 6 0000000000BBFFFF D
>
> <<<snip>>>
>
> I wasn't going to keep this thread going, but then my name was mentioned, and I couldn't resist. :-) First, since SAS on this platform (Windows on Intel hardware) stores numeric values in reverse (little-endian) byte order, I thought it might be useful to alter droide's code to create xhex as a character value with the bytes reversed back, so that it is easier to see the sign bit, exponent, and mantissa. See below.
>
> data missings ( keep = x xhex );
> length xhex $16.;
> do ade = rank( '_' ), rank( ' ' ), rank( 'A' ) to rank( 'Z' );
> x = input( '.' || byte( ade ), ??best. );
> link output;
> end;
> do a = -constant( 'BIG' ), -1, -constant( 'SMALL' ), 0,
> constant( 'SMALL' ), .1, .2, .3, .4, .5, .6, .7, .8, .9, 1,
> 2, 5, 9, 10, constant( 'BIG' );
> x = a;
> link output;
> end;
> stop;
>
> output:
> do _n_ = 7 to 0 by -1;
> xhex = cats(xhex, put( peekc( addr(x)+_n_ ), hex2. ));
> end;
> output;
> xhex = '';
> return;
> run;
>
> Obs xhex x
> 1 FFFFD20000000000 _
> 2 FFFFD10000000000 .
> 3 FFFFBE0000000000 A
> 4 FFFFBD0000000000 B
> 5 FFFFBC0000000000 C
> 6 FFFFBB0000000000 D
> 7 FFFFBA0000000000 E
> 8 FFFFB90000000000 F
> 9 FFFFB80000000000 G
> 10 FFFFB70000000000 H
> 11 FFFFB60000000000 I
> 12 FFFFB50000000000 J
> 13 FFFFB40000000000 K
> 14 FFFFB30000000000 L
> 15 FFFFB20000000000 M
> 16 FFFFB10000000000 N
> 17 FFFFB00000000000 O
> 18 FFFFAF0000000000 P
> 19 FFFFAE0000000000 Q
> 20 FFFFAD0000000000 R
> 21 FFFFAC0000000000 S
> 22 FFFFAB0000000000 T
> 23 FFFFAA0000000000 U
> 24 FFFFA90000000000 V
> 25 FFFFA80000000000 W
> 26 FFFFA70000000000 X
> 27 FFFFA60000000000 Y
> 28 FFFFA50000000000 Z
> 29 FFEFFFFFFFFFFFFF -1.7976931348623E308
> 30 BFF0000000000000 -1
> 31 8010000000000000 -2.2250738585072E-308
> 32 0000000000000000 0
> 33 0010000000000000 2.2250738585072E-308
> 34 3FB999999999999A 0.1
> 35 3FC999999999999A 0.2
> 36 3FD3333333333333 0.3
> 37 3FD999999999999A 0.4
> 38 3FE0000000000000 0.5
> 39 3FE3333333333333 0.6
> 40 3FE6666666666666 0.7
> 41 3FE999999999999A 0.8
> 42 3FECCCCCCCCCCCCD 0.9
> 43 3FF0000000000000 1
> 44 4000000000000000 2
> 45 4014000000000000 5
> 46 4022000000000000 9
> 47 4024000000000000 10
> 48 7FEFFFFFFFFFFFFF 1.7976931348623E308
>
> The first 3 hex digits hold the sign bit (the leftmost, most significant bit) followed by an 11-bit exponent; the remaining 13 hex digits hold the 52-bit mantissa. Any value whose first hex digit is 8 or larger has its sign bit set, i.e. it is negative.
>
> As can be seen from obs 48, the largest valid positive number has a biased exponent of 7FE hex (decimal 2046), or unbiased (subtract 1023) decimal 1023. (The smallest biased exponent for a normalized number is 001 hex, or unbiased decimal -1022.) The largest and smallest exponent fields, 7FF hex and 000 hex respectively, are reserved for special purposes in the IEEE standard. An exponent of 000 with a mantissa of zero is used to represent the number 0 (and it is possible to have both a positive and a negative zero). An exponent of 7FF hex (biased decimal 2047, unbiased 1024) with a zero mantissa is used to represent plus or minus infinity (depending on the sign bit).
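For anyone who wants to check the field arithmetic outside SAS, here is a minimal Python sketch that splits a few of the xhex strings from the listing above into sign, biased exponent, and mantissa. It assumes the 16 hex digits are in the most-significant-byte-first order shown in that listing; the function name decode_double is made up for the illustration.

```python
import struct

BIAS = 1023  # exponent bias for IEEE 754 double precision


def decode_double(hex16):
    """Split a 16-hex-digit double (most significant byte first) into fields.

    Hypothetical helper for illustration; not part of SAS or any library.
    """
    bits = int(hex16, 16)
    sign = bits >> 63                     # 1 sign bit
    biased_exp = (bits >> 52) & 0x7FF     # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)     # 52 mantissa bits
    # Reinterpret the same 8 bytes as a big-endian double.
    value = struct.unpack(">d", bytes.fromhex(hex16))[0]
    return sign, biased_exp, mantissa, value


# Obs 48, 30, and 33 from the listing above.
for hex16 in ("7FEFFFFFFFFFFFFF", "BFF0000000000000", "0010000000000000"):
    sign, biased_exp, mantissa, value = decode_double(hex16)
    print(hex16, "sign:", sign, "biased:", biased_exp,
          "unbiased:", biased_exp - BIAS, "value:", value)
```

The unbiased exponent is simply the biased field minus 1023, which is where the 1023 and -1022 limits come from.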
>
> In this and other related threads, it has been stated that SAS missing is negative infinity. However, if one looks at the hex values for the series of missing values, it can be seen that the exponent field is 7FF hex, i.e. the maximum valid exponent plus one, and that the mantissa is not zero. (A 7FF exponent with a nonzero mantissa is what the IEEE standard classifies as NaN.) So SAS treats all missing values as "smaller" than the smallest valid negative number, but they are not themselves the smallest "negative numbers", nor is any of them equal to the IEEE value for negative infinity, which is FFF0000000000000 (sign bit 1, exponent 7FF, mantissa 0).
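A quick way to see the difference between the missing-value patterns and negative infinity is to classify the bit patterns by the IEEE 754 rules. The sketch below is Python rather than SAS and only looks at the hex strings from the listing; it says nothing about how SAS itself orders missing values in comparisons. The helper names are assumptions made for the example.

```python
import math
import struct


def as_double(hex16):
    """Reinterpret 16 hex digits (most significant byte first) as a double."""
    return struct.unpack(">d", bytes.fromhex(hex16))[0]


def classify(hex16):
    """Classify a bit pattern by the IEEE 754 rules (illustrative helper)."""
    bits = int(hex16, 16)
    biased_exp = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if biased_exp != 0x7FF:
        return "finite"
    return "infinity" if mantissa == 0 else "NaN"


patterns = {
    "SAS ._ (obs 1)": "FFFFD20000000000",
    "SAS .  (obs 2)": "FFFFD10000000000",
    "SAS .Z (obs 28)": "FFFFA50000000000",
    "IEEE -infinity": "FFF0000000000000",
    "most negative finite (obs 29)": "FFEFFFFFFFFFFFFF",
}

for label, hex16 in patterns.items():
    print(f"{label:32s} {hex16} {classify(hex16)}")

# Cross-check against the runtime's own view of the same bytes.
print(math.isnan(as_double("FFFFD10000000000")))
print(math.isinf(as_double("FFF0000000000000")))
```

All of the missing-value patterns share the 7FF exponent with negative infinity but have a nonzero mantissa, which is the whole distinction.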
>
> In the related thread "Re: New Comparison Operators? - WAS: missing numerical values = - infinity?", it has been suggested that a 3-valued logic might be "better than" SAS's approach to missing values. I am not sure there is much benefit. For example, consider the following statement in the R language:
>
> if (a==b) TRUE else FALSE
>
> If a equals b, then the statement returns TRUE. If neither a nor b is missing (i.e. NA in R-speak) and a does not equal b, then the statement returns FALSE. However, if either a or b (or both) is NA, then the statement generates an error message. So, whether in SAS or R, one still needs to test for missing if it is a possibility. One could argue that it is better to error out than to silently return FALSE when only one value is missing, and TRUE when both have the same missing value. But I could argue that if 'missing' is even a remote possibility, then one ought to be testing for missing, either to keep one's program from failing or to keep from generating a spurious result. In either case it is necessary to know your data, your programming language, and the consequences of using missing values in a comparison.
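To make the two behaviours concrete, here is a small Python analogy (not R and not SAS): a three-valued equality that returns "unknown" when either side is missing, and a guard that refuses to branch on an unknown result, loosely in the spirit of the R error described above. None stands in for a missing value, and eq3 and require_bool are names invented for this sketch.

```python
def eq3(a, b):
    """Three-valued equality: True, False, or None when either side is missing."""
    if a is None or b is None:
        return None
    return a == b


def require_bool(result):
    """Refuse to branch on an unknown result instead of silently picking one."""
    if result is None:
        raise ValueError("missing value where True/False needed")
    return result


print(eq3(1, 1))      # both present and equal
print(eq3(1, 2))      # both present and different
print(eq3(None, 2))   # one side missing: result is unknown

# Branching on an unknown result fails loudly, so the caller still has to
# test for missing first if it is a possibility.
try:
    if require_bool(eq3(None, 2)):
        print("equal")
    else:
        print("not equal")
except ValueError as err:
    print("error:", err)
```

Either way the caller ends up writing an explicit test for missing, which is the point made above.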
>
> Hope this adds to the discussion in a useful way,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
>
Hi,
Here is an interesting R thread on the three-headed monster.
It looks like some of the R folks are on the SAS side of things:
http://n4.nabble.com/Benefit-of-treating-NA-and-NaN-differently-for-numerics-td991596.html#a991596
http://tiny.cc/TqpOO
Here is an excerpt from the last responder (this is R-speak):

I don't know of any cases where a useful distinction is made between NA
and NaN, but I suppose it could be useful to know where the bad value
came from. R functions rarely generate NaN directly, it usually comes
from the hardware or runtime library.

And by the way, as the thread containing this message shows,
http://finzi.psych.upenn.edu/R/R-devel/2009-August/054319.html
there are several different encodings which are displayed as NA, and a
huge number (more than 2^50, I seem to recall) of different encodings
displayed as NaN.
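The "more than 2^50" figure is easy to sanity-check from the IEEE 754 double layout alone: a NaN is any pattern with the exponent field all ones and a nonzero 52-bit mantissa, with either sign. The Python sketch below just counts those patterns and confirms that two different ones both read back as NaN; it does not check how R displays any of them.

```python
import math
import struct

MANTISSA_BITS = 52

# Exponent field fixed at 7FF; every nonzero mantissa is a NaN, for each sign.
nan_per_sign = 2**MANTISSA_BITS - 1
nan_total = 2 * nan_per_sign

print(nan_total)
print(nan_total > 2**50)

# Two different bit patterns, both NaN as far as the hardware is concerned.
for hex16 in ("7FF0000000000001", "FFFFD10000000000"):
    value = struct.unpack(">d", bytes.fromhex(hex16))[0]
    print(hex16, math.isnan(value))
```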
|