Date: Wed, 21 Nov 2007 16:23:37 +0000
Reply-To: iw1junk@COMCAST.NET
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <iw1junk@COMCAST.NET>
Subject: Re: informats reading NUMERIC data
Summary: On the character representation of numbers.
Roland,
Let's take seriously your idea that packed decimal is a number in SAS
and study the following log.
1 data w ;
2 x = 128 ;
3 y = put (x, pd4.) ;
4 z = input(y, pd4.) ;
5 if x = y then put "X and Y are the same number" ;
6 format y $hex8. ;
7 run ;
NOTE: Character values have been converted to numeric values at the
places given by:
(Line):(Column).
5:14
NOTE: Invalid numeric data, y='...(' , at line 5 column 14.
x=128 y=00000128 z=128 _ERROR_=1 _N_=1
The first note explains that the comparison of X and Y requires the
conversion of character data to numeric. The variable causing the
trouble is Y and it has the form of packed decimal, but the second
note says Y is invalid numeric data. Clearly Y is character data as
far as SAS is concerned.
Yes, Y is character data representing a number, just as the characters
128 represent a number, which is a power of two when interpreted
correctly. This can be seen from looking at the value of Z. It was
created with the appropriate NUMERIC informat for reading character
data and transforming it into numeric data.
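A minimal sketch of the same point, reusing X, Y and Z from the log above. TYPE_Y and TYPE_Z are names introduced here for illustration only. VTYPE returns C for a character variable and N for a numeric one, so this step should report Y as character and Z as numeric, and the comparison of X with Z should succeed where the comparison of X with Y did not.

```sas
data _null_ ;
   x = 128 ;
   y = put(x, pd4.) ;      /* numeric FORMAT: writes four bytes of character data */
   z = input(y, pd4.) ;    /* numeric INFORMAT: reads those bytes back into a number */
   type_y = vtype(y) ;     /* type of Y as SAS sees it */
   type_z = vtype(z) ;     /* type of Z as SAS sees it */
   put type_y= type_z= ;
   if x = z then put "X and Z are the same number" ;
run ;
```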
You might note the format $HEX8. applied to Y. If Y is a number why
is a character format used to see it? When you look at the value of Y in
the last line of the log you see digits, but these digits were not
stored, as you can see from the last NOTE. In fact only the last byte
of Y is a printable character. The first three bytes are unprintable
characters.
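One way to check which bytes were actually stored is to list them one at a time. This is only a sketch: I and BYTE are names introduced here, and the remark about printability assumes an ASCII session.

```sas
data _null_ ;
   y = put(128, pd4.) ;                /* four bytes of character data */
   do i = 1 to 4 ;
      byte = rank(substr(y, i, 1)) ;   /* numeric code of the i-th byte */
      put i= byte= hex2. ;             /* show that code in hexadecimal */
   end ;
run ;
```

Going by the $HEX8. value in the log, the four bytes should be hex 00, 00, 01 and 28, and in ASCII only hex 28, the left parenthesis, is printable.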
SAS calls an INFORMAT NUMERIC when it transforms the data read into a
number, i.e. a stored floating point form. SAS calls a FORMAT NUMERIC
when it converts a stored floating point number into character data.
So PD4. is the name of a numeric informat and also the name of a
numeric format. One reads character data and one writes character
data.
Let's consider your statement,
"I would argue that there are a number of informats that read
NUMERIC data."
No, they are called numeric informats because they read character data
that can be interpreted as numbers. Now let's ask, is Roland a
number? Here is the log of a DATA step to consider the point.
70 data _null_ ;
71 input x numeric_informat. ;
72 list ;
73 vtype_x = vtype(x) ;
74 put x= vtype_x= ;
75 cards ;
x=-1 vtype_x=N
RULE: ----+----1----+----2----+----3----+----4----+----5
76 Roland
x=6 vtype_x=N
77 6
If numeric informats read numbers as you claim, then we must
accept that Roland is a number and not a string of characters.
Perhaps it is wiser to believe you are wrong, and that a numeric
informat can convert the characters, Roland, into a negative number.
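The definition of NUMERIC_INFORMAT. is not shown in the log. A hypothetical definition that would be consistent with it, offered only as a guess at how such an informat could be built with PROC FORMAT, is:

```sas
proc format ;
   invalue numeric_informat
      'Roland' = -1        /* this one character string maps to a number */
      other    = _same_ ;  /* anything else is read as a standard numeric value */
run ;
```

With an informat along these lines, the characters Roland come back as -1 and the character 6 comes back as 6, both stored in a numeric variable.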
Ian Whitlock
===============
Date: Wed, 21 Nov 2007 05:08:30 -0800
Reply-To: RolandRB <rolandberry@HOTMAIL.COM>
Sender: "SAS(r) Discussion"
From: RolandRB <rolandberry@HOTMAIL.COM>
Organization: http://groups.google.com
Subject: informats reading NUMERIC data
Comments: To: sas-l
Content-Type: text/plain; charset=ISO-8859-1
I think there is a profound misunderstanding about the nature of
informats when I see a statement like this:
"Now to be accurate, as the more informed members of SAS-L have said,
informats read character data and transform it to either numeric or
character."
I would argue that there are a number of informats that read NUMERIC
data. First, I will make it clear that I regard the main purpose of
informats as reading raw data. True, I use them as others do within
data steps, to map values to a numeric field that I will use for
sorting but I would avoid doing data conversion within a data step and
instead apply the informats as raw data is read in. Now it depends on
what you call "character data" in the quote above. A packed decimal or
an integer stored in, say, COBOL format is a sequence of characters, so
it could perhaps be referred to as "character data", but I would not choose
to do this myself. For me it is still "numeric data". To me, even
unpacked or display numerics, defined to a numeric field and used for
numeric calculation purposes, are "numeric data", though this is more
arguable.
There is a list of informats on the following page:
http://www.caspur.it/risorse/softappl/doc/sas_docs/lgref/z1239776.htm
Near the bottom of the page it lists the numeric informats. You will
notice that some of these refer to "packed" and "binary" values as
well as floating point numbers created by particular language
implementations. Look above and some of the Date and Time informats
mention the word "packed". Surely these are "numeric data" and not
"character data"?