Date: Tue, 2 Jun 1998 13:01:39 -0400
Reply-To: "Hudson, Spencer" <shudson@VIROPHARMA.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: "Hudson, Spencer" <shudson@VIROPHARMA.COM>
Subject: FW: SAS Trap: ordering of FORMAT= in ATTRIB statement
Content-Type: text/plain; charset="iso-8859-1"
Hear! Hear!
Can you imagine the chaos if the following made x character of length 3?
proc format ;
value onetwo 1 = 'One' 2 = 'Two';
run;
data x;
format x onetwo.;
/* etc., etc. */
run;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
Spencer Hudson, Ph.D.
Director, Biostatistics and Clinical Data Management
ViroPharma, Inc
405 Eagleview Blvd
Exton, PA 19341
Phone: (610) 458-7300 ext 154
FAX: (610) 458-7380
email: shudson@viropharma.com
-----Original Message-----
From: Berryhill, Timothy [SMTP:TWB2@PGE.COM]
Sent: Tuesday, June 02, 1998 11:50 AM
Subject: Re: SAS Trap: ordering of FORMAT= in ATTRIB statement
Frankly, I find myself agreeing with Tim Churches' original
disgruntlement. If I formatted a numeric variable as 7.3, I would not
expect it to become length 7. If I formatted it BINARY32, I would not
expect it to become length 32. We have many formats which translate a
single character code to a meaningful label. I would not expect assigning
one of those to set the length of the base value.
I see a reason why this happens, so it may be a feature rather
than a bug: when the first mention of a numeric variable is in a format
statement, the length can default to 8. There is no comfortable default
for the length of a character variable. I cannot support 200 as the
default, but anything less has problems, too. Some hapless programmer
just chose the longest formatted value as the default.
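
The contrast described above can be checked with a minimal sketch. It
assumes the $bug. format from the original post; the data set name
"sketch" and the variables numvar, charvar, numlen and charlen are
hypothetical. VLENGTH reports the length fixed at compile time. The
thread reports 18 for the character case; results may differ by SAS
release, so no output is shown here.

```sas
proc format ;
  value $bug
    '0000' = 'This must be a bug'
    other  = 'Or maybe a feature' ;
run ;

data sketch ;
  /* First mention of each variable is in a FORMAT statement. */
  format numvar 7.3 ;      /* numeric format, so numvar is numeric   */
  format charvar $bug. ;   /* character format, no LENGTH given      */
  numvar  = 1 ;
  charvar = '0000' ;
  /* VLENGTH returns the compile-time length of a variable. */
  numlen  = vlength(numvar) ;
  charlen = vlength(charvar) ;
  put numlen= charlen= ;   /* written to the SAS log */
run ;
```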
Tim Berryhill - Contract Programmer and General Wizard
TWB2@PGE.COM or http://www.aartwolf.com/twb.html
Frequently at Pacific Gas & Electric Co., San Francisco
The correlation coefficient between their views and
my postings is slightly less than 0
> ----------
> From: martin trollope[SMTP:martint@HOLLARD.CO.ZA]
> Reply To: martin trollope
> Sent: Tuesday, June 02, 1998 5:27 AM
> To: SAS-L@UGA.CC.UGA.EDU
> Subject: Re: SAS Trap: ordering of FORMAT= in ATTRIB statement
>
> Point taken, but...
>
> Formats have a default length, in this case 18. Using $bug. is the
> equivalent of $bug18. Thus, in your second datastep, the first reference
> to the field is effectively 'Format testvar $bug18.;' hence the length is
> 18.
>
> Sorry, I didn't mean to come across as if I was lecturing you at all.
>
> Martin
> ----------
> From: Tim Churches[SMTP:tchurch@ibm.net]
> Sent: 02 June 1998 01:48
> To: martin trollope
> Cc: SAS-L@VM.MARIST.EDU
> Subject: Re: SAS Trap: ordering of FORMAT= in ATTRIB statement
>
> martin trollope wrote:
> > This is not really news to me, although I've never seen the concept in
> > quite this form on the ATTRIB statement. One of those funny 'hidden'
> > things that SAS does all by itself (and often causes hours of
> > frustrated debugging) is set the length of the variable to the length
> > it has at its first appearance in the datastep. And not, as I have
> > discovered to my chagrin, the first EXECUTED statement, either, but the
> > first statement when reading from top to bottom of the datastep.
>
> Martin,
>
> Yes, I was aware of that, but in my example, the only value being
> assigned to the variable testvar in both data steps is "0000", and the
> length of testvar is explicitly assigned in the ATTRIB statement. So why
> does testvar have a length of 18, not 4, in the second dataset? Where
> does the length of 18 come from? Actually, 18 is the length of the
> format label 'This must be a bug'. But nowhere is the string
> 'This must be a bug' assigned to the variable testvar! The only thing
> being done is that the format $bug., which contains that string, is
> assigned as the default output format for the testvar variable, and that
> assignment is in the same ATTRIB statement that assigns a length of 4
> to the testvar variable. So it seems that the length of a character
> variable is dependent not just on the order of statements within a
> data step, but also on the order of parameters within a statement, and
> furthermore, the variable length is assigned on the basis of the maximum
> length of a label in a default output format assigned to that variable,
> not, as I previously thought, the length of the first value assigned to
> that variable.
> Just to repeat the offending code:
>
> ************************* ;
> proc format ;
> value $bug
> '0000' = 'This must be a bug'
> other = 'Or maybe a feature' ;
> run ;
>
> data test1 ;
> attrib testvar length=$4 format=$bug. ;
> testvar = '0000' ;
> run ;
>
> data test2 ;
> attrib testvar format=$bug. length=$4 ;
> testvar = '0000' ;
> run ;
>
> proc contents data=test1 ;
> run ;
>
> proc contents data=test2 ;
> run ;
> ************************* ;
>
> Tim Churches
>
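
As a compact alternative to reading two PROC CONTENTS listings, the
lengths can be compared side by side. This is only a sketch: it assumes
test1 and test2 from the code above already exist in the WORK library.
The second step shows one possible way to sidestep the ordering issue
by declaring the length in its own LENGTH statement before any format
is mentioned; the data set name test3 is hypothetical and not part of
the original post.

```sas
/* Assumes test1 and test2 were created by the code quoted above. */
proc sql ;
  select memname, name, type, length
    from dictionary.columns
    where libname = 'WORK'
      and memname in ('TEST1', 'TEST2')
      and upcase(name) = 'TESTVAR' ;
quit ;

/* Hypothetical variant: state the length first, then the format. */
data test3 ;
  length testvar $4 ;
  attrib testvar format=$bug. ;
  testvar = '0000' ;
run ;
```

Putting LENGTH first mirrors the test1 ordering from the post, so the
declared length is the first attribute the compiler sees for testvar.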