LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2001, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 31 May 2001 11:14:51 -0400
Reply-To:     Ian Whitlock <WHITLOI1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject:      Re: Changing character to number format
Comments: To: Marc Gameroff <gameroff@EROLS.COM>
Content-Type: text/plain; charset="iso-8859-1"

Subject: Changing character to number format Summary: Simple 2 step process illustrating list processing Respondent: Ian Whitlock <whitloi1@westat.com>

Marc Gameroff [gameroff@EROLS.COM] wrote:

>>>>>>>>>>>>>>>>>>>>>>>> I have a dataset with 740 variables, all in character format, and all of length 346. (Don't ask.) About 95% of the variables are really numeric, i.e., I need to convert them to numeric format in order to analyze them. Also, I want to reduce the length of these variables to 4 (which will accomodate the largest numeric value, all of which are integers). I know I can't change the format of a variable directly and would have to use some type of interim variable method, but the intricacies are beyond me at this point.

For the 5% of the variables containing "true" character data (they are text responses), I want to retain the character format but reduce the storage length of the variables if this is easy enough to do. Some of these variables have text that never exceeds 15 characters, whereas some contain up to--surprise!--346 characters. Any help would be greatly appreciated! <<<<<<<<<<<<<<<<<<<<<<<<<<

Sometime shortly before SUGI Ya Huang gave an illustration of a conversion problem showing that SQL made for the simplest code to handle rename problems in the conversion.

A DATA step could look at the file and decide which variables are capable of conversion to numeric and what the maximum length is for each variable that is not. At the end of file one could summariaze this information into two lists and convert it to two macro variables that could then be used in the SQL code to do the conversion.

Here is the program together with some test data.

/* make test data */ data w ; length x1 - x5 $ 346 ; input ( x1 - x5 ) ( $ ) ; cards ; 123 123 xxxxxxxxxxxxxxxxxxxxxxxxxxx 45 6 2 a 456 7 8 ;

/* need total number of character variables on file */ %let nv = 5 ;

/* pass 1 - get list of numeric vars, character vars and lengths */

data _null_ ; if eof then link final ; set w end = eof ; array __chr (&nv) _character_ ; array __len (&nv) _temporary_ ;

do __i = 1 to dim ( __chr ) ; __temp = input ( left ( __chr(__i) ) , ?? best12. ) ; if __temp = . then do ; __temp = length ( __chr ( __i ) ) ; if __temp > __len ( __i ) then __len ( __i ) = __temp ; end ; end ; return ;

final: length __chrlist __numlist $32000 __vname $ 32 ; do __i = 1 to dim ( __chr ) ; __vname = vname ( __chr ( __i ) ) ; if __len ( __i ) > 0 then do ; /* 7 because -999999 can be stored in 4 bytes */ __clen = max ( __len ( __i ) , 7 ) ; __chrlist = trim ( __chrlist ) || ", " || trim(__vname) || " length=" || put ( __clen , 4. ) ; end ; else __numlist = trim ( __numlist ) || ", " || " input ( left (" || trim(__vname) || ") , best12. ) as " || trim(__vname) || " length=4" ; end ; call symput ( "chrlist" , trim(substr(__chrlist,3)) ) ; call symput ( "numlist" , trim(substr(__numlist,3)) ) ; return ; run ;

/* pass 2 - do the conversion %put numlis=&numlist ; %put chrlist=&chrlist ; */ proc sql ; create table w2 as select &numlist , &chrlist from w ; quit ;

The technique illustrates the power of SAS to make lists and how to put this power into program. The above code is simpler in version 8 because long character variables are available to hold lists. Since Marc has data of length 346 I assume he must be using a version beyond 6.12. If one wants to do the problem in version 6.12 only the subroutine, FINAL, should need changing. Now one would have to use CALL VNAME to get the variable names and CALL EXECUTE to do the concatenation to fom the macro lists.

Ian Whitlock


Back to: Top of message | Previous page | Main SAS-L page