| Date: | Wed, 28 Feb 1996 18:58:26 +0000 |
| Reply-To: | John Whittington <johnw@MAG-NET.CO.UK> |
| Sender: | "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU> |
| From: | John Whittington <johnw@MAG-NET.CO.UK> |
| Subject: | Re: input function question |
|
On Wed, 28 Feb 1996, Karsten Self <karsten@NEWAGE1.STANFORD.EDU> wrote:
>Adding my two bits -- explicit conversion allows control over exception
>handling -- unless you prefer SAS coughing up an error in the middle of
>your 1.25 million record data read.......
>
>When working with raw data my rule of thumb is to read everything as
>character and convert to numeric, after testing for bad values. I've used
>seven or eight case SELECT/WHEN statements in making date value
>conversions.
Sure. I presume the problem only really arises with character-to-number
conversion, since any valid numeric variable value ought to be convertible
to character. I don't think anyone would suggest that one should use
implicit conversions on data which might be invalid; that invalid data
needs to be 'sorted out' before any question of conversion (implicit or
explicit) arises.
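As a minimal sketch of 'sorting out' bad values during an explicit conversion (the data set RAW, its sample values, and the variable BADFLAG are hypothetical, made up for illustration; the ?? modifier is assumed to be available in the release in use):

```sas
/* Hypothetical raw data, read as character first. */
data raw;
   input charvar $4.;
   datalines;
9999
12x4
0042
;
run;

data clean;
   set raw;
   /* ?? suppresses the invalid-data messages and stops _ERROR_ being set */
   numvar = input(charvar, ?? 4.);
   /* a missing result from a non-blank source marks a value to investigate */
   badflag = (numvar = . and charvar ne ' ');
run;
```

With an implicit conversion (numvar = charvar + 0) the same bad value would instead produce invalid-data notes in the log, with no such modifier available to control them.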
>Regarding efficiency, I'm unable to substantiate my claim, but disagree
>with John. My understanding is that implicit type conversions are less
>efficient.
I still can't find the comment in TFM I was thinking of; maybe I dreamed it
:-) I have therefore 'done the experiment' and, since Karsten mentioned
1.25 million obs, that's what I've used (6.10 on a 60 MHz Pentium). The log
which follows shows that Karsten seems to be just about right, although the
difference between implicit and explicit char-to-num conversion (116.2
seconds vs. 107.6 seconds) is really not worth talking about:
73
74 data test (drop=i);
75 do i=1 to 1250000; charvar='9999'; output; end;
76 run;
NOTE: The data set WORK.TEST has 1250000 observations and 1 variables.
NOTE: The DATA statement used 26.25 seconds.
77
78 data one;
79 set test;
80 numvar=charvar+1 ;
81 run;
NOTE: Character values have been converted to numeric values at the places
given by: (Line):(Column).
80:13
NOTE: The data set WORK.ONE has 1250000 observations and 2 variables.
NOTE: The DATA statement used 1 minute 56.22 seconds.
82
83 data one;
84 set test;
85 numvar=input(charvar, 4.0) + 1;
86 run;
NOTE: The data set WORK.ONE has 1250000 observations and 2 variables.
NOTE: The DATA statement used 1 minute 47.6 seconds.
... so perhaps I did dream it!!!
John
-----------------------------------------------------------
Dr John Whittington, Voice: +44 1296 730225
Mediscience Services Fax: +44 1296 738893
Twyford Manor, Twyford, E-mail: johnw@mag-net.co.uk
Buckingham MK18 4EL, UK CompuServe: 100517,3677
-----------------------------------------------------------
|