Date: Tue, 10 Aug 2004 08:51:52 -0600
Reply-To: Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Subject: Re: Two basic SAS datastep questions but very urgent!
A character variable is stored as 1 to 32,767 bytes, depending on how it
is (implicitly or explicitly) declared. In a data step, the implicit
declaration is the length of the first string assigned to it, in the
order the statements appear in the code (this is decided at compile
time, not at run time). A numeric is stored as 2-8 bytes (3-8 bytes on
many operating systems), with 8 the default.
Something that catches many new users by surprise:
=====
if a = 1 then
   x = 'No';
else
   x = 'Yes';
=====
will give X a length of 2, because the first string assigned to it is 2
characters long. When A is not 1, 'Yes' is silently truncated to 'Ye'.
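One way to avoid the surprise is to declare the length before the first
assignment. A minimal sketch; the data set name DEMO and the hard-coded
value of A are made up for illustration:
=====
data demo;
   length x $ 3;   /* declared before any assignment to X */
   a = 2;          /* illustrative value only */
   if a = 1 then
      x = 'No';
   else
      x = 'Yes';   /* now fits without truncation */
run;
=====
Padding the first literal with a trailing blank ('No ') would have the
same effect, but the LENGTH statement states the intent more clearly.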
In v6 and previous versions of SAS, the maximum char length was 200.
While a data step is actually running, numerics are kept in memory as 8
bytes, even if stored as 3 bytes, and are truncated when written out*.
This is what Paul Dorfman was referring to in a previous note.
* I think they're truncated. If someone knows differently, please let
me know.
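A small sketch for checking this on your own system, assuming a platform
where 3 is the minimum numeric length. The data set and variable names
are made up. On Windows and Unix the largest integer documented as
exactly representable in 3 bytes is 8,192, so 8193 was chosen as a value
that should not survive the trip to disk there:
=====
data short;
   length n3 3;                 /* stored in 3 bytes on disk */
   n3 = 8193;
   n8 = 8193;                   /* default length of 8 */
   put 'In memory: ' n3= n8=;   /* both still held as 8 bytes here */
run;

data _null_;
   set short;
   put 'Read back: ' n3= n8=;   /* n3 may no longer equal n8 */
run;
=====
Comparing the two log lines shows whether the stored value lost
precision and in which direction.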
--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA
>>> "Ben Powell" <Ben.powell@cla.co.uk> 08/09/2004 11:46 PM >>>
Thanks for the tip. Is each Char stored as one byte then in SAS, with
numerics as eight bytes?
Ben.
-----Original Message-----
From: Jack Hamilton [mailto:JackHamilton@firsthealth.com]
Sent: 09 August 2004 19:17
To: ben.powell@CLA.CO.UK; SAS-L@LISTSERV.UGA.EDU
Subject: Re: [SAS-L] Two basic SAS datastep questions but very urgent!
It would probably be preferable to use a character flag set to Y/N (or
the local equivalent).
On most platforms, a numeric variable by default takes 8 times as much
storage space as a one-byte character value (3 times in the best case)
and requires additional processing to display for output.
Using a numeric flag saves a tiny bit of typing ("if flag" instead of
"if flag = 'Y'"), but unless you want to perform calculations on the
field (in which case it's not really a flag), storing flags as short
character fields is better.
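A sketch of the character-flag version, reusing the TEST data set and
the record numbers from the example quoted further down. It assumes TEST
does not already contain a variable named FLAG; if it holds a numeric
FLAG from an earlier run, the LENGTH statement will fail:
=====
data test;
   set test;
   length flag $ 1;   /* one byte instead of the default eight */
   if _n_ in (7 90 100 101 103 400 523 633 635 800) then flag = 'Y';
   else flag = 'N';
run;

data subset;
   set test;
   if flag = 'Y';     /* keep only the marked records */
run;
=====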
>>> <ben.powell@CLA.CO.UK> 08/09/2004 6:32 AM >>>
To mark specific records you could use a flag, i.e., set a variable
flag = 1 or 0 depending on whether the record is one of your selected
ones or not.
For instance,
data test;
   set test;
   /* Or use another method to select the records, for instance
      if they are in a certain range. */
   if _N_ in (7 90 100 101 103 400 523 633 635 800) then flag = 1;
   else flag = 0;
run;
data subset;
   set test;
   if flag = 1;
run;
or
proc freq data = test (where=(flag=1));
   table zipcode;
run;
Incidentally, the final suggestion above will solve your second problem
and give you a distinct list of all zipcodes, should you remove the
where clause.
Alternatively you could create an output dataset:
proc freq data = test;
table zipcode / out = tf;
run;
Alternatively, try sql:
proc sql;
   create table z_count as
   /* GROUP BY already returns one row per zipcode, so DISTINCT is not needed */
   select zipcode, count(*) as qty
   from test
   group by zipcode;
quit;
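If only the list of values is wanted, without the counts, a shorter
query should be enough. A sketch, with ZIPS as a made-up output table
name:
=====
proc sql;
   create table zips as
   select distinct zipcode   /* one row per distinct value */
   from test
   order by zipcode;
quit;
=====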
HTH.
On Sat, 7 Aug 2004 13:19:08 -0700, Se Yan <laseyan@GMAIL.COM> wrote:
>1. Is there any method in SAS that can mark some specific records in
>my dataset? I have a dataset with 10000 records, but i have ten
>records that are special and need to be marked, so that in the further
>procedures I can choose to include or exclude these records, but i
>don't want to split the dataset into two files. How can I do that?
>
>2. I have 10000 individual records, but there is a variable "zipcode",
>which only has 500 different values, i.e, average 20 persons per
>zipcode in the dataset. Which command can give me the list of these
>500 different values of zipcodes?
>
>I know the problems are quite basic. But i madly need help!