LISTSERV at the University of Georgia
Date:         Tue, 10 Aug 2004 08:51:52 -0600
Reply-To:     Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Subject:      Re: Two basic SAS datastep questions but very urgent!
Comments: To: ben.powell@cla.co.uk
Content-Type: text/plain; charset=us-ascii

A character variable is stored as 1-32K bytes, depending on how it is (implicitly or explicitly) declared. In a data step, the implicit declaration is the length of the first string assigned to it. A numeric is stored as 2-8 bytes (3-8 bytes on many operating systems), with 8 the default.

Something that catches many new users by surprise:

===== if a = 1 then x ='No'; else x = 'Yes'; =====

will give X a length of 2, because the first string assigned to it is 2 characters long.
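A minimal sketch of the usual way around this, assuming you know the longest value in advance: declare the length explicitly with a LENGTH statement before the first assignment. The variable A and its test value here are hypothetical.

```sas
data _null_;
  length x $ 3;            /* declare the length before the first assignment */
  a = 2;                   /* hypothetical test value */
  if a = 1 then x = 'No';
  else x = 'Yes';
  put x=;                  /* without the LENGTH statement, X would hold 'Ye' */
run;
```

The LENGTH statement has to appear before the first statement that assigns to X, because the length is fixed when the data step is compiled.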

In v6 and previous versions of SAS, the maximum char length was 200.

While a data step is actually running, numerics are kept in memory as 8 bytes, even if stored as 3 bytes, and are truncated when written out*. This is what Paul Dorfman was referring to in a previous note.

* I think they're truncated. If someone knows differently, please let me know.
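One way to check this on your own platform is a small experiment like the sketch below. The data set name SHORT and the value 8193 are only illustrative; on IEEE platforms the SAS documentation gives 8,192 as the largest integer that a 3-byte numeric represents exactly, so a slightly larger value should not survive the round trip unchanged.

```sas
data short;
  length n 3;     /* stored length of 3 bytes; still 8 bytes while the step runs */
  n = 8193;       /* illustrative value just above the 3-byte exact-integer limit */
  put n=;         /* in memory, still the full 8-byte value */
run;

data _null_;
  set short;
  put n=;         /* value as read back from the 3-byte stored form */
run;
```

Comparing the two PUT lines in the log shows what happened to the value when it was written out.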

-- JackHamilton@FirstHealth.com Manager, Technical Development Metrics Department, First Health West Sacramento, California USA

>>> "Ben Powell" <Ben.powell@cla.co.uk> 08/09/2004 11:46 PM >>>
Thanks for the tip. Is each Char stored as one byte then in SAS, with numerics as eight bytes?

Ben.

-----Original Message-----
From: Jack Hamilton [mailto:JackHamilton@firsthealth.com]
Sent: 09 August 2004 19:17
To: ben.powell@CLA.CO.UK; SAS-L@LISTSERV.UGA.EDU
Subject: Re: [SAS-L] Two basic SAS datastep questions but very urgent!

It would probably be preferable to use a character flag set to Y/N (or the local equivalent).

On most platforms, a numeric variable will take 8 times more storage space than a one-byte character value by default (3 times in the best case) and requires additional processing to display for output.

Using a numeric flag saves a tiny bit of typing ("if flag" instead of "if flag = 'Y'"), but unless you want to perform calculations on the field (in which case it's not really a flag), storing flags as short character fields is better.
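A rough sketch of the character-flag version, assuming a small made-up data set; the names TEST, FLAGGED, and SUBSET and the record numbers are only for illustration.

```sas
/* Hypothetical sample data: ten records with an ID */
data test;
  do id = 1 to 10;
    output;
  end;
run;

/* Mark the special records with a one-byte character flag */
data flagged;
  set test;
  length flag $ 1;
  if _n_ in (3 7) then flag = 'Y';
  else flag = 'N';
run;

/* Keep only the flagged records */
data subset;
  set flagged;
  if flag = 'Y';
run;
```

The flag takes one byte per record instead of the default eight for a numeric, at the cost of writing out the comparison in full.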

-- JackHamilton@FirstHealth.com Manager, Technical Development Metrics Department, First Health West Sacramento, California USA

>>> <ben.powell@CLA.CO.UK> 08/09/2004 6:32 AM >>>
To mark specific records you could use a flag, i.e., set variable flag = 1 or 0 depending on whether it is one of your select values or not.

For instance,

data test;
  set test;
  /* Or use another method to select the records, for instance if they are in a certain range. */
  if _N_ in(7 90 100 101 103 400 523 633 635 800) then flag = 1;
  else flag = 0;
run;

data subset; set test; if flag = 1; run;

or

proc freq data = test (where=(flag=1));table zipcode;run;

Incidentally, the final suggestion above will solve your second problem and give you a distinct list of all zipcodes should you remove the where clause. Alternatively you could create an output dataset:

proc freq data = test; table zipcode / out = tf; run;

Alternatively, try sql:

proc sql;
  create table z_count as
  select distinct zipcode, count(*) as qty
  from test
  group by zipcode;
quit;

HTH.

On Sat, 7 Aug 2004 13:19:08 -0700, Se Yan <laseyan@GMAIL.COM> wrote:

>1. Is there any method in SAS that can mark some specific records in
>my dataset? I have a dataset with 10000 records, but i have ten
>records that are special and need to be marked, so that in the further
>procedures I can choose to include or exclude these records, but i
>don't want to split the dataset into two files. How can I do that?
>
>2. I have 10000 individual records, but there is a variable "zipcode",
>which only has 500 different values, i.e, average 20 persons per
>zipcode in the dataset. Which command can give me the list of these
>500 different values of zipcodes?
>
>I know the problems are quite basic. BUt i madly need help!

