LISTSERV at the University of Georgia
Date:   Tue, 14 Aug 2007 13:39:22 -0700
Reply-To:   David L Cassell <davidlcassell@MSN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   David L Cassell <davidlcassell@MSN.COM>
Subject:   Re: newbie question
In-Reply-To:   <46A8AC22.1000909@vdh.virginia.gov>
Content-Type:   text/plain; format=flowed

Carolyn.Halbert@VDH.VIRGINIA.GOV wrote:
>Hi Guys,
>I am a newbie so please be kind.

As you know by now, we are remarkably kind for an internet gathering. (Well, except for me. I'm a crabby old statistician.) So welcome to the group.

> I am trying to clean some data
>variables that are suppossed to have only data 1-9 and have some
>extraneous letters and 2 digit numbers out of that range. I don't quite
>understand where to put the space and multiple tries keep coming up with
>this error statement.
>I am also trying to do some cleaning using a where statement and only
>sometimes can get it to run.
>
>i.e. where stage00 in ('1','2','3',);
>or where stage00 in ('1'-'3',);
>
>Comments? Suggestions?
>Thanks
>
>Code:
>data work.debug;
> set work.rmcds;
> if stage00='1-9' then keep;
> if stage00 ne '1-9' then drop;
> run;
>
>
>Error statement NOTE 49-169: The meaning of an identifier after a quoted
>string may change in a future SAS release. Inserting white space
>between a quoted string and the succeeding identifier is recommended.

Let me add a couple thoughts that I have not seen covered by all the wise posters who have chipped in.

KEEP and DROP are tools for deciding which variables to hang onto, not tools for deleting rows out of your data set. For that, you want the DELETE statement.
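As a minimal sketch of the DELETE approach, here is your data step reworked, assuming STAGE00 is a character variable and that the only good values are the single digits 1 through 9 (taken from your description). The data set names are the ones from your post.

```sas
data work.debug;
   set work.rmcds;
   /* DELETE throws away the current row and moves on to the next one. */
   /* Any value not in the list (letters, 2-digit numbers) is dropped. */
   if stage00 not in ('1','2','3','4','5','6','7','8','9') then delete;
run;
```

Trailing blanks do not matter in this comparison, but a value with a leading blank would not match the list and would be deleted.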

But you can just use the IF statement alone for this. If we say:

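if stage00 =: '1';

then this acts like a subsetting tool for us. It will only keep the records where STAGE00 starts with 1. (The colon after the equals sign is what makes it a "starts with" comparison. A plain = would keep only the records where STAGE00 is exactly 1.)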
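Here is a small sketch of the subsetting IF in action. The data set WORK.DEMO and its four values are made up for illustration. It compares the "starts with" operator =: against a plain = so you can see which rows each one keeps.

```sas
/* Made-up test values, not from the original data */
data work.demo;
   input stage00 $2.;
   datalines;
1
12
2
1A
;
run;

data work.starts_with_1;
   set work.demo;
   if stage00 =: '1';   /* keeps 1, 12 and 1A: only the first character is compared */
run;

data work.exactly_1;
   set work.demo;
   if stage00 = '1';    /* keeps only 1: the whole value is compared */
run;
```

A subsetting IF has no THEN clause. When the condition is false, SAS stops processing that row and does not write it to the output data set.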

If you are looking to spot things with a single digit only in a string, you can also use some of the more advanced tools. If you have seen regular expressions before then this might be useful:

if prxmatch('/^\s*\d\s*$/', stage00);

This allows for spaces either before or after the single digit, and makes sure that there are no other nuisances in your string STAGE00. Here's how it works:

   /       start regex
   ^       make it match at the start of the string
   \s*     optional spaces/tabs at the front
   \d      1 digit only (0-9 here even though you said 1-9)
   \s*     optional spaces at the end
   $       make sure it matches the end of the string with no extra crud afterward
   /       end of regex

If you really need exactly 1-9 and not the zero, then we can change this like so:

if prxmatch('/^\s*[1-9]\s*$/', stage00);

Here's the explanation:

   /       start regex
   ^       make it match at the start of the string
   \s*     optional spaces/tabs at the front
   [1-9]   1 digit only, and it must be 1-9
   \s*     optional spaces at the end
   $       make sure it matches the end of the string with no extra crud afterward
   /       end of regex
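To see the [1-9] pattern at work, here is a sketch that splits some made-up values into a clean pile and a reject pile. The data set names WORK.DEMO, WORK.CLEAN and WORK.BAD and the test values are assumptions for illustration only. The $CHAR4. informat is used so that the leading blank in the second value is preserved.

```sas
/* Made-up test values: the second line has a leading blank */
data work.demo;
   infile datalines truncover;
   input stage00 $char4.;
   datalines;
1
 7
0
12
A
3B
;
run;

data work.clean work.bad;
   set work.demo;
   /* PRXMATCH returns the match position, or 0 if there is no match */
   if prxmatch('/^\s*[1-9]\s*$/', stage00) then output work.clean;
   else output work.bad;
run;
```

Under these assumptions, 1 and the blank-padded 7 should land in WORK.CLEAN, and 0, 12, A and 3B should land in WORK.BAD. Keeping the rejects in their own data set lets you eyeball what was thrown out before you trust the cleaning rule.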

Of course, if the string can only be 2 characters, or there are limits on where the spaces can be, then we can simplify this. But SAS has a host of string handling tools.
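As one example of those other tools, here is a sketch that gets the same 1-9 check without a regex, using STRIP, LENGTHN and VERIFY (all available in SAS 9). This is just one possible simplification, again assuming STAGE00 is character and using the data set names from the original post.

```sas
data work.debug;
   set work.rmcds;
   /* STRIP removes leading and trailing blanks.                    */
   /* LENGTHN = 1 means exactly one character is left (0 if blank). */
   /* VERIFY = 0 means every remaining character is in '123456789'. */
   if lengthn(strip(stage00)) = 1
      and verify(strip(stage00), '123456789') = 0;
run;
```

LENGTHN is used rather than LENGTH because LENGTH returns 1 for an all-blank value, which would let blanks slip through the check.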

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330


