| Date: | Tue, 14 Aug 2007 13:39:22 -0700 |
| Reply-To: | David L Cassell <davidlcassell@MSN.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | David L Cassell <davidlcassell@MSN.COM> |
| Subject: | Re: newbie question |
| In-Reply-To: | <46A8AC22.1000909@vdh.virginia.gov> |
| Content-Type: | text/plain; format=flowed |
|---|
Carolyn.Halbert@VDH.VIRGINIA.GOV wrote:
>Hi Guys,
>I am a newbie so please be kind.
As you know by now, we are remarkably kind for an internet gathering.
(Well, except for me. I'm a crabby old statistician.) So welcome to the
group.
> I am trying to clean some data
>variables that are supposed to have only data 1-9 and have some
>extraneous letters and 2 digit numbers out of that range. I don't quite
>understand where to put the space and multiple tries keep coming up with
>this error statement.
>I am also trying to do some cleaning using a where statement and only
>sometimes can get it to run.
>
>i.e. where stage00 in ('1','2','3',);
>or where stage00 in ('1'-'3',);
>
>Comments? Suggestions?
>Thanks
>
>Code:
>data work.debug;
> set work.rmcds;
> if stage00='1-9' then keep;
> if stage00 ne '1-9' then drop;
> run;
>
>
>Error statement NOTE 49-169: The meaning of an identifier after a quoted
>string may change in a future SAS release. Inserting white space
>between a quoted string and the succeeding identifier is recommended.
Let me add a couple thoughts that I have not seen covered by all the
wise posters who have chipped in.
KEEP and DROP are tools for deciding which variables to hang onto, not
tools for deleting rows out of your data set. For that, you want the
DELETE statement.
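As a minimal sketch of the DELETE approach, reusing your data set names and
assuming STAGE00 is a character variable with its value left-aligned (no
leading blanks), something like this should drop the unwanted rows:

```sas
data work.debug;
  set work.rmcds;
  /* DELETE throws away the current row; KEEP and DROP only choose columns. */
  /* Assumption: STAGE00 is character and left-aligned, e.g. '1 ' not ' 1'. */
  if stage00 not in ('1','2','3','4','5','6','7','8','9') then delete;
run;
```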
But you can just use the IF statement alone for this. If we say:
if stage00 =: '1';
then this acts like a subsetting tool for us. It will only keep the records
where STAGE00 starts with 1. (The colon after the equals sign tells SAS to
compare only the leading characters.)
If you are looking to spot things with a single digit only in a string, you
can also use some of the more advanced tools. If you have seen regular
expressions before then this might be useful:
if prxmatch('/^\s*\d\s*$/', stage00);
This lets the match tolerate spaces either before or after the single digit,
and makes sure that there are no other nuisances in your string STAGE00.
Here's how it works:
/      start regex
^      make it match at the start of the string
\s*    optional spaces/tabs at the front
\d     1 digit only (0-9 here even though you said 1-9)
\s*    optional spaces at the end
$      make sure it matches the end of the string with no extra crud afterward
/      end of regex
If you really need exactly 1-9 and not the zero, then we can change this
like so:
if prxmatch('/^\s*[1-9]\s*$/', stage00);
Here's the explanation:
/      start regex
^      make it match at the start of the string
\s*    optional spaces/tabs at the front
[1-9]  1 digit only, and it must be 1-9
\s*    optional spaces at the end
$      make sure it matches the end of the string with no extra crud afterward
/      end of regex
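To see the pattern in a whole DATA step, here is a rough sketch that sends
the clean rows to one data set and everything else to another so you can eyeball
the rejects. The test values and the names WORK.CLEAN and WORK.SUSPECT are made
up for illustration; swap in your own.

```sas
/* Hypothetical test data: a 2-character STAGE00 with good and bad values. */
data work.rmcds;
  input stage00 $char2.;   /* $CHAR keeps any leading blank */
  datalines;
1
9
0
A
12
 3
;
run;

/* PRXMATCH returns the match position, or 0 when there is no match, */
/* so it can be used directly as the IF condition.                   */
data work.clean work.suspect;
  set work.rmcds;
  if prxmatch('/^\s*[1-9]\s*$/', stage00) then output work.clean;
  else output work.suspect;
run;
```

SAS pads character variables with trailing blanks, which is why the \s* before
the $ matters here.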
Of course, if the string can only be 2 characters, or there are limits on
where the spaces can be, then we can simplify this. But SAS has a host of
string handling tools.
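For instance, one possible simplification (just a sketch, assuming SAS 9 or
later for STRIP, and assuming stray blanks are the only thing to tolerate
around the digit) skips the regex entirely:

```sas
data work.debug;
  set work.rmcds;
  /* STRIP removes leading and trailing blanks before the comparison. */
  /* Subsetting IF: only rows where the condition is true are kept.   */
  if strip(stage00) in ('1','2','3','4','5','6','7','8','9');
run;
```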
HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330