Date: Tue, 2 Feb 2010 13:09:49 -0600
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: help with counting occurences of string
In-Reply-To: <E2B5EBDD92CF104E9AC69D6B563466C903273756@EDUNIVMAIL11.ad.umassmed.edu>
Content-Type: text/plain; charset=windows-1252
I'd do a PROC FREQ of protocolid*topic_code and write it out to a dataset. Then you
could subset that dataset to only the topic_codes you care about ('0', '1a',
etc.), which leaves you with counts for just the protocolid/topic_code pairs
of interest; you can then do whatever analysis you want from there
[generally BY protocolid].
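
A minimal sketch of that idea, not tested against your data. The dataset name HAVE and the output names CODE_COUNTS and WANTED_COUNTS are assumptions, and the list of codes is only an example taken from the values mentioned in this thread:

```sas
/* Count the rows for every protocolid/topic_code pair.        */
/* HAVE is an assumed name for the pre-transposed dataset.     */
proc freq data=have noprint;
    tables protocolid*topic_code / out=code_counts (drop=percent);
run;

/* Keep only the codes of interest (example list). The match   */
/* is on the whole value, so '1ai' does not pick up '1aii'.    */
data wanted_counts;
    set code_counts;
    where topic_code in ('0', '1ai', '1aii', '2c');
run;
```

The OUT= dataset has one row per protocolid/topic_code combination with the frequency in the variable COUNT, so any later step can run BY protocolid after a sort.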
Are you just looking at the topic_code value as a whole, or is there a need
to 'search' inside of the strings?
-Joe
On Tue, Feb 2, 2010 at 12:58 PM, Simon, Lorna <Lorna.Simon@umassmed.edu> wrote:
> Thanks, this helps. I do just want the count. Your code would give me what
> I want, but you are right: I do have a lot of codes (strings) to search
> for. I do have a pre-transposed dataset. It contains the fields codingid
> (which I don't need), topic_code and protocolid. I need to look at the
> occurrences of the strings within each topic_code for each protocolid. The
> data look like this:
>
> Codingid  topic_code  protocolid
> 8900      1ai         98a1
> 8901      1aii        98a2
> 8902      2c          98a3
> 9000      10c         5b1
> 9001      23ci        5a10
>
> Etc.
>
> Thank you so much for your help.
>
>
>
> *From:* Joe Matise [mailto:snoopy369@gmail.com]
> *Sent:* Tuesday, February 02, 2010 1:44 PM
> *To:* Simon, Lorna
> *Cc:* SAS-L@listserv.uga.edu
> *Subject:* Re: help with counting occurences of string
>
>
>
> What do you want as output? Just the count?
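> The if/else inside your array loop is the issue: the else resets none on
> every pass, so the final value only reflects the last column. Set none=0
> once before the loop, then add to it inside the loop:
> array codes {*} col1-col571;
> none = 0;
> do i = 1 to dim(codes);
>   if codes{i} = "0" then none = none + 1;
> end;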
>
> That will count up the occurrences. To do multiple values, either have
> another array of search values that you iterate through separately, with a
> matching array of counts, or create a hash table and do lookups against it.
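>
> A rough sketch of the two-array version, untested. The dataset name WIDE,
> the four example codes and the n_ count variables are all assumptions for
> illustration; col1-col571 are assumed to be character variables:
>
> ```sas
> data counts;
>     set wide;                          /* assumed name of the transposed dataset */
>     array codes {*} col1-col571;       /* the columns to search                  */
>     /* search values (example list only) and one counter per value              */
>     array targets {4} $ 8 _temporary_ ('0' '1ai' '1aii' '2c');
>     array n_found {4} n_0 n_1ai n_1aii n_2c;
>     do j = 1 to dim(targets);
>         n_found{j} = 0;                /* reset the counter for every row        */
>         do i = 1 to dim(codes);
>             /* whole-value comparison, so '1ai' does not match '1aii'           */
>             if codes{i} = targets{j} then n_found{j} = n_found{j} + 1;
>         end;
>     end;
>     drop i j;
> run;
> ```
>
> Each output row then carries one count per search value. With many search
> values the nested loop gets slow, which is where the hash lookup comes in.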
>
> It would be a lot easier if you had the pre-transposed dataset, or even if
> you just re-transposed it yourself. I have a feeling that if you have a lot
> of strings to search for, it's going to take way too long to do it this way
> (horizontal).
>
> -Joe
>
> On Tue, Feb 2, 2010 at 12:39 PM, Simon, Lorna <Lorna.Simon@umassmed.edu>
> wrote:
>
> I have a transposed dataset which has variables col1-col571. I need to
> search all of these variables for the occurrence of various strings, e.g.
> 0, 1ai, 1aii, 1d, 2a, 2b, 2c, etc. I have looked at SAS functions and
> cannot find anything there. I think the problem is that the
> functions all seem to call for one variable to search and I have 571. I
> worked with another transposed dataset like this where I had variables
> col1-col30 and I just used if statements like this:
> If col1="0" or col2="0" or col3="0" or col4="0" etc. Obviously I can't
> do this for 571 variables. The other wrinkle is that some of the strings
> I want to differentiate between are contained in other strings. For
> example, I need to differentiate between "1aii" and "1ai", but "1ai" is
> contained in "1aii".
>
> I've also tried using an array statement:
> Array codes {*} col1-col571;
> Do i=1 to dim(codes);
> If codes{i}="0" then none=1;
> Else none=0;
> End;
>
> This just returns a value of 0 for every case.
>
> Is there a function that would do this? Do I need a macro? I'm really
> beyond my depth. Can anyone help? Thank you.
>
>
>