| Date: | Mon, 29 Jul 2002 11:45:30 -0700 |
| Reply-To: | paula D <sophe@USA.NET> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | paula D <sophe@USA.NET> |
| Organization: | http://groups.google.com/ |
| Subject: | Consolidating categorical data |
| Content-Type: | text/plain; charset=ISO-8859-1 |
I have one character variable (slogan) with values like this:
visa1-mast2-amex3;
visa2-mast3-amex9;
mast7-visa2-amex8;
....
that is, every value is made up of 3 components drawn at random from
(visa1 to visa50, mast1 to mast50, amex1 to amex150), joined with 2 "-"
separators. A component may repeat, so it is OK if the value is
visa1-visa1-visa1.
My problem is that if any 2 values contain the same 3 components in a
different order, such as
visa1-mast2-amex3 and mast2-visa1-amex3,
I want to treat them as the same, taking 1 of the 2 values to
represent them.
So far I have tried one idea. It works, but it is too complicated. The
idea is to assign a value to each component in such a way that
1. comp1=substr(slogan, 1, 5); comp2=substr(slogan, 7,5);
comp3=substr(slogan, 13,5);
2. if comp1="visa1" then tag1=1; else if comp1="visa2" then tag1=10;
else if comp1="visa3" then tag1=100......;
3. if comp2="visa1" then tag2=1; else if comp2="visa2" then tag2=10;
else if comp2="visa3" then tag2=100......;
then the sum of tag1, tag2 and tag3 is unique to each combination of
components in slogan, whatever their order (in this sense, the magnitude
does not have to be 10). Finally, I format the totals back to the
original slogan values.
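A possibly simpler route than the tagging scheme, sketched here only as
an illustration: split each value into its components, sort the
components, and rejoin them, so that every ordering of the same 3
components produces the same key. The dataset names (slogans, keyed,
unique_slogans) and the variable slogan_key are hypothetical, and the
sketch assumes a SAS release that provides CALL SORTC and CATX (9.2 or
later). It uses SCAN rather than fixed-position SUBSTR because the
components vary in length (visa1 is 5 characters, amex150 is 7).

```sas
/* Hypothetical sample data in the format described above. */
data slogans;
  input slogan :$23.;
  datalines;
visa1-mast2-amex3
mast2-visa1-amex3
mast7-visa2-amex8
visa1-visa1-visa1
;
run;

/* Build an order-independent key for each slogan. */
data keyed;
  set slogans;
  length c1-c3 $7 slogan_key $23;   /* longest component is amex150 */
  array c{3} $ c1-c3;
  do i = 1 to 3;
    c{i} = scan(slogan, i, '-');    /* i-th component, any length */
  end;
  call sortc(of c1-c3);             /* sort the 3 components in place */
  slogan_key = catx('-', of c1-c3); /* rejoin in sorted order */
  drop i c1-c3;
run;

/* Keep one representative row per distinct combination. */
proc sort data=keyed out=unique_slogans nodupkey;
  by slogan_key;
run;
```

With the sample rows, visa1-mast2-amex3 and mast2-visa1-amex3 both get
the key amex3-mast2-visa1, so only one of them survives the NODUPKEY
sort. The key sorts alphabetically, so the representative is not
necessarily one of the original spellings; keeping the slogan variable
alongside slogan_key preserves one original value per group.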
I would appreciate it if someone could help with a better solution. I
guess I am not the first person who has ever needed to do this, but I
have no idea what keywords to use to search SAS-L for previous
postings on this topic. Thank you in advance.
Again, my old email sophe@usa.net is no longer in use. Please use
sophe88@yahoo.com for emails. Thanks.
Paula D