Date: Wed, 28 Oct 2009 19:16:26 -0700
Reply-To: xlr82sas <xlr82sas@AOL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: xlr82sas <xlr82sas@AOL.COM>
Organization: http://groups.google.com
Subject: Re: Decode a string?
Content-Type: text/plain; charset=ISO-8859-1
On Oct 28, 1:20 pm, snoopy...@GMAIL.COM (Joe Matise) wrote:
> What's the benefit of doing it as a user-written function over a macro - is
> there a significant performance gain?
>
> -Joe
>
> On Wed, Oct 28, 2009 at 2:56 PM, Muthia Kachirayan <
>
>
>
> muthia.kachira...@gmail.com> wrote:
> > User-written function will be very handy here.
>
> > proc format lib=work;
> > value xx 1='ABC' 2='XYZL' 3='FSFGS';
> > quit;
>
> > options cmplib = work.funcs;
> > proc fcmp outlib = work.funcs.str;
> > function n2s(x $) $ 500;
> > length newx $500 ;
> > len = length(x);
> > do _n_ = 1 to len;
> > N = input(substr(x,_n_,1),1.);
> > newx = catx(',', newx, put(N, xx.));
> > end;
> > return(newx);
> > endsub;
>
> > The function, n2s(), is called in the following data step.
>
> > data test;
> > input x $;
> > newx = n2s(x);
> > datalines;
> > 123
> > 112
> > 12
> > 222
> > 1
> > 11
> > ;
> > run;
>
> > The output of data set, TEST :
>
> > Obs x newx
> > 1 123 ABC,XYZL,FSFGS
> > 2 112 ABC,ABC,XYZL
> > 3 12 ABC,XYZL
> > 4 222 XYZL,XYZL,XYZL
> > 5 1 ABC
> > 6 11 ABC,ABC
>
> > Muthia Kachirayan
>
> > On Wed, Oct 28, 2009 at 8:46 AM, Mike Rhoads <RHOAD...@westat.com> wrote:
>
> > > Or how about doing it with a user-written function, if you have 9.2?
>
> > > I seem to remember that these can be called from within PROC SQL, but
> > > haven't actually checked.
>
> > > Mike Rhoads
> > > Rhoad...@Westat.com
>
> > > -----Original Message-----
> > > From: SAS(r) Discussion [mailto:SA...@LISTSERV.UGA.EDU] On Behalf Of
> > > Huang, Ya
> > > Sent: Tuesday, October 27, 2009 5:32 PM
> > > To: SA...@LISTSERV.UGA.EDU
> > > Subject: Re: Decode a string?
>
> > > Tranwrd() or Transtrn() like you said, when maximum number of value is
> > > assumed, can make a one-liner.
> > > Another assumption is that the decoded string can not have the numbers in
> > > original string:
>
> > > 223 data test;
> > > 224 input x $;
> > > 225 format newx $500.;
> > > 226 newx =
> > > 226!
> > > tranwrd(tranwrd(tranwrd(x,'1',trim(put(1,xx.-l))||','),'2',trim(put(2,xx
> > > .-l))||','),'3',trim
> > > 226! (put(3,xx.-l))||',');
> > > 227 put x= newx=;
> > > 228 datalines;
>
> > > x=123 newx=ABC,XYZL,FSFGS,
> > > x=112 newx=ABC,ABC,XYZL,
> > > x=12 newx=ABC,XYZL,
> > > x=1 newx=ABC,
> > > x=11 newx=ABC,ABC,
>
> > > ________________________________
>
> > > From: Joe Matise [mailto:snoopy...@gmail.com]
> > > Sent: Tuesday, October 27, 2009 2:07 PM
> > > To: Huang, Ya
> > > Cc: SA...@listserv.uga.edu
> > > Subject: Re: Decode a string?
>
> > > Not that I could imagine, in a single step, unless you could make some
> > > assumptions about maximum number of values. But I'm far from a SQL
> > expert.
>
> > > A non-elegant solution... (this could be macroed, I guess, if you have
> > more
> > > than a few long possibly):
>
> > > proc format lib=work;
> > > value xx
> > > 1='ABC'
> > > 2='XYZL'
> > > 3='FSFGS'
> > > ;
> > > quit;
>
> > > data test;
> > > input x $;
> > > format newx $500.;
> > > newx = put(input(substr(x,1,1),BEST.),xx.);
> > > if length(x) ge 2 then do _t = 2 to length(x);
> > > newx=cats(newx,',',put(input(substr(x,_t,1),BEST.),xx.));
> > > end;
> > > put x= newx=;
> > > datalines;
> > > 123
> > > 112
> > > 12
> > > 1
> > > 11
> > > ;
> > > run;
> > > proc sql;
> > > select catx(',',put(input(substr(x,1,1),BEST.),xx.),
> > > case when length(x) ge 2 then put(input(substr(x,2,1),BEST.),xx.) end ,
> > > case when length(x) ge 3 then put(input(substr(x,3,1),BEST.),xx.) end
> > > )
> > > from test;
> > > quit;
>
> > > -Joe
>
> > > On Tue, Oct 27, 2009 at 3:58 PM, Ya Huang <ya.hu...@amylin.com> wrote:
>
> > > Thanks to Joe and _null_. Wonder if someone can come up with
> > > a non-looping one, so it can be used in SQL?
>
> > > On Tue, 27 Oct 2009 14:36:40 -0500, Joe Matise <
> > snoopy...@GMAIL.COM>
> > > wrote:
>
> > > >Gah, yes, was trying to think of 'how do I put commas in between'
> > > and
> > > >somehow managed to overlook CATX, which I probably use half a
> > dozen
> > > times
> > > >per day... not enough caffeine ...
>
> > > >Why vvalue() over put()? Not sure they're all that different in
> > > terms of
>
> > > >conciseness... is vvalue() faster at execution time?
>
> > > >-Joe
>
> > > >On Tue, Oct 27, 2009 at 2:31 PM, Data _null_; <
> > iebup...@gmail.com>
> > > wrote:
>
> > > >> If you take advantage of the features of CATX and use VVALUE
> > this
> > > can
> > > >> be written slightly more concisely.
>
> > > >> proc format lib=work;
> > > >> value xx 1='ABC' 2='XYZL' 3='FSFGS';
> > > >> quit;
>
> > > >> data test;
> > > >> input x $;
> > > >> length newx $500;
> > > >> do _n_ = 1 to length(x);
> > > >> xN = input(substr(x,_n_,1),F1.);
> > > >> newx = catx(',',newx,vvalue(xN));
> > > >> end;
> > > >> format xN xx.;
> > > >> drop xN;
> > > >> datalines;
> > > >> 123
> > > >> 112
> > > >> 12
> > > >> 222
> > > >> 1
> > > >> 11
> > > >> ;
> > > >> run;
> > > >> proc print;
> > > >> run;
>
> > > >> On 10/27/09, Joe Matise <snoopy...@gmail.com> wrote:
>
> > > >> > proc format lib=work;
> > > >> > value xx
> > > >> > 1='ABC'
> > > >> > 2='XYZL'
> > > >> > 3='FSFGS'
> > > >> > ;
> > > >> > quit;
>
> > > >> > data test;
> > > >> > input x $;
> > > >> > format newx $500.;
> > > >> > newx = put(input(substr(x,1,1),BEST.),xx.);
> > > >> > if length(x) ge 2 then do _t = 2 to length(x);
> > > >> > newx=cats(newx,',',put(input(substr(x,_t,1),BEST.),xx.));
> > > >> > end;
> > > >> > put x= newx=;
> > > >> > datalines;
> > > >> > 123
> > > >> > 112
> > > >> > 12
> > > >> > 1
> > > >> > 11
> > > >> > ;
> > > >> > run;
>
> > > >> > Probably could also have some fun with TRANWRD, or hash
> > > solution if you
> > > >> > don't need to use a format.
>
> > > >> > -Joe
>
> > > >> > On Tue, Oct 27, 2009 at 2:04 PM, Ya Huang <
> > ya.hu...@amylin.com>
> > > wrote:
>
> > > >> > > Hi there,
>
> > > >> > > Assuming I have a format, something like this.
>
> > > >> > > proc format;
> > > >> > > value xx
> > > >> > > 1='ABC'
> > > >> > > 2='XYZL'
> > > >> > > 3='FSFGS'
> > > >> > > ;
>
> > > >> > > Now I have a string '213', and I want a decoded string
> > > >> 'XYZL,ABC,FSFGS'.
> > > >> > > The string can be '123','312','223','1','21' etc., which is
> > > dynamically
> > > >> > > generated. The decoded string should follow the order of
> > the
> > > original
> > > >> > > number in the string if more than one numbers are in the
> > > original
> > > >> string.
>
> > > >> > > Looking for some elegant solutions.
>
> > > >> > > Thanks
>
> > > >> > > Ya- Hide quoted text -
>
> - Show quoted text -
This is a much less elagant algorithm Joe's or data_null's and I would
not do it.
The only way to do this without looping would be to create a long and
skinny dataset ie variables group and onechr
If your string is a maximum of N characters wide then you would create
up to N output records per input observation.
This would require at least N sas statements.
Next you coud probably use a DOW structure(with notsorted option) to
lookup, format and concatenate.
There is probably a 'proc transpose soluton somewhere in her.
|