LISTSERV at the University of Georgia
Date:   Mon, 27 Dec 2010 21:30:44 -0600
Reply-To:   Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US>
Subject:   Re: Is there an easier way to solve this?
Content-Type:   text/plain; charset="iso-8859-1"

I've been following this thread and thinking the same. It seems that to make the code robust, you need to account for data-entry problems, unless you have a QA/QC program on the front end.

Warren Schlechte
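
To make the data-entry concern concrete, here is a minimal sketch of a check that could run before any of the parsing approaches quoted below. It assumes the third line should end in a two-letter state and a five-digit zip; the dataset name (checked), the variables line and problem, and the two deliberately bad sample rows are made up for illustration and do not come from the thread. It reads only the city/state/zip line, not the full three-line record.

```sas
/* Sketch only: flag third-line records that do not end in STATE ZIP.  */
/* Dataset and variable names (checked, line, problem) are made up.    */
data checked (drop=_:);
  infile cards truncover;
  length line $80 city $40 state $2 zip $5 problem $30 _state _zip $80;
  input line $char80.;
  _n     = countw(line, ' ');        /* blank-delimited word count      */
  _zip   = scan(line, -1, ' ');      /* last word                       */
  _state = scan(line, -2, ' ');      /* next-to-last word               */
  if _n < 3 then problem = 'fewer than 3 words';
  else if length(_zip) ne 5 or notdigit(trim(_zip)) then
    problem = 'zip is not 5 digits';
  else if length(_state) ne 2 or notalpha(trim(_state)) then
    problem = 'state is not 2 letters';
  else do;
    zip   = _zip;
    state = _state;
    /* locate the state word so everything before it becomes the city  */
    call scan(line, -2, _pos, _len, ' ');
    city  = substr(line, 1, _pos - 1);
  end;
cards;
New York NY 85044
Vienna   VA   22124
Springfield 85044
The City of New York NY 8504A
;
run;

proc print data=checked;
run;
```

The blank is passed explicitly as the delimiter because scan() and countw() otherwise also split on characters such as commas and hyphens. The temporary _state and _zip variables are kept long so that an over-length value is flagged instead of being silently truncated to fit.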

-----Original Message-----
From: toby dunn [mailto:tobydunn@HOTMAIL.COM]
Sent: Mon 12/27/2010 8:05 PM
Subject: Re: Is there an easier way to solve this?

Art,

Since you are essentially dealing with a free-form text string, I have a question: is there a possibility that any of the three parts you want to parse will be missing?

Toby Dunn

"I'm a hell bent 100% Texan til I die"

"Don't touch my Willie, I don't know you that well"

> Date: Mon, 27 Dec 2010 20:14:40 -0500
> From: art297@ROGERS.COM
> Subject: Re: Is there an easier way to solve this?
> To: SAS-L@LISTSERV.UGA.EDU
>
> Max,
>
> Tonight you get a free lesson.
>
> When you want to know if something is more or less efficient, DON'T ask the
> list! Just build some code that allows you to make the two (or however
> many) sets of code comparable and run them.
>
> If you find something that the rest of us might be interested, then share
> the results.
>
> Thus, for your current question, wouldn't something like the following tell
> you what you want to know?:
>
> data _null_;
> file "c:\testdata.txt";
> input;
> do i=1 to 100000;
> put _infile_;
> end;
> cards;
> Uptown, Smalltown XX 12345
> The City of New York Big Apple Big Apple NY 85044
> ;
> run;
>
> data address;
> length city $40 state $2 zip $5;
> informat zip $revers5.;
> informat state $revers2.;
> informat city $revers40.;
> infile "c:\testdata.txt";
> input @;
> _infile_=reverse(_infile_);
> input zip state city &;
> run;
>
> data b;
> Length zip $ 5 state $ 2 city $ 40;
> infile "c:\testdata.txt";
> input;
> Zip=scan(_infile_, -1);
> State=scan(_infile_, -2);
> City=tranwrd(_infile_, state||" "||zip, " ");
> run;
>
> The log will tell you everything that you want to know and you, then, can
> decide if it is just something to learn or something you should share.
>
> In the present case, it might just be worth sharing.
>
> Art
> -------
> On Mon, 27 Dec 2010 18:26:35 -0500, bbser 2009 <bbser2009@GMAIL.COM> wrote:
>
> >Nat
> >
> >
> >Yes. And alternative to my first code using tranwrd(), maybe it would be
> >robust to adjust it like this (using tranwrd() twice instead of just once):
> >
> >...
> >zip=scan(x, -1);
> >temp=tranwrd(x, zip, " "); *First get rid of the value of zip from the full
> >string in x;
> >state=scan(temp, -1); *It is minus one, not minus two;
> >city=tranwrd(temp, state, " "); *Secondly get rid of state from temp;
> >...
> >
> >I guess this will get rid of the problem of varying number of blanks and is
> >better than my second code where I used scan() and catx().
> >Now I am wondering, how this compares to Søren's in term of efficiency?
> >
> >Max
> >
> >-----Original Message-----
> >From: Nat Wooding [mailto:nathani@verizon.net]
> >Sent: December-27-10 5:28 PM
> >To: 'bbser 2009'
> >Subject: RE: [SAS-L] Is there an easier way to solve this?
> >
> >Max
> >
> >I don't use Tranwrd enough to know all of its nuances. If you used
> >
> >Record = compbl( record );
> >
> >You would get rid of extra blanks.
> >
> >Nat
> >
> >-----Original Message-----
> >From: bbser 2009 [mailto:bbser2009@gmail.com]
> >Sent: Monday, December 27, 2010 5:04 PM
> >To: 'Nat Wooding'
> >Cc: SAS-L@LISTSERV.UGA.EDU
> >Subject: RE: [SAS-L] Is there an easier way to solve this?
> >
> >Nat
> >
> >Thanks for let me know the "continue" statement. Glad to add it to my
> >arsenal.
> >As for my earlier code, i just thought it might not be robust or something.
> >For example, if some of the records like below have two more spaces between
> >NY and 111111.
> >
> >xxxx xxx xx NY   111111
> >
> >Then using tranwrd(record, a||""||b, "") does not replace "NY 111111"
> >totally with blanks.
> >
> >Max
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
> >Wooding
> >Sent: December-27-10 4:08 PM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: [SAS-L] Is there an easier way to solve this?
> >
> >Max
> >
> >SAS fussed at me until I made the variable for City longer (I used 50).
> >
> >Your earlier solution was simpler but this does work. One thing that I would
> >do would be to stop the loop as soon as it got to the end of the words in
> >the string as in the following code.
> >
> >Nat
> >
> >do i=1 to 20;
> > word[i]=scan(x,-i);
> > if word[i]='' then continue;**<<<< new line;
> >end;
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of bbser
> >2009
> >Sent: Monday, December 27, 2010 3:42 PM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: Is there an easier way to solve this?
> >
> >Got typos in the code. Here is the new one.
> >------------------
> >How about this below? It looks "elementary" for newbies like me and seemly
> >works fine for whatever the longest USA city names.
> >
> >Max
> >
> >----------
> >data a;
> > keep x zip state city;
> > x="The City of New York Big Apple Big Apple NY 85044";
> > Length zip $ 5 state $ 2 city $ 30;
> > array word{20} $ 15;
> > do i=1 to 20;
> > word[i]=scan(x,-i);
> > end;
> > zip=word[1];
> > state=word[2];
> > city=catx("", of word20-word3);
> >run;
> >proc print;
> >run;
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
> >Wooding
> >Sent: December-27-10 1:16 PM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: [SAS-L] Is there an easier way to solve this?
> >
> >Matt
> >
> >I took the liberty to send my reply to the list in case there are any bored
> >Birdies listening in.
> >
> >I totally agree that SAS is not publicizing it as well as they could but I
> >am generally unhappy with having the NLS formats and informats segregated
> >from their peers, particularly since some are very useful.
> >
> >Did you notice the companion informat
> >
> >$REVERJw.@ inputs text right to left, preserves leading and trailing
> >blanks
> > ABCD | $reverj6. | '  DCBA'
> >
> >I copied this from TS486 and not the standard docs.
> >
> >Nat
> >-----Original Message-----
> >From: matt.pettis@thomsonreuters.com [mailto:matt.pettis@thomsonreuters.com]
> >
> >Sent: Monday, December 27, 2010 12:43 PM
> >To: nathani@VERIZON.NET
> >Subject: RE: Is there an easier way to solve this?
> >
> >Thanks Nat!
> >It is indeed a nice informat to keep in your back pocket...
> >just think SAS shouldn't be hiding this light under a bushel...
> >
> >Thanks again,
> >Matt
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
> >Wooding
> >Sent: Monday, December 27, 2010 11:38 AM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: Is there an easier way to solve this?
> >
> >Matt
> >
> >Art and I spoke of this offline earlier today. 9.1.3 docs have an entry for
> >it within the normal informats but refer you to the NLS docs.
> >
> >Art and I also corresponded about the width defaulting to 1. I find that if
> >I have a length statement in the code, I do not need to supply a width.
> >
> >Nat
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Matthew
> >Pettis
> >Sent: Monday, December 27, 2010 12:28 PM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: Is there an easier way to solve this?
> >
> >That '$revers.' informat wasn't documented in my local SAS Help files. I
> >had to google it and found that it is a NLS informat. Anybody know why it
> >wouldn't be in the base help files that come with my SAS install (and I have
> >9.2).
> >
> >Just curious,
> >Thanks,
> >matt
> >
> >-----Original Message-----
> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Arthur
> >Tabachneck
> >Sent: Monday, December 27, 2010 9:42 AM
> >To: SAS-L@LISTSERV.UGA.EDU
> >Subject: Re: Is there an easier way to solve this?
> >
> >Nat pointed out to me, offline, that I had left off the critical statement
> >in Søren's code, namely the informat assignment:
> >
> > informat zip state city_soren $revers.;
> >
> >Yes, indeed, a VERY nice solution.
> >
> >Art
> >--------
> >On Mon, 27 Dec 2010 10:05:50 -0500, Arthur Tabachneck <art297@ROGERS.COM>
> >wrote:
> >
> >>Nat,
> >>
> >>Either you (or I) have had too much Excedrin or our systems are functioning
> >>differently (hmmm ..
> >>may they've had too much Excedrin). Running your code
> >>I get (what I expected), Søren's city 'k' for New York and Søren's city 'a'
> >>for Vienna.
> >>
> >>If you get something different, please email me a copy of the resulting
> >>file.
> >>
> >>Art
> >>--------
> >>On Mon, 27 Dec 2010 09:27:12 -0500, Nat Wooding <nathani@VERIZON.NET> wrote:
> >>
> >>>Art
> >>>
> >>>It looks to me that both solutions produce identical results. Try the
> >>>following (which includes a merge sans by statement!!)
> >>>
> >>>Nat
> >>>
> >>> data soren;
> >>> length firstname lastname address_SOREN $20 city_soren $40 state $2 zip $5;
> >>> informat zip state city_soren $revers.;
> >>> input firstname lastname /
> >>> address_SOREN & /
> >>> @;
> >>> _infile_=reverse(_infile_);
> >>> input zip state city_soren &;
> >>> drop zip state firstname lastname;
> >>>cards;
> >>>Lee Athnos
> >>>1215 Raintree Circle
> >>>New York NY 85044
> >>>Heidie Baker
> >>>1751 Diehl Road
> >>>Vienna VA 22124
> >>>;run;
> >>>
> >>>data Art;
> >>> Length City_art $ 40
> >>> State $ 2
> >>> Zip $ 5
> >>>;
> >>>keep city_art address_art;
> >>> infile cards ;
> >>> input FirstName $ LastName $ /
> >>> Address_art $ 1 - 20 /
> >>> @;
> >>> _infile_ = reverse(_infile_);
> >>> input zip state city_art &;
> >>> city_art=reverse(trim(city_art));
> >>> state=reverse(state);
> >>> zip=reverse(zip);
> >>> cards;
> >>>Lee Athnos
> >>>1215 Raintree Circle
> >>>New York NY 85044
> >>>Heidie Baker
> >>>1751 Diehl Road
> >>>Vienna VA 22124
> >>>;
> >>>run;
> >>>
> >>>Data Test;
> >>>merge soren art;
> >>>run;
> >>>
> >>>
> >>>-----Original Message-----
> >>>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Arthur
> >>>Tabachneck
> >>>Sent: Monday, December 27, 2010 9:16 AM
> >>>To: SAS-L@LISTSERV.UGA.EDU
> >>>Subject: Re: Is there an easier way to solve this?
> >>>
> >>>Søren,
> >>>
> >>>Definitely creates less of a dependency on Excedrin.
> >>>However, it would
> >>>require three more uses of the reverse function and, city appears strange
> >>>unless one trims the leading(following?) space:
> >>>
> >>>data Address;
> >>> Length City $ 40
> >>> State $ 2
> >>> Zip $ 5
> >>>;
> >>> infile cards ;
> >>> input FirstName $ LastName $ /
> >>> Address $ 1 - 20 /
> >>> @;
> >>> _infile_ = reverse(_infile_);
> >>> input zip state city &;
> >>> city=reverse(trim(city));
> >>> state=reverse(state);
> >>> zip=reverse(zip);
> >>> cards;
> >>>Lee Athnos
> >>>1215 Raintree Circle
> >>>New York NY 85044
> >>>Heidie Baker
> >>>1751 Diehl Road
> >>>Vienna VA 22124
> >>>;
> >>>run;
> >>>
> >>>But, definitely a nice way to accomplish the task.
> >>>
> >>>Art
> >>>--------
> >>>On Mon, 27 Dec 2010 01:11:50 -0500, Søren Lassen
> >>><s.lassen@POST.TELE.DK> wrote:
> >>>
> >>>>Art,
> >>>>How about this:
> >>>>data address;
> >>>> length firstname lastname address $20 city $40 state $2 zip $5;
> >>>> informat zip state city $revers.;
> >>>> input firstname lastname /
> >>>> Address & /
> >>>> @;
> >>>> _infile_=reverse(_infile_);
> >>>> input zip state city &;
> >>>>cards;
> >>>>John Doe
> >>>>33 10 Av.
> >>>>Uptown, Smalltown XX 12345
> >>>>;run;
> >>>>
> >>>>Regards,
> >>>>Søren
> >>>>
> >>>>On Sun, 26 Dec 2010 11:07:11 -0500, Arthur Tabachneck <art297@ROGERS.COM>
> >>>>wrote:
> >>>>
> >>>>>The following was a question that was raised on the SAS discussion forum.
> >>>>>You are confronted with data that has 3 lines per subject, but the third
> >>>>>line has variables that may contain embedded spaces, but there is only one
> >>>>>space between variables.
> >>>>>
> >>>>>The only suggestion I could think of was the one shown below. Is there an
> >>>>>easier way?
> >>>>>
> >>>>>data work.Address (drop=_:);
> >>>>> infile cards;
> >>>>> input FirstName $ LastName $ /
> >>>>> Address $ 1 - 20 /
> >>>>> _Third_Line & $80.;
> >>>>> format City $10.;
> >>>>> Zip=scan(_Third_Line,-1);
> >>>>> State=scan(_Third_Line,-2);
> >>>>> call scan(_Third_Line, -2, _position, _length);
> >>>>> City=substr(_Third_Line,1,_position-1);
> >>>>> cards;
> >>>>>Lee Athnos
> >>>>>1215 Raintree Circle
> >>>>>New York NY 85044
> >>>>>Heidie Baker
> >>>>>1751 Diehl Road
> >>>>>Vienna VA 22124
> >>>>>;
> >>>>>
> >>>>>Thanks in advance,
> >>>>>Art
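
For comparison with the scan(), tranwrd() and $revers. approaches in the thread, the same split can also be sketched with a single Perl regular expression. This technique is not used by anyone in the thread; the dataset name address_prx and the pattern are illustrative assumptions, and the pattern assumes a two-letter state followed by a five-digit zip at the end of the line. As with the check above, it reads only the city/state/zip line.

```sas
/* Sketch only: one regular expression instead of scan()/reverse().    */
/* The dataset name address_prx and the pattern are illustrative.      */
data address_prx (drop=_:);
  infile cards truncover;
  length line $80 city $40 state $2 zip $5;
  retain _re;
  /* compile once: city (anything), 2-letter state, 5-digit zip        */
  if _n_ = 1 then
    _re = prxparse('/^\s*(.*\S)\s+([A-Za-z]{2})\s+(\d{5})\s*$/');
  input line $char80.;
  if prxmatch(_re, line) then do;
    city  = prxposn(_re, 1, line);   /* capture group 1                */
    state = prxposn(_re, 2, line);   /* capture group 2                */
    zip   = prxposn(_re, 3, line);   /* capture group 3                */
  end;
  else put 'Could not parse: ' line;
cards;
New York NY 85044
Uptown, Smalltown XX 12345
The City of New York Big Apple Big Apple NY 85044
;
run;
```

Because \s+ matches any run of blanks, extra spaces between the parts should not matter, and a line that does not fit the pattern is written to the log with city, state and zip left missing. Whether this is faster or slower than the other versions is something to time on your own data, in the way Art suggests above.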

