Date: Thu, 17 Aug 2006 13:49:13 -0400
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: Is hashing the right approach for this table lookup problem?
On Thu, 17 Aug 2006 13:17:57 -0400, Peter Constantinidis
<peter@CONSTANTINIDIS.CA> wrote:
>I am thrilled at how good everyone on this list is and willing to
>share. Thank you so much to everyone and Paul!
>
>I tried Paul's code today, and it works, but I think I failed to think
>through something fundamental about how I am looking at the data, and
>his code with the array made me realize it.
>
>We have observations, and each observation has a loop done on it.
>
>The problem is, the find function is searching for a key given to it.
>It is true it is searching right to left as it should be, but it's
>searching wrong.
>
>It is doing exactly what it's told to do. Namely, if the first code in
>the array is 50000 and there's a 5-digit sequence somewhere in the
>string matching that, then it returns true even if the match is at the
>beginning of the string rather than 'near the end', and then it moves
>on to another observation because it's got what it wants, a true
>result.
>
>I cannot easily just slice off the end and 'be done' because some of
>the data entry is sloppy. Not a lot, but some.
>
>What I really need to be doing is somehow have a way to tell it to not
>return a true and move on to another observation until each and every
>code has been checked against those first 5 digits the find function
>is currently looking at.
>
>Pictorially it currently checks the entirety of:
>##################### for ##### and then returns a true, rather than just
>------------------------##### for every combination of ##### before
>incrementing to
>-----------------------#####- for every combination of ##### or
>--------------#####---------- for every combination of #####
>
>I'm not an expert at this, but I do try to learn.
>
>Do you think the right approach here is to read the length of the
>string into a variable, then substr the end 5 digits, then run the
>loop and if no match, then substr again -1 on the length, and repeat?
>
>Thanks!
I'm not exactly following this explanation, but the approach I suggested
earlier should turn up every match. I don't see why it matters whether the
process progresses left to right or right to left.
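
For illustration only, here is a minimal sketch of the windowing idea described in the quoted message: take the last 5 characters, test them against every code, and only then slide the window one position to the left. It is written in Python rather than SAS, and the function names, the sample codes, and the sample string are all hypothetical, not taken from the thread. The second function collects every match in the string, which is one way to see that the set of matches does not depend on scan direction; only the choice of which match to keep does.

```python
def last_code_match(text, codes, width=5):
    """Return (start, code) for the rightmost window of `width`
    characters in `text` that is a known code, or None if none match."""
    # Start with the final `width` characters and slide left one
    # position at a time; each window is tested against ALL codes
    # (a set lookup) before the window moves.
    for start in range(len(text) - width, -1, -1):
        window = text[start:start + width]
        if window in codes:
            return start, window
    return None


def all_code_matches(text, codes, width=5):
    """Return every (start, code) match in left-to-right order."""
    return [
        (start, text[start:start + width])
        for start in range(len(text) - width + 1)
        if text[start:start + width] in codes
    ]


# Hypothetical lookup table and hypothetical sloppy data-entry string.
codes = {"50000", "12345"}
text = "5000012 MAIN ST 12345"

rightmost = last_code_match(text, codes)
every_match = all_code_matches(text, codes)
print(rightmost)
print(every_match)
```

Using a set for the codes plays the role a hash lookup would play: each window costs one lookup instead of a loop over the whole code list. Strings shorter than the window width simply produce no match.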