Date: Tue, 13 Mar 2007 13:46:52 +0000
Reply-To: toby dunn <tobydunn@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: toby dunn <tobydunn@HOTMAIL.COM>
Subject: Re: Tall & Skinny, Short & Wide, or what ..
Content-Type: text/plain; format=flowed
Well sounds like the data structure you are asking for is short and wide.
Drop the months from the variable names and do something like:
Phone# Month BillAmount
You should end up with a first normal form with redundancies but the
redundacies here are okay since you are not duplicating entire records.
Once you have that data structure I dont think there should be any problems
doing a update
To sensible men, every day is a day of reckoning. ~John W. Gardner
The important thing is this: To be able at any moment to sacrifice that
which we are for what we could become. ~Charles DuBois
Don't get your knickers in a knot. Nothing is solved and it just makes you
walk funny. ~Kathryn Carpenter
From: "Bernard S. Zimmerman" <bernie@HOMER.ATT.COM>
Reply-To: "Bernard S. Zimmerman" <bernie@HOMER.ATT.COM>
Subject: Tall & Skinny, Short & Wide, or what ..
Date: Tue, 13 Mar 2007 08:40:44 -0400
I come to you once again, in need of advice.
Here is my problem (opportunity). We have a monthly file that profiles
our customers. It consists of about 7 - 8 million records with 300+
variables. (So far so good). One of the variables is called
"disconnect indicator" which contains a code for why or if a customer's
service has been disconnected, in the past 12 months. In addition, we
have been given a 'driver' file of roughly 25K phone numbers of
customers, disconnected because they say the service costs too much.
The driver file contains a transaction disconnect date along with 5 or
so other variables we'd like to keep.
What we need to do is match the driver file to the monthly file by 10
digit phone number and keep roughly 100 or so of the 300+ variables
mentioned earlier. So far so good except .. if/when I match either with
a data step MERGE or Proc SQL (where I could use lots of help) it seems
I would end up with a short and wide dataset, which is supposed to be
bad for further processing. In addition, what we need to come up with
is a way to string out variables so that a record contains for example:
phone number, var2, var3, var4, var5, var6 from the driver file along
with -- bill amount - month1, bill amount - month2, ... bill amount
month12, from the big dataset.
Would someone be kind enough to share any ideas you might have,
pseudocode or code snippets to help me solve this assignment?
Of course, many thanks in advance.
Find a local pizza place, movie theater, and more….then map the best route!