| Date: | Fri, 16 Jun 2006 11:00:35 -0400 |
| Reply-To: | Wensui Liu <liuwensui@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Wensui Liu <liuwensui@GMAIL.COM> |
| Subject: | Re: Percentage of subgroups |
| In-Reply-To: | <BAY101-F6275884DECE5585DCE3E8DE830@phx.gbl> |
| Content-Type: | text/plain; charset=ISO-8859-1; format=flowed |
Well,
I think Maarten is the best judge here.
On 6/16/06, toby dunn <tobydunn@hotmail.com> wrote:
> Wensui,
>
> A few things strike me when I read your response.
>
> >>Well, the good side of my way is that you won't include the funky year you
> >>don't want.
>
> I don't see this as a good way to handle that. A WHERE clause is a more
> appropriate solution to a 'Funky Year Problem'. Let's say for the moment I
> was using your SQL solution, and let's say I have 50 years' worth of data and
> only one 'Funky Year Value'. I would have to copy and paste 100 lines of
> code just to avoid that one 'Funky Year' value. Now let's say I need to
> rerun this with 100 years' worth of data; the code has to grow even
> more.
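
A minimal sketch of the WHERE-clause idea, assuming Maarten's TEST data set from the bottom of this thread. The names pct_long, pct_wide and pct_delayed are made up for illustration, and the validity rule (a flight year cannot lie in the future) is an assumption, not something stated in the thread. No year value appears in the code, so more years need no extra lines:

```sas
/* Share of delayed flights per company and year, long format. */
proc sql;
  create table pct_long as
  select company,
         year,
         /* rule from the original post: a missing delay means on time */
         avg(case when delay is missing then 0 else 1 end)
           as pct_delayed format = percent8.2
  from test
  /* assumed validity rule: drops a stray year such as 2049 */
  where year <= year(today())
  group by company, year
  order by company, year;
quit;

/* One row per company, one column per year (delay2002, delay2003, ...). */
proc transpose data=pct_long out=pct_wide(drop=_name_) prefix=delay;
  by company;
  id year;
  var pct_delayed;
run;
```

On the eleven sample rows this should give the same shares as the report Maarten asked for.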
>
> You see, the problem isn't that funky year; it should have been taken care of
> before you started using the data set. The problem is the lack of
> generality in the code's design. A good example is the
> set of programs I am finishing work on right now. The original set
> was around 20 to 25 programs, running some 40 or 50 thousand lines of
> code. I rewrote the code in such a way as to concentrate on the business
> rules instead of specific values. The end result was that the whole set of
> programs went down to only 5 programs. Where it originally took 2 weeks to
> make a run and check the results, I now do it in 2 hours. By concentrating
> on the business rules we found many errors that were present in the code,
> because the code reviewers and I were able to concentrate on the rules that
> should govern how things were supposed to be done rather than on all the
> values that were in the data set. Not only that, but we have now expanded
> the program suite to include full data profiles and to create quality control
> programs and data sets. Also, the general setup has allowed us to turn
> around ad hoc requests in a few days rather than a few weeks.
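
One possible reading of "business rules instead of specific values" for the data in this thread, as a sketch only: state the rule once (a missing delay means the flight was on time) and let the procedure enumerate whatever companies and years are present. It assumes Maarten's TEST data set; the names flagged and delayed are hypothetical.

```sas
/* Encode the rule once: delayed = 1 when a delay was recorded, else 0. */
data flagged;
  set test;
  delayed = (delay ne .);
run;

/* The mean of a 0/1 flag is the share of delayed flights.
   Companies form the rows, years the columns; neither is listed in the code. */
proc tabulate data=flagged;
  class company year;
  var delayed;
  table company, year * delayed=' ' * mean=' ' * f=percent8.2;
run;
```

If the rule changes, only the assignment in the data step needs to change.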
>
> Now this newfound speed and agility has made the bosses happy. They brag
> about their ability to turn around requests, and other people not on my
> project have started writing code in this manner, which means that their
> turnaround time has gone down as well. In general it has had a pretty darn
> good effect on the team as a whole.
>
> I think once you start coding in a more business-rule-oriented fashion you
> will see the power and advantages that it has over a more data-specific
> approach.
>
>
> Toby Dunn
>
>
>
>
>
> From: Wensui Liu <liuwensui@GMAIL.COM>
> Reply-To: Wensui Liu <liuwensui@GMAIL.COM>
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Percentage of subgroups
> Date: Fri, 16 Jun 2006 09:09:37 -0400
>
> Well, the good side of my way is that you won't include the funky year
> you don't want. It is no surprise to have year "2049" in many
> databases.
>
> On 6/16/06, data _null_; <datanull@gmail.com> wrote:
> >When does an elegant solution involve mixing up the data with the code?
> >
> >What are you going to do when you get more years?
> >
> >On 6/15/06, Wensui Liu <liuwensui@gmail.com> wrote:
> > > Maarten,
> > >
> > > Here is a more elegant solution in SQL:
> > >
> > > proc sql;
> > >   create table wensui as
> > >   select
> > >     company,
> > >     sum(case when year = 2002 and delay ~= . then 1 else 0 end) /
> > >       sum(case when year = 2002 then 1 else 0 end)
> > >       as delay2002 format = percent8.2,
> > >     sum(case when year = 2003 and delay ~= . then 1 else 0 end) /
> > >       sum(case when year = 2003 then 1 else 0 end)
> > >       as delay2003 format = percent8.2
> > >   from test
> > >   group by company;
> > > quit;
> > >
> > > On 6/15/06, Havik, Maarten <Maarten.Havik@ordina.nl> wrote:
> > > > Hi,
> > > >
> > > > Can someone please help me with a percentage question?
> > > > I've tried a lot and searched SAS-L, but can't find the right posts.
> > > > I have a dataset like this:
> > > >
> > > > data test;
> > > > input flight company $ year delay;
> > > > datalines;
> > > > 1 A 2002 13
> > > > 2 A 2002 .
> > > > 3 A 2003 5
> > > > 4 A 2003 12
> > > > 5 A 2003 .
> > > > 6 B 2002 17
> > > > 7 B 2002 28
> > > > 8 B 2003 32
> > > > 9 B 2003 .
> > > > 10 B 2003 .
> > > > 11 B 2003 .
> > > > ;
> > > >
> > > > A missing value for delay indicates a flight was on time.
> > > > What I need is a report containing the percentage of delayed flights
> > > > for each subgroup company & year, like this:
> > > >
> > > >           Year
> > > > Company   2002    2003
> > > > A         50%     66,66%
> > > > B         100%    25%
> > > >
> > > > Any help is appreciated!
> > > > Maarten
> > > >
> > > >
> > >
> > >
> > > --
> > > WenSui Liu
> > > (http://spaces.msn.com/statcompute/blog)
> > > Senior Decision Support Analyst
> > > Health Policy and Clinical Effectiveness
> > > Cincinnati Children Hospital Medical Center
> > >
> >
>
--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center