```Date: Wed, 1 Apr 2009 22:08:47 -0400 Reply-To: Ya Huang Sender: "SAS(r) Discussion" From: Ya Huang Subject: Re: Variable definiton It looks to me that your rule is for each racenum, therefore, the code has to be adjusted as: .... where a.racenum=b.racenum and a.Horse ^= b.Horse group by a.racenum, a.Horse ; and the result will be: Fraction Fraction avgSR BetterThan WorseThan ----------------------------------- 18 0.75 0.25 25 1 0 13 0.25 0.5 7 0 1 15 0.25 0.5 29 1 0 8 0 0.833333 25 0.833333 0.166667 13 0.333333 0.5 9 0 0.833333 12 0.333333 0.5 21 0.666667 0.333333 18 0.666667 0.333333 28 1 0 15 0.333333 0.666667 6 0 1 This matches what you said below for the first horse. On Wed, 1 Apr 2009 19:45:44 -0400, Ya Huang wrote: >Not sure if I understand your requirement correctly. Based on my >interpretation, I have the followin code and the result. Please >let us know if it's what you expected. > >proc sql; >select distinct a.*, > count(distinct case when b.avgSR <= a.avgSR - 3 then > b.Horse else '' end)/(a.Field - 1) as FractionBetterThan, > count(distinct case when b.avgSR >= a.avgSR + 3 then > b.Horse else '' end)/(a.Field - 1) as FractionWorseThan >from hr a, hr b >where a.Horse ^= b.Horse >group by a.Horse >; > > Fraction Fraction > date track racenum Horse Field avgSR BetterThan WorseThan > -------------------------------------------------------------------- > 20040514 AP 1 A 5 18 2.25 1.25 > 20040514 AP 1 B 5 25 3 0.5 > 20040514 AP 1 C 5 13 1 1.75 > 20040514 AP 1 D 5 7 0 3 > 20040514 AP 1 E 5 15 1.25 1.75 > 20040514 AP 2 F 7 29 2.333333 0 > 20040514 AP 2 G 7 8 0 2 > 20040514 AP 2 H 7 25 2 0.333333 > 20040514 AP 2 I 7 13 0.666667 1.166667 > 20040514 AP 2 J 7 9 0.166667 2 > 20040514 AP 2 K 7 12 0.666667 1.5 > 20040514 AP 2 L 7 21 1.833333 0.666667 > 20040514 AP 3 M 4 18 3 1.666667 > 20040514 AP 3 N 4 28 4.666667 0 > 20040514 AP 3 O 4 15 1.666667 2.333333 > 20040514 AP 3 P 4 6 0 4.333333 > >On Wed, 1 Apr 2009 19:01:16 -0400, Michael Bryce Herrington > wrote: > >>Hey, >> >>I have a difficult variable definition problem I hope you all could help me >>with. >> >>I have data that looks like this: >> >>data hr; >>input date track \$ racenum Horsenameabrev \$ FieldSize avgSR; >>datalines; >>20040514 AP 1 A 5 18 >>20040514 AP 1 B 5 25 >>20040514 AP 1 C 5 13 >>20040514 AP 1 D 5 7 >>20040514 AP 1 E 5 15 >>20040514 AP 2 F 7 29 >>20040514 AP 2 G 7 8 >>20040514 AP 2 H 7 25 >>20040514 AP 2 I 7 13 >>20040514 AP 2 J 7 9 >>20040514 AP 2 K 7 12 >>20040514 AP 2 L 7 21 >>20040514 AP 3 M 4 18 >>20040514 AP 3 N 4 28 >>20040514 AP 3 O 4 15 >>20040514 AP 3 P 4 6 >>; >> >>I would like to make two new variables, FractionBetterThan and >>FractionWorseThan based on the avgSR variable. I would like these to be >>defined like so: >> >>FractionBetterThan = (Number of horses with avgSR <= (current horse's >>avgSR-3)) / (FieldSize-1) >> FractionWorseThan = (Number of horses with avgSR >= (current horse's >>avgSR+3)) / (FieldSize-1) >> >> >>i.e. For the first horse we would say there are 3 horses with avgSR <= its >>avgSR-3 so FractionBetterThan=3/4=.75 >>Similarly there is 1 horse with avgSR >= its avgSR+3 so >>FractionWorseThan=1/4=.25 >> >>We cannot use FractionWorseThan=1-FractionBetterThan because there may be >>some horses within the (avgSR-3, avgSR+3) interval. >> >>Thanks for any help you can give me. >> >> >>-- >>Bryce Herrington >>Clemson University ```

Back to: Top of message | Previous page | Main SAS-L page