|
Hi, Tony,
I'm not sure that I understood what you want to do. For example,
according to your code, when you calculate the average for this obs:
301020 2006 1 2 2 100
you only calculate the obs as follows:
301020 2006 1 1 1 91
301020 2006 1 1 3 100
301020 2006 1 2 2 100
don't include this obs:
301020 2006 1 1 7 100
My understanding is:
data zeta1;
set zeta;
key = period * 10000 + week*100 +day;
group_key=COST||'-'||put(YEAR,best.)||'-'||put(PERIOD,best.)||'-'||put(WEEK,best.)||'-'||put(DAY,best.);
run;
PROC SQL _METHOD BUFFERSIZE=1000000;
CREATE TABLE AGGREGATE AS
SELECT A.COST
,A.YEAR
,A.PERIOD
,A.WEEK
,A.DAY
,AVG(B.SCORE) AS AVG_SCORE
FROM ZETA A,
ZETA B
WHERE A.YEAR = &CURRENT_YEAR
AND B.YEAR = A.YEAR
AND b.key<=a.key
GROUP BY group_key;
QUIT;
regards,
John Xie
tony.gallegly@gmail.com wrote:
> SAS-L,
>
> I am looking for a better way from a performance and memory usage
> standpoint to accomplish the following.
>
> I need to calculate a year to date average for each observation. The
> challenge is I have a minimum of 100,000 observations.
>
> Existing Code
> PROC SQL _METHOD BUFFERSIZE=1000000;
> CREATE TABLE AGGREGATE AS
> SELECT A.COST
> ,A.YEAR
> ,A.PERIOD
> ,A.WEEK
> ,A.DAY
> ,AVG(B.SCORE) AS AVG_SCORE
> FROM ZETA A,
> ZETA B
> WHERE A.YEAR = &CURRENT_YEAR
> AND B.YEAR = A.YEAR
> AND B.PERIOD <= A.PERIOD
> AND B.WEEK <= A.WEEK
> AND B.DAY <= A.DAY
> GROUP BY A.COST, A.YEAR, A.PERIOD, A.WEEK, A.DAY;
> QUIT;
>
> Sample Data: The avg_score field is the result of the above sql
> statement.
> COST YEAR PERIOD WEEK DAY SCORE AVG_SCORE
> 301020 2006 1 1 1 91 91.00
> 301020 2006 1 1 3 100 95.50
> 301020 2006 1 1 7 100 97.00
> 301020 2006 1 2 2 100 97.75
> 301020 2006 1 2 4 100 98.20
> 301020 2006 1 2 4 100 98.50
> 301020 2006 1 2 6 100 98.71
> 301020 2006 1 3 3 100 98.88
> 301020 2006 1 3 5 92 98.11
> 301020 2006 1 4 2 100 98.30
> 301020 2006 1 4 3 100 98.45
> 301020 2006 2 1 1 100 98.58
> 301020 2006 2 1 3 100 98.69
> 301020 2006 2 1 5 100 98.79
> 301020 2006 2 2 2 100 98.87
> 301020 2006 2 2 2 100 98.94
> 301020 2006 2 2 4 100 99.00
> 301020 2006 2 3 1 100 99.06
> 301020 2006 2 3 1 90 98.58
> 301020 2006 2 3 5 100 98.65
> 301020 2006 2 4 2 100 98.71
> 301020 2006 2 4 4 100 98.77
> 301020 2006 2 4 6 92 98.48
> 301020 2006 3 1 2 82 97.79
> 301020 2006 3 1 2 100 97.88
>
> Thanks in advance,
> Tony
|