Date: Thu, 13 Nov 2003 15:10:16 -0500 Howard Schreier "SAS(r) Discussion" Howard Schreier Re: Agggregating Paired Variables - Help Needed text/plain; charset=US-ASCII

This is one of those problems which is easier to solve if broken down into smaller steps.

1. Compute the aggregate counts for each party for each day.

create table agg as select *, count(*) as agg from (select buyer as party, mon, day from sample outer union corresponding select seller as party, mon, day from sample) group by party, mon, day;

2. Since the direction of the trades (who buys and who sells) does not matter, standardize the dataset by identifying the party with the lower ID as A and the other as B.

create table minAmaxB as select monotonic() as origseq, min(buyer,seller) as partyA, max(buyer,seller) as partyB, * from sample;

The undocumented MONOTONIC function allows capture of the original sequence. It is only needed if it's necessary to preserve that sequence in the end result.

3. Combine the two intermediate results.

create table all as select minAmaxB.*, aggA.agg as aggA, aggB.agg as aggB from minAmaxB, agg as aggA, agg as aggB where minAmaxB.partyA=aggA.party and minAmaxB.partyB=aggB.party and minAmaxB.mon =aggA.mon and minAmaxB.mon =aggB.mon and minAmaxB.day =aggA.day and minAmaxB.day =aggB.day;

4. Derive the two new columns.

create table result as select buyer, seller, mon, day, count(*) as PairedTrade, calculated PairedTrade / (aggA + aggB) format=5.2 as Prop_trade from all group by partyA, partyB, mon, day order by origseq;

This solution double counts the paired trades in the denominator, thus conforming to the example. To avoid that, subtract the PairedTrade count in the denominator subexpression.

On Mon, 10 Nov 2003 14:23:18 -0800, Dorian Noel <d.noel@ISMACENTRE.RDG.AC.UK> wrote: