Date: Tue, 21 Jun 2005 18:29:41 -0400 "Howard Schreier " "SAS(r) Discussion" "Howard Schreier " Re: Hwo to calculate the sum of cross-product of all observations in each group?

I think it can be simpler.

The need for the KEEP= can be eliminated by specifying only the wanted columns in the SELECT list.

Then, since DATE is the grouping level and CROSS is a summary stat, SQL will automatically produce just one row per group. So the HAVING is not needed.

I assume just one row per stock per day. In that case there is no need for the intervening DATA step which generates sequence numbers.

That leaves

create table cross_product as select a.date, sum(a.return*b.return) as cross from xx a, xx b where a.date=b.date and a.stock < b.stock group by a.date ;

As to performance, Toby has a good point. I generated a 1Kx1K test table:

data xx; do stock = 1 to 1e3; do date = 1 to 1e3; return = ranuni(1); output; end; end; run;

The SQL solution took 11+ minutes on my system.

As an alternative, I sorted by DATE (that took 3 seconds), then ran this DATA step:

data cross; array a_return(9999) _temporary_; cross = 0; do stocknum = 1 by 1 until (last.date); set xx; by date; a_return(stocknum) = return; if stocknum>1 then do j = 1 to (stocknum-1); cross + return * a_return(j); end; end; keep date cross; run;

It took a bit over 1 minute and produced the same numbers as SQL. So the reduction was about 90%.

On Tue, 21 Jun 2005 17:39:12 +0000, toby dunn <tobydunn@HOTMAIL.COM> wrote:

>Ya, > >Three things, > >1.) The cartesian product created by SQL is going to be a bear with a lot of >data. But it was the same Idea I had, don't know if there is a less memory >intensive way to do this. > >2.) Need to only keep Date and Cross vars. > >3.) Lastly the solution only needs on obs per Date. > >proc sql; >create table cross_product (keep = Date Cross) as > select a.*,b.return, sum(a.return*b.return) as cross > from xx a, xx b > where a.date=b.date and a.n < b.n > group a.date > having monotonic() = min(monotonic()) > order by a.date,a.n >; >quit ; > > >Should do it. > > > > >Toby Dunn > > > > >From: Ya Huang <ya.huang@AMYLIN.COM> >Reply-To: Ya Huang <ya.huang@AMYLIN.COM> >To: SAS-L@LISTSERV.UGA.EDU >Subject: Re: Hwo to calculate the sum of cross-product of all observations > in each group? >Date: Tue, 21 Jun 2005 13:31:51 -0400 >Received: from malibu.cc.uga.edu ([128.192.1.103]) by mc5-f31.hotmail.com >with Microsoft SMTPSVC(6.0.3790.211); Tue, 21 Jun 2005 10:31:59 -0700 >Received: from listserv.cc.uga.edu (128.192.1.75) by malibu.cc.uga.edu >(LSMTP for Windows NT v1.1b) with SMTP id <0.00FD289E@malibu.cc.uga.edu>; >Tue, 21 Jun 2005 13:31:54 -0400 >Received: from LISTSERV.UGA.EDU by LISTSERV.UGA.EDU (LISTSERV-TCP/IP release > 1.8d) with spool id 3725305 for SAS-L@LISTSERV.UGA.EDU; Tue, 21 >Jun 2005 13:31:54 -0400 >Received: from malibu.cc.uga.edu (malibu.cc.uga.edu [128.192.1.103]) by > listserv.cc.uga.edu (8.12.11/8.12.11) with ESMTP id j5LHVp1l020471 > for <SAS-L@LISTSERV.UGA.EDU>; Tue, 21 Jun 2005 13:31:53 -0400 >Received: from listserv.cc.uga.edu (128.192.1.75) by malibu.cc.uga.edu >(LSMTP for Windows NT v1.1b) with SMTP id ><0.00FD2895@malibu.cc.uga.edu>; Tue, 21 Jun 2005 13:31:50 -0400 >X-Message-Info: g+YnDTwJTP7NTx9/VqYPG3Y+uq+rEUZyREbMII8peYY= >Comments: To: shenyiyu@GMAIL.COM >Return-Path: owner-sas-l@LISTSERV.UGA.EDU >X-OriginalArrivalTime: 21 Jun 2005 17:31:59.0717 (UTC) >FILETIME=[210E3950:01C57687] > >This one seems what you want: > >data xx; >input Date Stock Return; >cards; >199001 10001 1.5 >199001 10002 2.4 >199001 10008 0.8 >199001 10006 1.4 >199001 10005 0.1 >199002 10005 1.5 >199002 10001 0.6 >199002 10014 1.1 >199002 10006 1.2 >199002 10034 3.4 >199002 10004 1.9 >; > >data xx; > set xx; >by date; >if first.date then n=0; >n+1; >run; > >options ps=999; >proc sql; >select a.*,b.return, sum(a.return*b.return) as cross >from xx a, xx b >where a.date=b.date and a.n < b.n >group a.date >order by a.date,a.n >; > > Date Stock Return n Return cross >---------------------------------------------------------- > 199001 10001 1.5 1 0.8 13.91 > 199001 10001 1.5 1 2.4 13.91 > 199001 10001 1.5 1 1.4 13.91 > 199001 10001 1.5 1 0.1 13.91 > 199001 10002 2.4 2 0.8 13.91 > 199001 10002 2.4 2 0.1 13.91 > 199001 10002 2.4 2 1.4 13.91 > 199001 10008 0.8 3 0.1 13.91 > 199001 10008 0.8 3 1.4 13.91 > 199001 10006 1.4 4 0.1 13.91 > 199002 10005 1.5 1 1.1 36.83 > 199002 10005 1.5 1 1.2 36.83 > 199002 10005 1.5 1 1.9 36.83 > 199002 10005 1.5 1 3.4 36.83 > 199002 10005 1.5 1 0.6 36.83 > 199002 10001 0.6 2 1.1 36.83 > 199002 10001 0.6 2 1.2 36.83 > 199002 10001 0.6 2 1.9 36.83 > 199002 10001 0.6 2 3.4 36.83 > 199002 10014 1.1 3 1.2 36.83 > 199002 10014 1.1 3 3.4 36.83 > 199002 10014 1.1 3 1.9 36.83 > 199002 10006 1.2 4 3.4 36.83 > 199002 10006 1.2 4 1.9 36.83 > 199002 10034 3.4 5 1.9 36.83 > >Kind regards, > >Ya Huang > >On Tue, 21 Jun 2005 10:02:24 -0700, Yiyu <shenyiyu@GMAIL.COM> wrote: > > >The data is as following: for each date there are some stocks, and > >their returns, My questions is : for each date, I cross-product all > >stock returns with each other and sum them together. How to do this > >without resorting to macros? > > > >For example > > > >Date Stock Return > >199001 10001 1.5 > >199001 10002 2.4 > >199001 10008 0.8 > >199001 10006 1.4 > >199001 10005 0.1 > > > > > >199002 10005 1.5 > >199002 10001 0.6 > >199002 10014 1.1 > >199002 10006 1.2 > >199002 10034 3.4 > >199002 10004 1.9 > > > >199003 ...... .... > > > > > >and I want to get something like: > > > >Date Sum_Cross_Prod > >199001 Sum1 > >199002 Sum2 > > > >where Sum1 = > >1.5*2.4+1.5*2.4+1.5*0.8+1.5*1.4+1.5*0.1+2.4*0.8+2.4*1.4+2.4*0.1+0.8*1.4+0.8 >*0.1+1.4*0.1 > > > >so on. > > > >Thank you very much. > > > >Yiyu

Back to: Top of message | Previous page | Main SAS-L page