LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (June 2005, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 21 Jun 2005 18:29:41 -0400
Reply-To:   "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:   Re: Hwo to calculate the sum of cross-product of all observations in each group?

I think it can be simpler.

The need for the KEEP= can be eliminated by specifying only the wanted columns in the SELECT list.

Then, since DATE is the grouping level and CROSS is a summary stat, SQL will automatically produce just one row per group. So the HAVING is not needed.

I assume just one row per stock per day. In that case there is no need for the intervening DATA step which generates sequence numbers.

That leaves

create table cross_product as select a.date, sum(a.return*b.return) as cross from xx a, xx b where a.date=b.date and a.stock < b.stock group by a.date ;

As to performance, Toby has a good point. I generated a 1Kx1K test table:

data xx; do stock = 1 to 1e3; do date = 1 to 1e3; return = ranuni(1); output; end; end; run;

The SQL solution took 11+ minutes on my system.

As an alternative, I sorted by DATE (that took 3 seconds), then ran this DATA step:

data cross; array a_return(9999) _temporary_; cross = 0; do stocknum = 1 by 1 until (last.date); set xx; by date; a_return(stocknum) = return; if stocknum>1 then do j = 1 to (stocknum-1); cross + return * a_return(j); end; end; keep date cross; run;

It took a bit over 1 minute and produced the same numbers as SQL. So the reduction was about 90%.

On Tue, 21 Jun 2005 17:39:12 +0000, toby dunn <tobydunn@HOTMAIL.COM> wrote:

>Ya, > >Three things, > >1.) The cartesian product created by SQL is going to be a bear with a lot of >data. But it was the same Idea I had, don't know if there is a less memory >intensive way to do this. > >2.) Need to only keep Date and Cross vars. > >3.) Lastly the solution only needs on obs per Date. > >proc sql; >create table cross_product (keep = Date Cross) as > select a.*,b.return, sum(a.return*b.return) as cross > from xx a, xx b > where a.date=b.date and a.n < b.n > group a.date > having monotonic() = min(monotonic()) > order by a.date,a.n >; >quit ; > > >Should do it. > > > > >Toby Dunn > > > > >From: Ya Huang <ya.huang@AMYLIN.COM> >Reply-To: Ya Huang <ya.huang@AMYLIN.COM> >To: SAS-L@LISTSERV.UGA.EDU >Subject: Re: Hwo to calculate the sum of cross-product of all observations > in each group? >Date: Tue, 21 Jun 2005 13:31:51 -0400 >Received: from malibu.cc.uga.edu ([128.192.1.103]) by mc5-f31.hotmail.com >with Microsoft SMTPSVC(6.0.3790.211); Tue, 21 Jun 2005 10:31:59 -0700 >Received: from listserv.cc.uga.edu (128.192.1.75) by malibu.cc.uga.edu >(LSMTP for Windows NT v1.1b) with SMTP id <0.00FD289E@malibu.cc.uga.edu>; >Tue, 21 Jun 2005 13:31:54 -0400 >Received: from LISTSERV.UGA.EDU by LISTSERV.UGA.EDU (LISTSERV-TCP/IP release > 1.8d) with spool id 3725305 for SAS-L@LISTSERV.UGA.EDU; Tue, 21 >Jun 2005 13:31:54 -0400 >Received: from malibu.cc.uga.edu (malibu.cc.uga.edu [128.192.1.103]) by > listserv.cc.uga.edu (8.12.11/8.12.11) with ESMTP id j5LHVp1l020471 > for <SAS-L@LISTSERV.UGA.EDU>; Tue, 21 Jun 2005 13:31:53 -0400 >Received: from listserv.cc.uga.edu (128.192.1.75) by malibu.cc.uga.edu >(LSMTP for Windows NT v1.1b) with SMTP id ><0.00FD2895@malibu.cc.uga.edu>; Tue, 21 Jun 2005 13:31:50 -0400 >X-Message-Info: g+YnDTwJTP7NTx9/VqYPG3Y+uq+rEUZyREbMII8peYY= >Comments: To: shenyiyu@GMAIL.COM >Return-Path: owner-sas-l@LISTSERV.UGA.EDU >X-OriginalArrivalTime: 21 Jun 2005 17:31:59.0717 (UTC) >FILETIME=[210E3950:01C57687] > >This one seems what you want: > >data xx; >input Date Stock Return; >cards; >199001 10001 1.5 >199001 10002 2.4 >199001 10008 0.8 >199001 10006 1.4 >199001 10005 0.1 >199002 10005 1.5 >199002 10001 0.6 >199002 10014 1.1 >199002 10006 1.2 >199002 10034 3.4 >199002 10004 1.9 >; > >data xx; > set xx; >by date; >if first.date then n=0; >n+1; >run; > >options ps=999; >proc sql; >select a.*,b.return, sum(a.return*b.return) as cross >from xx a, xx b >where a.date=b.date and a.n < b.n >group a.date >order by a.date,a.n >; > > Date Stock Return n Return cross >---------------------------------------------------------- > 199001 10001 1.5 1 0.8 13.91 > 199001 10001 1.5 1 2.4 13.91 > 199001 10001 1.5 1 1.4 13.91 > 199001 10001 1.5 1 0.1 13.91 > 199001 10002 2.4 2 0.8 13.91 > 199001 10002 2.4 2 0.1 13.91 > 199001 10002 2.4 2 1.4 13.91 > 199001 10008 0.8 3 0.1 13.91 > 199001 10008 0.8 3 1.4 13.91 > 199001 10006 1.4 4 0.1 13.91 > 199002 10005 1.5 1 1.1 36.83 > 199002 10005 1.5 1 1.2 36.83 > 199002 10005 1.5 1 1.9 36.83 > 199002 10005 1.5 1 3.4 36.83 > 199002 10005 1.5 1 0.6 36.83 > 199002 10001 0.6 2 1.1 36.83 > 199002 10001 0.6 2 1.2 36.83 > 199002 10001 0.6 2 1.9 36.83 > 199002 10001 0.6 2 3.4 36.83 > 199002 10014 1.1 3 1.2 36.83 > 199002 10014 1.1 3 3.4 36.83 > 199002 10014 1.1 3 1.9 36.83 > 199002 10006 1.2 4 3.4 36.83 > 199002 10006 1.2 4 1.9 36.83 > 199002 10034 3.4 5 1.9 36.83 > >Kind regards, > >Ya Huang > >On Tue, 21 Jun 2005 10:02:24 -0700, Yiyu <shenyiyu@GMAIL.COM> wrote: > > >The data is as following: for each date there are some stocks, and > >their returns, My questions is : for each date, I cross-product all > >stock returns with each other and sum them together. How to do this > >without resorting to macros? > > > >For example > > > >Date Stock Return > >199001 10001 1.5 > >199001 10002 2.4 > >199001 10008 0.8 > >199001 10006 1.4 > >199001 10005 0.1 > > > > > >199002 10005 1.5 > >199002 10001 0.6 > >199002 10014 1.1 > >199002 10006 1.2 > >199002 10034 3.4 > >199002 10004 1.9 > > > >199003 ...... .... > > > > > >and I want to get something like: > > > >Date Sum_Cross_Prod > >199001 Sum1 > >199002 Sum2 > > > >where Sum1 = > >1.5*2.4+1.5*2.4+1.5*0.8+1.5*1.4+1.5*0.1+2.4*0.8+2.4*1.4+2.4*0.1+0.8*1.4+0.8 >*0.1+1.4*0.1 > > > >so on. > > > >Thank you very much. > > > >Yiyu


Back to: Top of message | Previous page | Main SAS-L page