Date: Mon, 9 Oct 2006 16:03:21 -0400
Reply-To: Raj G <rajasekhargo@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Raj G <rajasekhargo@YAHOO.COM>
Subject: Dataset design question
In response to my previous posting
(http://listserv.uga.edu/cgi-bin/wa?A2=ind0610A&L=sas-l&P=R43676), I
received several replies asking me to reconsider my decision to transform
my original dataset. So I am rephrasing my question this time to ask for
input on the best way to design the dataset(s) for what I intend to do.
Here is the scenario.
The input dataset has a list of securities with their IDs (ticker symbol)
and several fields for each date. Information for each security is
appended every day to this table, so the table keeps growing vertically.
Security  date      price  earnings  returns
AAA       20061009  3      5         0.4
BBB       20061009  8      .         0.2
CCC       20061009  2      6         0.7
AAA       20061008  4      8         .
BBB       20061008  9      0         0.1
CCC       20061008  0      2         0.3
AAA       20061005  2      .         0.4
BBB       20061005  8      6         0.7
CCC       20061005  .      4         0.5
Typical tasks that I need to perform are time-series and cross-sectional
analyses. For example, regress the returns of each of some 3000 stocks
against a benchmark's returns over the last 200 days.
One solution might be to create a dataset with one column per stock ID,
holding that stock's returns for those 200 days, and then run proc reg
for each stock against the benchmark (3000 times). Would this be a clumsy
design? Also, if I run proc reg 3000 times, does SAS read the table 3000
times (meaning 3000 I/O operations)? Creating a dataset like this from my
input dataset also takes some processing time.
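To make the idea concrete, here is a rough, untested sketch of the wide
design described above. The dataset names (work.daily, work.bench) and the
benchmark variable bench_ret are placeholders made up for illustration. I
am assuming the benchmark returns sit in a separate table with one row per
date, and the 200-day filter is left out.

```sas
/* Assumed names: work.daily is the long input table shown above;        */
/* work.bench has one row per date with the benchmark return, bench_ret. */
proc sort data=work.daily out=work.daily_sorted;
  by date security;
run;

/* One row per date, one column per security (AAA, BBB, CCC, ...) */
proc transpose data=work.daily_sorted out=work.wide(drop=_name_);
  by date;
  id security;
  var returns;
run;

proc sort data=work.bench out=work.bench_sorted;
  by date;
run;

/* Attach the benchmark return to each date; keep dates present in both */
data work.wide_bench;
  merge work.wide(in=a) work.bench_sorted(in=b);
  by date;
  if a and b;
run;

/* One PROC REG call per stock; this step would be repeated for each ticker */
proc reg data=work.wide_bench outest=work.est_aaa noprint;
  model AAA = bench_ret;
run;
quit;
```

The sort, transpose, and merge steps are the extra processing mentioned
above, and the final step is the one that would run some 3000 times.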
How can I make this operation faster (by design or code changes)?
I appreciate your responses.
Raj