| Date: | Tue, 16 Sep 2008 06:38:17 -0700 |
| Reply-To: | Akshaya <akshaya.nathilvar@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Akshaya <akshaya.nathilvar@GMAIL.COM> |
| Organization: | http://groups.google.com |
| Subject: | Which is more efficient: Inner join or a non-correlated subquery or Hash Object or Data step Merge? |
| Content-Type: | text/plain; charset=windows-1252 |
Hello all,
I have a SAS dataset (Master) with billions of records and a second
cohort dataset (Subset) with thousands of records. I need to select
the records from the Master dataset whose IDs appear in the Subset
dataset. The Master dataset has multiple columns; the Subset dataset
has only one column (ID). What would be the most “efficient” way
(probably meaning the least CPU time) to do this? Both datasets are
sorted by the ID variable.
Any help is much appreciated.
Thanks!
Akshaya