Date: Mon, 17 May 2004 13:44:02 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: sql subset
Content-Type: text/plain
Jian:
You are actually mixing a couple of DB access methods. Hopefully you will be
able to 'pass thru' a query containing a IN list, as suggested by Paul. The
CONNECT TO ORACLE processes the connection string, not the LIBNAME ...
ORACLE ... command. Depending on the version of SAS that you are using, you
may run into a limit on the number of items that you can include in an IN
list.
Basically the same question has appeared on SAS-L a number of times. It
would work best in general to create and insert your data into an Oracle
table before passing thru a JOIN query. Check with your Orable DBA for the
best way to do this. Here's a SAS-L archive reference to Harry D's post that
describes how to pass thru a table create to an Oracle DB.
Sig
-----Original Message-----
From: Chen, Jian [mailto:ozz6@CDC.GOV]
Sent: Monday, May 17, 2004 12:13 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: sql subset
Good afternoon, all!
Could anyone help me with the following:
I have a small sas dataset ready in my unix table(about 500
observation) called small0402. I would like to merge with the large dataset
which need to be extracted from Oracle datawarehouse(total about 7,000,000
observations). One way is as the following:
libname test1 '/diar1/dir2'; ***UNIX server****;
libname dw_lib Oracle user = xxx password=xxx path="xxx.xxx";
************ Create sas dataset by getting data from Oracle server
**************;
proc sql;
connect to oracle (user=xxx orapw=xxx path="xxx.xxx")
create table test1.huge0402 as
select * from connection to oracle
(
select DW.id
DW.day
DW.var1
DW.var2
DW.var3
from DW.table1
where DW.day between 200401 and 200429
);
quit;
proc sort data= test1.huge0402;
by id;
run;
***** Sort the small sas dataset stored in unix server ********;
proc sort data=test1.small0402;
by id;
run;
**** merge the small sas dataset with the large dataset got from
oracle server ***;
data test1.out0402;
merge test1.small0402(in=a)
test1.huge0402(in=b);
by id;
if a;
run;
My problem is that it is so time consuming to get the data from oracle
data warehouse and then merge to the small dataset. I think there should be
a way that I could directly get the subset from oracle data warehouse. But I
guess since my small sas dataset is saved in Unix server. It may need a step
to upload that to the oracle server. I may or may not have the write access.
So the following won't work.
proc sql;
connect to oracle (user=xxx orapw=xxx path="xxx.xxx")
create table test1.huge0402 as
select * from connection to oracle
(
select DW.id
DW.day
DW.var1
DW.var2
DW.var3
from DW.table1
where DW.day between 200401 and 200429
and id in (select id from test1.small0402)
);
quit;
proc contents data=test1.out0402;
run;
If anyone could help me with the above, I would really appreciate it.
Thanks a lot!