| Date: | Thu, 8 Apr 1999 15:24:25 -0400 |
| Reply-To: | RAITHEM <RAITHEM@WESTAT.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU> |
| From: | RAITHEM <RAITHEM@WESTAT.COM> |
| Subject: | (MVS) Re:Problem: Proc SQL with ODBC connection to AS/400 |
|
| Content-Type: | text/plain; charset=US-ASCII |
Koen Vyverman posted an interesting question.
<<Koen's long question can be found below the Sig. line below>>
<<Koen's basic problem was that he had a 50 - 100 million row table that existed
on a remote AS/400 machine, and a key table on his desktop that contained 14,000
key values. He wanted to extract all rows for the BIG table that had matching
key values in the key file. He tried two methods that are listed in his
posting, below, and asked for some further suggestions.>>
Koen, I would be inclined to pick a solution where the power and might of the
AS/400 was used to do all of the heavy work, and only the result set was passed
back to my desktop. Since you have SAS/CONNECT, this is relatively
straight-forward. I would do something along the following lines:
1. Upload the "skinny" Key table to the AS/400 via PROC UPLOAD. This might look
like this:
/****************************************************/
/* set a lib identifier for the PC SAS data library */
/****************************************************/
libname PCLIB "KOENS/PC/DIRECTORY";
/**********************************************************/
/* RSUBMIT this block of code to run on the AS/400 server */
/**********************************************************/
RSUBMIT;
/********************************************************/
/* set a lib identifier for the AS/400 SAS data library */
/********************************************************/
libname AS400LIB "KOENS/AS400/DIRECTORY";
/***********************************************************/
/* Move SAS data set from desktop to AS/400 for processing */
/***********************************************************/
proc upload data=PCLIB.KEYFILE out=AS400LIB.KEYFILE;
run;
ENDSUBMIT;
2. Then, I would run the re-written SQL query from Example 1, below. It would
be rewritten to specify that the Keyfile was on the AS/400.
3. When the SQL query completed processing, I would download the resulting data
set via a PROC DOWNLOAD. To do this, simply rewrite the SAS code in Example 1,
but use PROC DOWNLOAD, and modify the DATA= and OUT= keywords as necessary.
Koen, and other sharpies, yes I _DO_ know that all three separate steps are not
really necessary. It would make more sense to include the SQL in Step #2, and
the PROC DOWNLOAD in Step #3 in the submit block. But, listing them out as
three separate steps gets the point across with little or no confusion. And,
I'm sure that you will be able to take this suggestion and run with it... unless
something better comes along!
Koen, best of luck in solving your cross-platform headaches!
I hope that this suggestion proves helpful now, and in the future!
Of course, all of these opinions and insights are my own, and do not reflect
those of my organization or my associates.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance."
E-mail: raithem@westat.com
Author: Tuning SAS Applications in the MVS Environment
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
You can observe a lot just by watching -- Yogi Berra
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Koen's original posting:
Here's a question for the SQL / ODBC / AS/400 gurus among us...
On a remote AS/400 machine lives a fairly large (50-100 million
records) table that has a non-unique key-variable holding positive
integer values in the range 1 to 500000.
I am using PC SAS (6.12, TS050) under Win95 to download those
records that have a number of specific key-values. I'm interested
in about 14000 key-values out of the possible 500000, expecting
at most a million records to be returned. My 14000 wanted key-
values sit in the KEY variable of a local PC SAS data set
WORK.KEY and are not available as such in an AS/400 table.
As a first attempt, I constructed my query in the following vein:
-----------------------------------------------------
VERSION 1
-----------------------------------------------------
proc sql;
connect to odbc("DSN=YARC40;"||
"SYSTEM=YARC40;"||
"UID=WOOZLE;"||
"PWD=YABBADABBA");
create table daily as
select
keyvar label='Key Varble' ,
stuff1 label='Some Stuff' ,
stuff2 label='More Stuff' ,
stuff3 label='Other Stuff'
from connection to odbc(
select
keyvar,
stuff1,
stuff2,
stuff3
from
forxzcqy.keugganz
)
where keyvar in (
select key
from work.keys
);
disconnect from odbc;
quit;
-----------------------------------------------------
My understanding is (and please correct me if I'm wrong) that
this way of working will pull every single record from the
AS/400 table forxzcqy.keugganz through the ODBC-connection
before checking the value of KEYVAR, which will then in most
cases result in the record being trashed anyway.
A waste, really, especially with a relatively slow connection.
Hence my second effort. I proceeded to generate dynamic code
that explicitly states the where-clause inside the "connection
to odbc( )" parentheses:
-----------------------------------------------------
VERSION 2
-----------------------------------------------------
proc sql;
connect to odbc("DSN=YARC40;"||
"SYSTEM=YARC40;"||
"UID=WOOZLE;"||
"PWD=YABBADABBA");
create table daily as
select
keyvar label='Key Varble' ,
stuff1 label='Some Stuff' ,
stuff2 label='More Stuff' ,
stuff3 label='Other Stuff'
from connection to odbc(
select
keyvar,
stuff1,
stuff2,
stuff3
from
forxzcqy.keugganz
where keyvar in (
7072,
8060,
11703,
11711,
...
486267,
496275,
496283
)
);
disconnect from odbc;
quit;
-----------------------------------------------------
Running this code with a limited subset of a few hundred key-
values works like a breeze. Unfortunately, when I try to
include all 14000 of them, I seem to be running into some
sort of size limit (probably the 32K thing in one of its
incarnations) and an ODBC error message in the log informs
me that the resources limit has been exceeded.
I find that I can include about 600 key-values at a time
with this approach, but I really don't feel like cutting
up my WORK.KEYS data set into 24 mouthfuls and generate/execute
as many queries. Besides, version 1 would probably perform
better than that...
So, suggestions and comments as to how to improve the per-
formance of querying AS/400 tables will be greatly appreciated.
Also, are there other ways than ODBC/SQL to do this kind of
thing? Our license includes 'CONNECT', 'PC File Formats', 'ODBC'
and the usual ones...
Cheers,
Koen.
---------------------------------
Koen Vyverman
Mathematician & SAS-consultant
EDS - Luxembourg
Phone : (++352) 250 404 22 63
Fax : (++352) 250 404 23 15
---------------------------------
|