Date: Mon, 12 May 1997 14:27:04 -0400
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: "Qiang Teng (Tom)" <tteng@ON.BELL.CA>
Organization: Bell Canada
Subject: Re: DB2 and Sampling
Content-Type: text/plain; charset=us-ascii
Anthony Ayiomamitis wrote:
> Peter R Welbrock wrote:
> > I am trying to find an effective way to pull a sample of observations
> > (rows) from multiple DB2 tables. These tables vary in sizes from 50,000
> > to 2 million rows. I could just find out the number of rows in the
> > tables (the site I am contracting for does not keep DB2 statistics
> > because it is too expensive to do so .... a bit like those folks who buy
> > LandRovers to take their kids to school in!), pick a sample in the normal
> > way and then go through a full table scan to select those rows, but this
> > would take a great deal of time and resources.
> > I do not know any method of directly accessing DB2 tables (like with the
> > point= ability within SAS), but maybe someone out there can think of a
> > clever way around this. We do have ACCESS to DB2.
> > Thanks for your help.
> > Pete Welbrock
> > (who doesn't care if his opinions do not coincide with those of his
> > employers; that is their problem, not his!)
> I am not aware of a random number generator available in DB2.
> Therefore, your only alternative may be to do the random sampling
> through SAS. For the same of argument, assume you want a 5% sample -
> what you could do is the following:
> data sample;
> set db2extr;
> if uniform (0) le 0.05;
> You can get cute and use a macro variable for the size of the
> sample etc etc but the concept above should suffice.
If your DB2 tables are indexed then you could have direct access as
(1) pass the indexed field in your sample using CALL SYMPUT;
(2) then in your SQL you could reference this macro variable.
please reference "SAS Connections to DB2: Tools & Techniques" by
Judy Loren in SUGI 21 Proceedings, p498-507.