Date: Thu, 24 Dec 1998 13:25:09 -0500
Reply-To: Jules Bosch <jxb@BELLATLANTIC.NET>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Jules Bosch <jxb@BELLATLANTIC.NET>
Organization: Bosch Systems Inc.
Subject: Re: selecting from large dataset using small one
Content-Type: text/plain; charset=us-ascii
Ross Bettinger wrote:
>
> I have a dataset containing many (millions) of transaction records with
> a many-to-one relationship between transactions and the account to which
> they belong. I have a much smaller dataset containing accounts which I
> want to use to control the selection of the transaction records.
>
> I am running SAS6.12 on a UNIX Sun platform with limited disk space.
>
> What, in your opinion, is the best way to proceed with the selection?
>
> 1) sort both files and do DATA step merge using IN variables
> 2) use PROC SQL such as
> proc sql ;
> create table trans as
> select * from transactn
> where account in ( select account from acctdata ) ;
> 3) create an embeddable text file with the code
> data trans ;
> set trans ;
>
> if account in (<list of account numbers>) ;
> run ;
>
> 4) your suggestion here
>
> All suggestions are welcome. Please remember that disk space is a
> limited resource.
>
> Thanks,
> Ross Bettinger
Ross,
If your objective is to simply control the selection of the transaction
records the following may be feasible.
/* Create a macro (&LIST) that contains a list of the Account Numbers to
be selected */
proc sql noprint;
select quote(trim(left(acct_num)))
into :list separated by " "
from work.tinyfile;
quit;
/* Note: The final length of the macro &LIST should not be an issue as
the max length of a macro is in the 32k range depending on the OS */
/* Now, either of the following should work */
proc sql ;
create table work.trans as
select * from transactn
where acct_num in (&list);
quit;
or
data work.trans;
set transactn (where=(acct_num in(&list)));
run ;
HTH.
Jules Bosch