| Date: | Mon, 8 Dec 2008 11:49:34 -0800 |
| Reply-To: | Steve James <spj1@CDC.GOV> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Steve James <spj1@CDC.GOV> |
| Organization: | http://groups.google.com |
| Subject: | Re: error:out of mempry |
|
| Content-Type: | text/plain; charset=ISO-8859-1 |
In addition to the Hash object, you might try creating a format of all
the valid record identifiers and using that to identify the records
you want. I came across the following code last week and it was fresh
in my mind.
proc sql;
create table eventfmt as
select distinct
put(event_id,6.) as start format=$6.
,"event" as fmtname
,"yes" as label format=$6.
from
unique_ids
insert into eventfmt set start='other' ,
fmtname="event" ,label='not' ;
quit;
proc format cntlin=eventfmt;
run;
/* grab all records for event_id's */
data merged ;
set duplicates(where=(put(event_id,event.) = 'yes'));
run;
Steve
On Dec 8, 3:21 am, rams <rameshg...@gmail.com> wrote:
> Hi all,
>
> I have a dataset comprising of around 200 Million records. These
> records contain duplicates. I have another dataset with around 500K
> unique records. Now, I have to create a new dataset by merging these
> two datasets and get the match of these 500K unique records from the
> 200Million records in other dataset.
>
> Solutions I tried my self are :
> 1. Proc Sql + Left join - failed after running for around 4 hours with
> Out Of Memory issue.
>
> PROC SQL; CREATE TABLE MERGED AS
> SELECT *
> FROM firstdataset AS A LEFT JOIN VISITS AS B
> ON A.HRN=B.CHART
> ;QUIT;
>
> 2. Used proc sort and sas data step merge statement. Again failed
> aftert 4 hours apporx with OOM error.
>
> Thanks,
> Ramesh G
|