Date: Fri, 24 Jan 2003 17:17:38 -0500
Reply-To: Shane Hornibrook <shane_sasl_nospam1@SHANEHORNIBROOK.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Shane Hornibrook <shane_sasl_nospam1@SHANEHORNIBROOK.COM>
Subject: Re: PROC SQL Question
In-Reply-To: <2fb667c8.0301240854.218ec442@posting.google.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Garland,
You've gotten great responses from Sigurd Hermansen, David L. Cassell, and
JP. I have a bit to add to their code. If you run some of the example code
using "select * from" and joining varibles with the same name you will
notice WARNINGs in the log.
Using David L. Cassell's code as the base;
proc sql noprint;
create table dm as
select * from ds1 as a left join ds2 as b
on a.subjid=b.subjid and a.lbtst=b.lbtst;
quit;
If you run the code you will get warnings similar to
"WARNING: Variable subjid already exists on file WORK.DM."
"WARNING: Variable lbtst already exists on file WORK.DM."
You can avoid these warnings by adding a RENAME to one set of variables,
and dropping the renamed variables on the created table. It is not a
required part of the SQL join, but it keeps your log file uncluttered.
The modified code looks like;
proc sql noprint;
create table dm(drop=subjid_temp lbtst_temp) as
select * from ds1 as a left join ds2
(rename=(subjid=subjid_temp lbtst=lbtst_temp))
as b
on a.subjid=b.subjid_temp and a.lbtst=b.lbtst_temp;
quit;
run;
It looks a little 'crufty', but it works, and may save you a bit of time
when reviewing your log files. On the downside, your code will run perhaps
30% slower than the non-renamed-variable SQL. This performance hit is in
addition to the overhead of using PROC SQL vs the comparable datastep.
This may become an issue if you are joining more than 500,000 records (As
written and tested on an IBM AIX RS/6000 SAS v8 .... your mileage may
vary).
In this case it seems faster (in all regards) to code and run the
equivalent data step.
--Shane
Shane Hornibrook
Mobile: (902)441-4158
shane_sasl_nospam1@shanehornibrook.com
http://www.shanehornibrook.com/
On Fri, 24 Jan 2003, Garland wrote:
> Hi,
>
> I have the following program. Can anybody give me the PROC SQL query
> for the following datastep. I am learing proc sql just started to
> convert the previously written programs to proc sql as an exercise.
>
> Thanks in advance,
> Garland
>
> Data ds1;
> Input subjid lbtst$ lowhigh $;
> Datalines;
> 1001 ALBUMIN NH
> 1001 BUN NL
> 1002 CALCIUM NH
> 1002 BUN NL
> 1002 LDH NH
> ;
>
> Data ds2;
> Input subjid studydy lbtst$ ;
> Datalines;
> 1001 0 ALBUMIN
> 1001 1 ALBUMIN
> 1001 7 ALBUMIN
> 1001 14 ALBUMIN
> 1001 0 BUN
> 1001 1 BUN
> 1001 1 URIC_ACID
> 1001 2 GLUCOSE
> 1002 1 CALCIUM
> 1002 1 BUN
> 1002 0 LDH
> 1003 2 SODIUM
> 1003 2 LDH
> ;
>
> run;
>
> proc sort data=ds1;
> by subjid lbtst;
> run;
>
> proc sort data=ds2;
> by subjid lbtst;
> run;
>
> data dm;
> merge ds1(in=aa) ds2;
> by subjid lbtst;
> if aa;
> run;
>