Kevin,

I am not sure whether proc sql is the most
efficient way for this case, even though I am
a sql fan. Some data step and proc sort will
do the job though:

data one;
input id b_diag1 $ b_diag2 $ b_diag3 $;
cards;
1 a b c
2 c d e
3 f g h
;
run;

data two;
input code $;
cards;
a
h
j
;
run;

proc sort;
by code;

data one (keep=id code);
 set one;
array bd $ b_diag1-b_diag3;
do over bd;
code=bd;
output;
end;

proc sort;
by code id;

data keep;
  merge one (in=one) two (in=two);
by code;
if one and two;

proc sort data=keep nodupkey;
by id;

proc print;
run;

-----------------------
  OBS    ID    CODE

   1      1     a 
   2      3     h 


regards,

Ya Huang


-----Original Message-----
From: kviel [mailto:kviel@GMCF.ORG]
Sent: Monday, January 08, 2001 10:22 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Efficiency in SQL


I have a dataset with multiple diagnoses for each observation, most of which
are missing.  I have a second dataset of codes meeting my case definition.
I decided to use the SQL procedure to identify any observation with at least
one diagnosis that meets the case definition.  I have written the following,
but wonder if I have created a resource intensive beast.  Thankfully, there
are only ten diagnosis fields per observation and no more than 500 case
definition codes.  However, there are millions of observations.  I thought
if there might be an equivalent to an ARRAY[] that I would not need the OR's
and would only need to SELECT * FROM two once.
Thanks in advance for any ideas.  Below is a sample:

data one;
input id b_diag1 $ b_diag2 $ b_diag3 $;
cards;
1 a b c
2 c d e
3 f g h
;
run;

data two;
input code $;
cards;
a
h
j
;
run;

proc sql;
select id
from one
where b_diag1 in (select * from two) or
      b_diag2 in (select * from two) or
        b_diag3 in (select * from two)
;
quit;

Kevin


Kevin Viel
Georgia Medical Care Foundation
57 Executive Park South, NE
suite 200
Atlanta, GA  30329-2224


------------------------
CONFIDENTIALITY NOTICE: This e-mail transmission, and any documents, files
or previous e-mail messages attached to it may contain proprietary,
privileged or confidential information.  If you are not an intended
recipient, or a person responsible for delivering it to the intended
recipient, you are hereby notified that any disclosure, copying,
distribution or use of any of the information contained in or
attached to this transmission is STRICTLY PROHIBITED.  If you have
received this transmission in error, please immediately notify me by
reply e-mail and destroy the original transmission and its
attachments without saving them in any manner.
-----------------------