Date: Thu, 14 Jul 2005 15:30:53 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: FW: proc sql union problem
Content-Type: text/plain; charset="us-ascii"
Pura:
Often it helps to review the information your data contain and ask what
does not appear in data that would make queries simpler and more
precise. In this case, a time stamp on each row would support selection
of rows last added to a group of rows for a cycle. Proper database
design puts information that a programmer needs in data.
Since you would have to rely on the sequence of rows to collapse
multiple cycle rows in b to one row, at least add a sequence column
variable to a and b.
In database terms, your problem occurs routinely in database updates. If
updating using one update transaction at a time, you would first check
to see if the table to be updated contains a row with the same key value
(x) as the transaction; if so, update target column (y); else insert the
row.
I've included a simple (though not efficient for large-scale updates)
example:
data a;
input x $ y $ ;
source='a';
seq=_N_;
cards;
cycle1 2
cycle2 3
cycle3 4
cycle4 9
;run;
data b;
input x $ y $ ;
source='b';
seq=_N_;
cards;
cycle4 5
cycle5 6
;run;
proc sql;
update a
set y=(select y from b
where a.x=b.x)
where exists (select * from b
where a.x=b.x)
;
create table c as
select * from b
where x not in (select x from a)
;
insert into a
select * from c
group by x
having seq=max(seq)
;
drop table c
;
quit;
Sig
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of Pura Bellido
Sent: Thursday, July 14, 2005 12:43 AM
To: sas-l@uga.edu
Subject: proc sql union problem
in the following sample I would like to use the data in "data a" to be
replaced by any new data in "data b"
data a;input x $ y $ ;cards;
cycle1 2
cycle2 3
cycle3 4
cycle4 9
;run;
data b;input x $ y $ ;cards;
cycle4 5
cycle5 6
;run;
proc sql;create table new as select *
from a union all corresponding
select * from b
;quit;run;
data _null_;set new;put(_all_)(=);run;
thus the item "cycle4" should have a value y as "5" and not both values
"9" and "5"
how might I change the sql select statement?
additionally, it is possible that a value for "x" in data a has multiple
values, like this cycle1 2 cycle2 3 cycle3 4 cycle4 9 cycle4 3
the desired goal would be to replace any values from "data b" so in this
case, the last two values for "cycle4" containing x=9 and x=3 would be
replaced with a single row where x=5 from "data b"