Date: Tue, 30 Jul 1996 20:35:24 EDT
Reply-To: whitloi1@WESTATPO.WESTAT.COM
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Ian Whitlock <whitloi1@WESTATPO.WESTAT.COM>
Subject: Re: Cartesian Product (Final Program)
Subject: Cartesian Product (Final Program)
Summary: Learn SQL
Respondent: Ian Whitlock <whitloi1@westat.com>
Richard Varron <RICH@WPC.WILPATERSON.EDU> posed the question of finding
all possible combinations of period and room. He got the answer, use
PROC SQL to take cartesian products and wrote the code in his message
attached below. It is unfortunate that he didn't use PROC SQL all the
way. (Of course if he knew SQL well enough, he wouldn't have been
asking the question in the first place. On the other hand, except for
data details and the additional variables introduced in his code, and
the use of the EXCEPT operator instead of the WHERE ... not in ...
clause; it is pretty much my original private response to him.)
Replacing his permanent file refs with work files here is a one step
SQL solution to creating the needed file.
proc sql ;
create table open as
select p.d format $day4. ,
p.period ,
r.roomid ,
int ( r.area / 16 ) as capacity ,
r.usg ,
r.roomused ,
r.studnm ,
r.roomuse
from period as p , space as r
where p.d || p.period || r.roomid not in
( select d || period || substr ( room , 1 , 6 )
from oroom
where _type_ = 28 )
order by p.d , p.period , calculated capacity
;
quit ;
Note there is none of the extra variables and files that permeate his
program. It is clear exactly what original files are needed and what
variables are needed and how they are obtained. Why did I make the
effort? I plan to use it as an example in an SQL course I am preparing
for the fall. For anyone wanting to execute the code above, here is
test data and the required format.
data period;
input d $1 period $11.;
cards ;
109:00-09:50
110:00-10:50
111:00-11:50
209:00-09:50
210:00-10:50
211:00-11:50
309:00-09:50
310:00-10:50
311:00-11:50
409:00-09:50
410:00-10:50
411:00-11:50
509:00-09:50
510:00-10:50
511:00-11:50
;
data space ;
input roomid $6. area 4. ;
retain usg ' ' roomused ' ' studnm . roomuse ' ' ;
cards ;
aaaaaa1000
bbbbbb 512
cccccc8000
dddddd 750
;
data oroom ;
input d $1. period $11. room $8. _type_ 2.;
cards ;
109:00-09:50aaaaaaxy28
309:00-09:50aaaaaaxy28
509:00-09:50aaaaaaxy28
209:00-09:50bbbbbbxy28
409:00-09:50bbbbbbxy28
210:00-10:50bbbbbbxy28
410:00-10:50aaaaaaxy28
109:00-09:50ccccccuv28
209:00-09:50ccccccuv28
309:00-09:50ccccccuv28
409:00-09:50ccccccuv28
509:00-09:50ccccccuv28
111:00-11:50aaaaaaxy38
311:00-11:50aaaaaaxy38
511:00-11:50aaaaaaxy38
;
proc format ;
value $day
'1' = "Mon."
'2' = "Tue."
'3' = "Wed."
'4' = "Thr."
'5' = "Fri."
;
run ;
Ian Whitlock <whitloi1@westat.com>
------------------------------- Message Contents -------------------------------
date sent: 30-JUL-1996 14:48:10
This is the final program that used the cartesian product.
The purpose of the program was to find all classroom NOT IN USE.
In order to get this information, I had to create a dataset of
all possible class-time-day combinations.
IN MATRIX FORM:
R X P - O = A
where R is the room matrix, P is the period matrix, O is
the room-period occupied matrix and A is the avilablity matrix.
O is generated by looking at each room at 5 minute intervals to see if it is in
use and then creating a sparse matrix.
*THIS GESTS THE LIST OF DAY-PERIOD COMINATIONS;
data period;
infile "period.dat";
input dperiod $12.;
*THIS CROSSES IT WITH ALL THE CLASSES;
*THESE ARE STORED IN ROOMINV.SPACE;
proc sql;
create table area.alltimes as
select roomid, dperiod from period, roominv.space(keep=roomid);
*THIS STEP CREATES A VARIABLE OF DAY-TIME-ROOM WHEN ROOMS CAN BE USED;;
data area.aroom;
set area.alltimes;
proom=dperiod || roomid;
proc sort data=area.aroom; by proom;
*THIS CREATES A DATA SET OF ALL ROOMS IN USE FOR EACH DAY/PERIOD;
*THIS IS GENERATED BY ANOTHER PROGRAM WHICH CHECKS ROOM USE EVERY 5 MINUTES;
*I HAD TO CONVERT VARIABLES TO THE SAME LENGTH AND TYPE;
data area.oroom;
length roomid $6 dperiod $12 proom $18;;
set cohort.roomsum(where=(_type_=28));
roomid=substr(room,1,6);
if d=1 then q="1";
if d=2 then q="2";
if d=3 then q="3";
if d=4 then q="4";
if d=5 then q="5";
dperiod= q || trim(period);
proom=dperiod || roomid;
flag="X";
proc sort; by proom;
*HERE I SUBTRACTED OUT THOSE ROOMS IN USE FROM THOS AVIALABLE;
data area.openroom;
merge area.aroom(in=one drop=dept) area.oroom(in=two); by proom;
if two or flag="X" then delete;
proc sort; by roomid;
proc sort data=roominv.space out=space; by roomid;
*HERE I INJECTED BACK THE ROOM CAPACITY ON OPEN ROOMS;
data AREA.OPEN;
merge area.openroom(in=one) space(in=two drop=dept); by roomid;
if one;
capacity=area/16;
daya=substr(dperiod,1,1);
if daya="1" then day="1. MON";
if daya="2" then day="2. TUE";
if daya="3" then day="3. WED";
if daya="4" then day="4. THU";
if daya="5" then day="5. FRI";
period=substr(dperiod,2,11);
*ROOM TYPE 110 IS CLASSROOM;
proc sort data=area.open; by day period capacity;
proc print data=area.open(where=(roomuse in (110))); by day period; id day
period;
var capacity roomid usg roomused studnm roomuse ;
Richard Varron, Programmer I / SAS Representative
******************************************************************
* Data Processing / Office of Planning, Research and Evaluation *
* Adjunct Professor, English Department, WPC *
* William Paterson College, 300 Pompton Rd., Wayne, NJ 07470 *
* INTERNET: rich@wpc.wilpaterson.edu *
* HTTP://WWW.WILPATERSON.EDU/COURSES/ENG150/INTOLIT.HTM *
******************************************************************