Date:         Mon, 11 Sep 2000 16:44:42 GMT
Reply-To:     Dale McLerran <dmclerra@FHCRC.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <dmclerra@FHCRC.ORG>
Organization: Fred Hutchinson Cancer Research Center
Subject:      Re: Programming Problem - Multidimensional Arrays

dcaputo,

I would drop the multidimensional array approach. You are presently
computing N**2 as the number of variables in the multidimensional
array. Instead of passing N**2, why not compute and pass N*(N-1)/2
as the number of interactions? When constructing the interactions,
loop from I=1 to &numvar-1 and J=I+1 to &numvar. You also need a
counter that keeps track of how far into the one-dimensional array
you are with the current interaction term. Here is some revised code.
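
    /* number of distinct pairs (i,j) with i < j */
    data _null_;
        numint = &numvar*(&numvar-1)/2;
        /* explicit PUT avoids a numeric-to-character conversion note */
        call symput('numint', trim(left(put(numint, 8.))));
    run;

    data logit.abc;
        set logit.ss;
        array itr{&numint} itr1 - itr&numint;
        array col{&numvar} col1 - col&numvar;
        k = 0;                      /* position in the itr array */
        do i = 1 to &numvar-1;
            do j = i+1 to &numvar;
                k = k+1;
                itr{k} = col{i}*col{j};
            end;
        end;
    run;
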
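
To see how the counter walks through the pairs, here is a small trace
that only writes to the log. It is an illustration, not part of the
program: N = 4 is an assumed size (your post lists four covariates),
and no dataset is read or written.

```sas
/* Illustration only: n = 4 is an assumed example size */
data _null_;
    n = 4;
    k = 0;
    do i = 1 to n - 1;
        do j = i + 1 to n;
            k = k + 1;
            put k= i= j=;   /* itr{k} would hold col{i}*col{j} */
        end;
    end;
run;
```

The log shows six lines, from k=1 i=1 j=2 through k=6 i=3 j=4, which
is N*(N-1)/2 = 6 interaction terms for N = 4.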
Although you have not asked about this, let me make a few more comments
on your code. The process you use to determine the number of variables
on the right-hand side of your model uses far more resources than
necessary. Rather than transposing the entire dataset logit.Z_inter,
why not transpose just one observation from it?

    proc transpose data=logit.Z_inter(obs=1) out=numvar(drop=_name_);
        var &final;
    run;

Later in your code, you use the transpose of dataset NUMVAR (logit.ss)
as input to a DATA step where you construct the interaction effects.
But the transpose of NUMVAR just returns the &final columns of
logit.Z_inter, renamed COL1-COL&numvar. So DATA step logit.abc can
read logit.Z_inter directly rather than logit.ss, provided the COL
array lists &final instead of col1 - col&numvar. That gets rid of one
proc transpose entirely and reduces the other to a trivial problem,
which should really speed up your processing.
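
If logit.abc reads logit.Z_inter directly, the variables keep their
original names, so the COL array has to list &final rather than
col1 - col&numvar. Here is a minimal sketch of that variant. The
dataset WORK.TOY and its values are hypothetical stand-ins for
logit.Z_inter, the macro variables are set by hand, and %EVAL is
swapped in for the DATA _NULL_ step purely to keep the example short.

```sas
/* Assumed setup, mirroring the macro variables in the original post */
%let final  = var1 var2 var3 var4;
%let numvar = 4;
%let numint = %eval(&numvar*(&numvar-1)/2);   /* N*(N-1) is always even */

/* Hypothetical stand-in for logit.Z_inter */
data work.toy;
    input ins var1-var4;
    datalines;
1 1 2 3 4
0 2 0 1 5
;
run;

data work.abc;
    set work.toy;
    array itr{&numint} itr1 - itr&numint;
    array col{&numvar} &final;      /* original names, not col1-colN */
    k = 0;
    do i = 1 to &numvar-1;
        do j = i+1 to &numvar;
            k = k+1;
            itr{k} = col{i}*col{j};
        end;
    end;
    drop i j k;                     /* loop helpers are not wanted in the output */
run;
```

The target variable stays on the output dataset this way, which it
would not if the data went through the two transposes.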
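
In addition, the statement which appears in your SQL code

    %let numvar=&numvar;

is there only to strip the leading blanks that INTO :numvar leaves on
the value. Without some form of trimming, col&numvar would not resolve
to a valid variable name, so drop the statement only if you trim the
value another way.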
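
For reference, when INTO stores a single numeric result it keeps the
leading blanks, and those are what the %LET re-assignment strips. Here
is a sketch of two ways to get a clean count without it. WORK.NUMVAR
and the variable list are hypothetical stand-ins for your data, and
COUNTW is an assumption about your release (it needs SAS 9 or later).

```sas
/* Hypothetical stand-ins for the &final list and the NUMVAR dataset */
%let final = var1 var2 var3 var4;
data work.numvar;
    do row = 1 to 4;
        output;
    end;
run;

/* Option 1: count the names in the list itself, no transpose needed */
%let numvar = %sysfunc(countw(&final));

/* Option 2: keep the query; SEPARATED BY makes INTO trim the value */
proc sql noprint;
    select count(*) into :numvar2 separated by ' '
    from work.numvar;
quit;

%put numvar=[&numvar] numvar2=[&numvar2];
```

The brackets in the %PUT make any leftover blanks visible in the log.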
Dale
--------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
Seattle, WA 98109
mailto:dmclerra@fhcrc.org
ph: (206) 667-2926
fax: (206) 667-5977
--------------------------------------
dcaputo@banet.net wrote in <39BC2AAC.B0E32A67@banet.net>:
>The goal is to take a dataset with N columns and to form N(N-1)/2 other
>columns that represent the 2-Way interactions of the variables in the
>original N columns. The program I have written so far creates N*N
>variables or new columns with a DO LOOP within a DO LOOP. Picture this
>as an N by N matrix, Xij. I want to change the program to output only
>those columns Xij where i > j. I tried a DROP in the DO LOOP but it
>doesn't work.
>
>
>
> HELP!
>
>
>/* target variable */
>
> %let target = ins;
>
>
>/* final model covariates */
>
>%let final = var1 var2 var3 var4;
>
>
>
>/* create SAS dataset with reduced variables only */
>
>data logit.Z_inter;
> set logit.A_ins (keep = &target &final);
>run;
>
>
>proc transpose data=logit.Z_inter out=numvar (drop=_name_);
> var &final;
>run;
>
>
>
>/* compute the number of variables in the final model and place in macro
>variable */
>
> proc sql noprint;
> select
> count(*)
>
> into
> :numvar
>
> from
> numvar;
>
> %let numvar = &numvar;
>quit;
>
>
>
>
>proc transpose data=numvar out=logit.ss (drop=_name_);
>run;
>
>
>
>/* compute number of elements in the square matrix */
>
>
>data _null_;
> numint = &numvar*&numvar;
> call symput('numint',trim(left(numint)));
>run;
>
>
>
>data logit.abc;
> set logit.ss;
> array itr{&numvar,&numvar} itr1 - itr&numint;
> array col{&numvar} col1 - col&numvar;
>
> do i = 1 to &numvar;
>
> do j = 1 to &numvar;
>
> itr{i,j} = col{i}*col{j};
>
> end;
> end;
>run;