LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (December 2008, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Thu, 11 Dec 2008 17:37:44 -0800
Reply-To:   stringplayer_2@yahoo.com
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:   Basic matrix operations in data steps and procedures
Content-Type:   text/plain; charset=us-ascii

Great news for those who have wanted to have the ability to perform basic matrix operations within a SAS data step or within SAS procedures that allow programming statements (like the NLMIXED procedure, the experimental MCMC procedure, and several others). With version 9.2, there is some basic matrix operation functionality available.

Let me try to describe how to use this functionality. It is a bit convoluted since the functionality depends on whether you are invoking the matrix operators from within a procedure or within a data step.

Procedures which allow programming statements allow ARRAY statements that appear very much the same as data step ARRAY statements. There are a few differences, though, between procedural arrays and data step arrays. I can't say that I fully understand the differences between procedural arrays and data step arrays, but one of the differences would appear to be that procedural arrays are always temporary arrays. If you try to invoke the matrix operators in a data step employing an array which is not a temporary array, then the matrix operations will fail.

There are other differences between data step and procedural arrays. I don't know whether it is due to those other differences or not, but the matrix operators that are directly available to a SAS procedure are not directly available to the data step. In order to make the matrix operators available to a data step, one has to write a subroutine using the FCMP procedure which takes input from a data step array and passes that input on in the form required for the matrix functions.

Below are examples. I will demonstrate a matrix inversion operator. Note that in order to use the matrix operators at all, one must specify the option

options cmplib=<library>;

I have specifically invoked this as:

options cmplib=sasuser.funcs;

Only after that option has been specified are the matrix operators available.

Now, suppose that we are writing code for a PROCedure and that we want to invert a covariance matrix. The covariance is a square matrix, which we place in a two-dimensional array which has common number of elements in both dimensions. Thus, we must have first defined an array and assigned values to the elements of the array. For an AR(1) covariance structure and a covariance matrix of size 4, we might have the following statements:

array cov [4, 4]; do i=1 to 4; do j=1 to 4; cov[i,j] = rho**(abs(i-j))*V; end; end;

Again, please note that this code is intended for a procedure, not a data step. The ARRAY statement does not specify that we have a _TEMPORARY_ array.

Now, we can invert the covariance matrix using the INV() subroutine. We might like to place the inverse in another square matrix, so we will first define another array and then invoke the INV() subroutine:

array inv_cov [4,4]; call inv(cov, inv_cov);

The INV function takes two arguments: 1) the name of the matrix which is to be inverted, and 2) the name of the resultant matrix.

If we try this code in a data step, it will fail with a message that the INV subroutine is unknown. Note, too, that there is a note in the code below of an illegal array reference when the statement CALL INV(V,V); is encountered.

data _null_; array v [3,3] _temporary_; v[1,1]=1.00; v[1,2]=0.50; v[1,3]=0.25; v[2,1]=0.50; v[2,2]=1.00; v[2,3]=0.50; v[3,1]=0.25; v[3,2]=0.50; v[3,3]=1.00; call inv(v,v); put v[1,1]= v[1,2]= v[1,3]= / v[2,1]= v[2,2]= v[2,3]= / v[3,1]= v[3,2]= v[3,3]= /; run;

In order to make the INV function available in a data step, one must write a subroutine using the FCMP procedure. The subroutine is essentially a pass-thru routine. We construct a subroutine INVERT which accepts a matrix as input, invokes the INV subroutine, and returns the second argument to the INV routine as output.

proc fcmp outlib=sasuser.funcs.mat; subroutine invert(Mat[*,*], InvMat[*,*]); outargs InvMat; call inv(Mat, InvMat); endsub; run; quit;

Now, the subroutine INVERT is available to a data step.

data _null_; array v [3,3] _temporary_; v[1,1]=1.00; v[1,2]=0.50; v[1,3]=0.25; v[2,1]=0.50; v[2,2]=1.00; v[2,3]=0.50; v[3,1]=0.25; v[3,2]=0.50; v[3,3]=1.00; call invert(v,v); put v[1,1]= v[1,2]= v[1,3]= / v[2,1]= v[2,2]= v[2,3]= / v[3,1]= v[3,2]= v[3,3]= /; run;

(Try the above code without the _TEMPORARY_ specification and see what happens. In the DATA step, _TEMPORARY_ is required!)

So, now it may be of interest to see whether the matrix inversion routine produces a generalized inverse. Suppose that we have a covariance matrix which has a row and column which are zero. Will the INVERT (invoking the INV() routine) generate a generalized inverse or will the matrix inversion fail?

data _null_; array v [3,3] _temporary_; v[1,1]=1.00; v[1,2]=0.50; v[1,3]=0; v[2,1]=0.50; v[2,2]=1.00; v[2,3]=0; v[3,1]=0; v[3,2]=0; v[3,3]=0; call invert(v,v); put v[1,1]= v[1,2]= v[1,3]= / v[2,1]= v[2,2]= v[2,3]= / v[3,1]= v[3,2]= v[3,3]= /; run;

It is seen from above that the matrix inversion algorithm is not a generalized inverse algorithm. If different subjects had different sized covariance matrices that we wanted to invert, then we would need to define arrays of every size that one might encounter in the data, query how many observations the subject has, write the correct sized matrix, and invert the correct sized matrix.

The code below shows the basic idea where there are two different sized matrices. If the data are stored first in the largest possible matrix and if that matrix is too large to fill completely, then we must fill a smaller matrix that can be filled completely and invert the smaller matrix.

data _null_; array v3_3 [3,3] _temporary_; v3_3[1,1]=1.00; v3_3[1,2]=0.50; v3_3[1,3]=0.25; v3_3[2,1]=0.50; v3_3[2,2]=1.00; v3_3[2,3]=0.50; v3_3[3,1]=0.25; v3_3[3,2]=0.50; v3_3[3,3]=1.00; call invert(v3_3,v3_3); put v3_3[1,1]= v3_3[1,2]= v3_3[1,3]= / v3_3[2,1]= v3_3[2,2]= v3_3[2,3]= / v3_3[3,1]= v3_3[3,2]= v3_3[3,3]= /;

v3_3[1,1]=1.00; v3_3[1,2]=0.50; v3_3[1,3]=0; v3_3[2,1]=0.50; v3_3[2,2]=1.00; v3_3[2,3]=0; v3_3[3,1]=0; v3_3[3,2]=0; v3_3[3,3]=0; array v2_2 [2,2] _temporary_; do i=1 to 2; do j=1 to 2; v2_2[i,j] = v3_3[i,j]; end; end; call invert(v2_2,v2_2); put v2_2[1,1]= v2_2[1,2]= / v2_2[2,1]= v2_2[2,2]=; run;

The matrix operators which are available in 9.2 (all invoked employing a CALL statement) are:

ADDMATRIX(M1,M2, result) /* M1, M2 and result are m x n */ DET(M, det) /* M is m x m, det is scalar */ ELEMMULT(M1,M2, result) /* M1, M2 and result are m x n */ EXPMATRIX(M, t, result) /* M and result are m x m, t is scalar */ IDENTITY(M) /* M is m x m and becomes I(m) */ INV(M1, M2) /* M1 and M2 are m x m */ MULT(M1,M2, result) /* M1 m x n, M2 n x p, result m x p */ POWER(M, t, result) /* M and result are m x m, t is scalar */ SUBTRACTMATRIX(M1,M2, r) /* M1, M2, and r(esult) are m x n */ TRANSPOSE(M1, M2) /* M1 is m x n, M2 is n x m */ ZEROMATRIX(M1) /* M1 is m x n, becomes zero filled */

I know others have asked at times about matrix capabilities available in a data step. I hope this is sufficient to get people started.

Dale

--------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra@NO_SPAMfhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 ---------------------------------------


Back to: Top of message | Previous page | Main SAS-L page