Date: Thu, 6 May 2004 02:15:44 -0400
Reply-To: Richard Ristow
Sender: "SAS(r) Discussion"
From: Richard Ristow
Subject: Re: Coding Regression
To: grib
In-Reply-To: <3ac7e83b.0404231751.5da6dae9@posting.google.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed

This is truly ancient history in list-time, but I haven't noticed a response.

At 09:51 PM 4/23/2004, grib wrote:

>I would like to code a regression model that can be described by the
>following example:
>Y=b0+D1*b1*X1+D2*b2*X1+b3*X2+b4*X3
>
>Y is the output/crop from two plots. Each observation comes from
>either of the two plots. X1 is the amount of fertilizer from plot #1
>or #2, D1 and D2 are dummy variables set to 1 if an observation comes
>from the corresponding plot. X3 and X4 are variables controlling for
>other factors. The values of interest are the estimates of b1 and b2.
>
>Is there a way to code this regression in proc GLM in SAS using
>interactions? I would like to use one model for both plots as opposed
>to 2 models, one for each plot.

I'm not well up on the intricacies of GLM, so this is about how to cast your problem as a regression model -- which is actually fairly reasonable to do. Once that's done, you can see whether estimating the same model in GLM helps: for example, whether it makes it easier to handle multi-level class variables explicitly.

You have: "Y is the output/crop from two plots." If you really have only two observations or measurements, of course you haven't a chance: your model has 5 parameters to estimate (b0 through b4), so in principle you need at least 5 observations and, in reasonable practice, 25 to 50.

If you have a lot of *pairs* of plots, with a "plot #1" and "plot #2" in each, and both plots in each pair receive the same amount of fertilizer (X1), AND Y is observed separately for each plot in each pair, then you have a reasonable multiple-regression model:

Y(i,j) = b0 + b(1)*D1(j)*X1(i) + b(2)*D2(j)*X1(i) + b3*X2(i) + b4*X3(i)

where 'i' indexes over the set of pairs of plots, and 'j' is 1 or 2. (I've written the model assuming that X2 and X3 are the same within each pair, but that could be changed.) b(j), j=1,2, are the b1 and b2 whose estimates you're interested in; and

Dk(j) = 1 if j=k, 0 otherwise.

You'd rewrite this to estimate:

Y(i,j) = b0 + b(1)*Z1(i,j) + b(2)*Z2(i,j) + b3*X2(i) + b4*X3(i)

where

Z1(i,j) = X1(i) if j=1, 0 otherwise
Z2(i,j) = X1(i) if j=2, 0 otherwise

which, of course, could be computed easily in SAS.
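
As a minimal sketch of that computation, and not a tested program: suppose the data are in a data set called PLOTS with one row per plot per pair, and variables Y, PLOT (coded 1 or 2), X1, X2 and X3. Those data set and variable names are assumptions made up for illustration; nothing in the original question fixes them.

```sas
/* Hypothetical input: data set PLOTS, one row per plot per pair,   */
/* with variables Y, PLOT (1 or 2), X1, X2, X3. Names are assumed.  */
data plots_z;
  set plots;
  z1 = x1 * (plot = 1);   /* X1 when the row is plot #1, else 0 */
  z2 = x1 * (plot = 2);   /* X1 when the row is plot #2, else 0 */
run;

proc reg data=plots_z;
  /* coefficients of z1 and z2 are the estimates of b(1) and b(2) */
  model y = z1 z2 x2 x3;
run;
quit;
```

The comparisons (plot = 1) and (plot = 2) evaluate to 1 or 0 in a DATA step, so each product is either X1 or 0, as in the definitions of Z1 and Z2.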

This only makes sense if there's some characteristic defining which is plot #1, and which is plot #2, in each pair. (If the assignment is arbitrary, you can adjust b(1) and b(2) over quite a range by changing which plot is which, and no estimate is better than another.)

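Finally, if, as is likely, you want to test the hypothesis that b(1)=b(2), re-cast the regressors as:

Z1'(i,j) = (Z1(i,j)+Z2(i,j))/2
Z2'(i,j) = (Z1(i,j)-Z2(i,j))/2

and test whether the coefficient of Z2' is significantly different from 0. (Since Z1 = Z1'+Z2' and Z2 = Z1'-Z2', the coefficient of Z1' is b(1)+b(2) and the coefficient of Z2' is b(1)-b(2).)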
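
A sketch of that re-cast model, under the same kind of assumptions: a hypothetical data set PLOTS with variables Y, PLOT (1 or 2), X1, X2 and X3, all names invented for illustration. ZC1 and ZC2 stand in for Z1' and Z2', since a prime is not allowed in a SAS variable name.

```sas
/* Hypothetical input: data set PLOTS with Y, PLOT (1 or 2), X1, X2, X3. */
data plots_c;
  set plots;
  z1  = x1 * (plot = 1);   /* X1 for plot #1, 0 otherwise */
  z2  = x1 * (plot = 2);   /* X1 for plot #2, 0 otherwise */
  zc1 = (z1 + z2) / 2;     /* Z1': equals X1/2 for every row        */
  zc2 = (z1 - z2) / 2;     /* Z2': +X1/2 for plot #1, -X1/2 for #2  */
run;

proc reg data=plots_c;
  /* coefficient of zc2 estimates b(1) - b(2) */
  model y = zc1 zc2 x2 x3;
run;
quit;
```

The t test that PROC REG prints for ZC2 in its parameter estimates is then the test of b(1)=b(2).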
