Date: Thu, 6 May 2004 02:15:44 -0400
Reply-To: Richard Ristow <wrristow@MINDSPRING.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@MINDSPRING.COM>
Subject: Re: Coding Regression
Content-Type: text/plain; charset="us-ascii"; format=flowed
This is truly ancient history in list-time, but I haven't noticed a response.
At 09:51 PM 4/23/2004, grib wrote:
>I would like to code a regression model that can be described by the
>Y is the output/crop from two plots. Each observation comes from
>either of the two plots. X1 is the amount of fertilizer from plot #1
>or #2, D1 and D2 are dummy variables set to 1 if an observation comes
>from the corresponding plot. X3 and X4 are variables controlling for
>other factors. The values of interest are the estimates of b1 and b2.
>Is there a way to code this regression in proc GLM in SAS using
>interactions? I would like to use one model for both plots as opposed
>to 2 models, one for each plot.
I'm not well up on the intricacies of GLM. This is about how to cast
your problem as a regression model -- which is actually fairly
reasonable to do. Once that's done, you can see whether estimating the
same model in GLM helps: for example, by making it easier to recognize
multi-level class variables explicitly.
You have: "Y is the output/crop from two plots." If you really have
only two observations or measurements, of course you haven't a chance:
your model has 5 degrees of freedom (b0 through b4). In principle, you
need at least 5 observations to estimate them; in reasonable practice,
25 to 50.
If you have a lot of *pairs* of plots, with a "plot #1" and "plot #2"
in each, and both plots in each pair receive the same amount of
fertilizer (X1), AND Y is observed separately for each plot in each
pair, then you have a reasonable multiple-regression model:
Y(i,j) = b0 + b1*D1(j)*X1(i) + b2*D2(j)*X1(i) + b3*X3(i) + b4*X4(i)
where 'i' indexes over the set of pairs of plots, and 'j' is 1 or 2.
(I've written the model assuming that X3 and X4 are the same within
each pair, but that could be changed.) b1 and b2 are the coefficients
whose estimates you're interested in; and
Dk(j) = 1 if j=k, 0 otherwise.
You'd rewrite this to estimate:
Y(i,j) = b0 + b1*Z1(i,j) + b2*Z2(i,j) + (the b3 and b4 terms, unchanged)
where
Z1(i,j) = X1(i) if j=1, 0 otherwise
Z2(i,j) = X1(i) if j=2, 0 otherwise
which, of course, could be computed easily in SAS.
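As an illustration of that recoding (sketched here in Python rather than SAS, purely to show the logic), each observation's X1 is placed in the column for its own plot and 0 in the other. The records and the helper name split_by_plot are made up for the example; they are not from the original question.

```python
# Hypothetical records: (pair index i, plot number j, fertilizer X1 for pair i).
# Both plots in a pair receive the same X1, as assumed in the model above.
records = [(1, 1, 10.0), (1, 2, 10.0), (2, 1, 15.0), (2, 2, 15.0)]


def split_by_plot(x1, plot):
    """Return (Z1, Z2): X1 in the column matching the plot, 0 in the other."""
    z1 = x1 if plot == 1 else 0.0
    z2 = x1 if plot == 2 else 0.0
    return z1, z2


# One design row per observation: (i, j, Z1, Z2)
rows = [(i, j) + split_by_plot(x1, j) for i, j, x1 in records]
for row in rows:
    print(row)
```

Regressing Y on Z1 and Z2 (plus the other covariates) then gives one slope per plot from a single model.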
This only makes sense if there's some characteristic defining which is
plot #1, and which is plot #2, in each pair. (If the assignment is
arbitrary, you can adjust b(1) and b(2) over quite a range by changing
which plot is which, and no estimate is better than another.)
Finally, if, as is likely, you want to test against the hypothesis that
b1 = b2, replace Z1 and Z2 in the model with
Z1'(i,j) = (Z1(i,j)+Z2(i,j))/2
Z2'(i,j) = (Z1(i,j)-Z2(i,j))/2
and test whether the coefficient of Z2' is significantly different from 0.
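A quick check of why the coefficient of Z2' is the one to look at: with the recoded variables, b1*Z1 + b2*Z2 equals (b1+b2)*Z1' + (b1-b2)*Z2', so the coefficient of Z2' is the difference between the two slopes. The sketch below (Python, with made-up slopes and fertilizer values; the function names are mine) compares the two forms numerically.

```python
def recode(z1, z2):
    """Sum/difference recoding: Z1' = (Z1+Z2)/2, Z2' = (Z1-Z2)/2."""
    return (z1 + z2) / 2, (z1 - z2) / 2


def original_fit(b1, b2, z1, z2):
    """Fertilizer contribution in the original coding."""
    return b1 * z1 + b2 * z2


def recoded_fit(b1, b2, z1, z2):
    """Same contribution written in terms of Z1' and Z2'."""
    c1, c2 = b1 + b2, b1 - b2  # coefficients implied for Z1' and Z2'
    z1p, z2p = recode(z1, z2)
    return c1 * z1p + c2 * z2p


# Hypothetical slopes and (Z1, Z2) values, for illustration only
b1, b2 = 2.0, 0.5
for z1, z2 in [(10.0, 0.0), (0.0, 10.0), (16.0, 0.0)]:
    print(original_fit(b1, b2, z1, z2), recoded_fit(b1, b2, z1, z2))
```

If b1 = b2, the Z2' coefficient (b1 - b2) is 0, which is what the test examines.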