|
Christine,
Since you are using a STRATA statement, you apparently have a
matched case/control design. You also indicate that you have
80,000 case records and 80,000 control records, which
suggests further that you might have a 1:1 matched study.
If so, you can restructure your data so that you can
use a simple logistic regression. That should solve your
out-of-memory problem.
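Before restructuring, it may be worth confirming that every stratum really holds exactly one case and one control. The following is only a sketch, assuming the data set and variable names shown in your post; the table name stratum_counts is made up for illustration.

```sas
/* Count cases and controls within each stratum (subjid).      */
/* stratum_counts is a hypothetical name, not from your post.   */
proc sql;
   create table stratum_counts as
   select subjid,
          sum(case_flag = 1)  as n_cases,
          sum(case_flag ne 1) as n_controls
   from outf.tendon_short
   group by subjid;

   /* Tabulate the matching ratios that actually occur.         */
   /* A pure 1:1 design gives a single row: n_cases=1,          */
   /* n_controls=1.                                             */
   select n_cases, n_controls, count(*) as n_strata
   from stratum_counts
   group by n_cases, n_controls;
quit;
```

If more than one row comes back, the second query also shows the largest numbers of cases and controls found in any stratum.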
So, if you have a 1:1 matched design, here is what you can
do. First, merge the matched case and control records
by stratum (subjid), renaming the exposure variable so that
each merged record has a case exposure variable and a control
exposure variable. Next, compute the difference between the
two exposure values (case minus control). At the same time,
construct a new response variable which has value 1
for ALL records.
With the restructured data, you can fit the conditional
logistic regression model for the 1:1 matched design without
the need for a STRATA statement. You fit the model as an
ordinary logistic regression WITHOUT AN INTERCEPT, using
the difference of the exposure variables as the predictor
variable.
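A short sketch of the algebra shows why this works. The symbols are introduced here for illustration only: for pair i, x_case,i and x_control,i are the two exposure values, d_i is their difference, and beta is the log odds ratio. The conditional likelihood contribution of a 1:1 matched pair is

```latex
L_i(\beta)
  = \frac{\exp(\beta x_{\mathrm{case},i})}
         {\exp(\beta x_{\mathrm{case},i}) + \exp(\beta x_{\mathrm{control},i})}
  = \frac{1}{1 + \exp(-\beta d_i)},
\qquad d_i = x_{\mathrm{case},i} - x_{\mathrm{control},i}.
```

The right-hand side is the probability that an intercept-free logistic model with predictor d_i assigns to response = 1. Pairs with d_i = 0 contribute a constant 1/2 and carry no information about beta.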
Code for all of this (using the data set and variables shown
in your post) would be:
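   /* Sort so that each stratum's records are adjacent for the merge. */
   proc sort data=outf.tendon_short out=tendon_short;
      by subjid;
   run;

   /* Put each matched pair on one record: case exposure and control
      exposure side by side, plus their difference. */
   data matched_logistic_reg;
      merge tendon_short(where=(case_flag=1)
                         rename=(exposure=exposure_case))
            tendon_short(where=(case_flag^=1)
                         rename=(exposure=exposure_control));
      by subjid;
      exposure_diff = exposure_case - exposure_control;
      response = 1;   /* constant response for every pair */
   run;

   /* Ordinary logistic regression, no intercept, on the difference. */
   proc logistic data=matched_logistic_reg;
      model response = exposure_diff / noint;
   run;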
This approach is described by Hosmer and Lemeshow in the
chapter on matched studies in their book "Applied Logistic
Regression". If you have M:N matching, that is a different
kettle of fish. But let's start with the simple assumption,
because I suspect that it will meet your need.
By the way, if you do have M:N matching, so that the above
solution will not work for you, then post back to the list
specifying the maximum values of M and N across all strata.
We should be able to write code for fitting a conditional
logistic regression using PROC NLMIXED. But we would again
need to restructure the data so that all of the case and
control records in a stratum are on a single record, and
NLMIXED would require a fair bit of programming to construct
the likelihood. I would rather not go there unless it is
necessary.
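For reference, the general form of the conditional likelihood shows where the programming effort comes from. The notation is introduced here for illustration only: R_i is the set of all M_i + N_i records in stratum i, C_i is the set of its M_i cases, and x_ij is the exposure of record j.

```latex
L_i(\beta)
  = \frac{\exp\!\bigl(\beta \sum_{j \in C_i} x_{ij}\bigr)}
         {\sum_{S \subseteq R_i,\; |S| = M_i}
            \exp\!\bigl(\beta \sum_{j \in S} x_{ij}\bigr)}
```

The denominator sums over every way of choosing M_i records from the stratum, that is, C(M_i + N_i, M_i) terms, which is why the maximum values of M and N matter.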
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
--- On Thu, 1/7/10, Christine Peloquin <christinepeloquin1@GMAIL.COM> wrote:
> From: Christine Peloquin <christinepeloquin1@GMAIL.COM>
> Subject: proc logistic: 'out of memory'
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Thursday, January 7, 2010, 7:01 AM
> hello.
>
> i just started a job at BU. i am running proc logistic on a dataset with
> 160,000 observations (80,000 cases and 80,000 controls) - and am receiving
> an 'out of memory' message. here is the code that i am running:
>
> proc logistic data=outf.tendon_short;
> class exposure (ref='0') / param=ref;
> strata subjid;
> model case_flag (event='1') = exposure;
> run;
>
> both the case_flag and exposure variables are dichotomous (numeric
> variables; values: 0/1). the subjid is a 11-char variable.
>
> would anyone have a suggestion of how i could resolve this or what i should
> be looking at to further debug?
>
> endless thanks.
> christine
>
|