Date: Tue, 7 Feb 2006 14:58:06 -0800
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: Proc IML Problems
Content-Type: text/plain; charset=iso-8859-1
--- Pete Larsen <phl7@CORNELL.EDU> wrote:
> Hi SASFolks-
> Background: I am attempting to have SAS repeatedly draw (10,000 times)
> using means and sigmas of four variables to create four distributions to
> be used later in a monte-carlo analysis. The variables are measures of
> weather and I would like to have the corresponding distributions
> based on
> the joint probabilities of each of the four measures of weather (and
> that is why the covariances were calculated in the code below). I am
> close to having this loop fully functional BUT....
> I have a couple of questions regarding PROC IML:
> 1) How should I go about having the CALL function in the IML command draw
> from A) a NORMAL distribution for two of the variables: HDD & CDD and
> B) a
> LOGNORMAL distribution for the remaining two variables: P_TTL & P_STD in
> the same Proc IML loop?
If you want to simulate from the joint distribution of your four
variables, then it would be beneficial to compute log(P_TTL) and
log(P_STD) so that you obtain ... drum roll please ... two normal
responses (call them L_P_TTL and L_P_STD) to go along with the
two normal responses that you already have, HDD and CDD. Then
you can compute the means, variances, and covariances of your four
normally distributed variables.
Now simulate the distribution of HDD, CDD, L_P_TTL, and L_P_STD.
Remember, these are all normally distributed variables. Exponentiate
L_P_TTL and L_P_STD in order to obtain simulated P_TTL and P_STD.
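
The recipe above can be sketched in a few lines. This is only an illustration in Python (standard library only), not PROC IML code, and it is not the code from the original question. The function names, the seed values, and the toy "observed" data are all made up for the example. The correlated normal draws are produced with a Cholesky factor of the sample covariance matrix, which is one common way to do it.

```python
import math
import random

def mean_vector(rows):
    # Column means of a list of equal-length tuples.
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance_matrix(rows, mu):
    # Sample covariance matrix (divisor n - 1).
    n, k = len(rows), len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]

def cholesky(a):
    # Lower-triangular L with L * L' = a; a must be positive definite.
    k = len(a)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_weather(observed, n_draws, seed=12345):
    # observed: rows of (HDD, CDD, P_TTL, P_STD) with P_TTL and P_STD > 0.
    rng = random.Random(seed)
    # Step 1: move the two lognormal variables to the log scale.
    logged = [(h, c, math.log(pt), math.log(ps)) for h, c, pt, ps in observed]
    # Step 2: means and covariance of the four (assumed normal) variables.
    mu = mean_vector(logged)
    chol = cholesky(covariance_matrix(logged, mu))
    draws = []
    for _ in range(n_draws):
        # Step 3: correlated normal draw, x = mu + L z with z ~ N(0, I).
        z = [rng.gauss(0.0, 1.0) for _ in range(4)]
        x = [mu[i] + sum(chol[i][m] * z[m] for m in range(i + 1))
             for i in range(4)]
        # Step 4: exponentiate the two log-scale columns.
        draws.append((x[0], x[1], math.exp(x[2]), math.exp(x[3])))
    return draws

# Toy stand-in for the observed weather data (invented values only).
_toy_rng = random.Random(1)
toy = [(_toy_rng.gauss(30.0, 10.0), _toy_rng.gauss(5.0, 3.0),
        _toy_rng.lognormvariate(-3.0, 1.0), _toy_rng.lognormvariate(-4.0, 1.0))
       for _ in range(50)]

sims = simulate_weather(toy, 10000)
```

Because P_TTL and P_STD are obtained by exponentiating, every simulated value in those two columns is strictly positive. HDD and CDD are left on their original scale and can still come out negative.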
This assumes, of course, that the four variables HDD, CDD, L_P_TTL,
and L_P_STD have a multivariate normal distribution. That may or,
more likely, may not be true. In order to obtain data which
are drawn from the observed joint distribution of your four
variables, you might want to change from a simulation study
to a study where you resample from the observed data.
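
A resampling study of the kind suggested above could look roughly like the following. Again this is a Python illustration rather than SAS code, and the function name, seed, and example rows are invented. Whole rows are drawn with replacement, so each draw keeps the four values that were actually observed together.

```python
import random

def resample_rows(observed, n_draws, seed=2006):
    # Draw whole observations with replacement, so that the joint
    # behaviour of (HDD, CDD, P_TTL, P_STD) in the data is preserved.
    rng = random.Random(seed)
    return [rng.choice(observed) for _ in range(n_draws)]

# Invented example rows of (HDD, CDD, P_TTL, P_STD); not real weather data.
observed = [
    (31.2, 4.1, 0.020, 0.004),
    (18.7, 9.8, 0.051, 0.012),
    (42.5, 0.6, 0.008, 0.002),
    (25.0, 6.3, 0.034, 0.009),
]

boot = resample_rows(observed, 10000)
```

Every resampled row is one of the observed rows, so no value can fall outside the range seen in the data, and no distributional assumption is needed.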
> 2) Because these are weather variables with calculated means that are
> often very small (<0.000001) with relatively large sigmas,
> some of the 10,000 drawn values are negative (esp. for P_TTL &
> P_STD). Is
> there a way to have SAS ensure that the values remain positive
> without a
> huge "lump" of values at zero that make the final normal/lognormal
> distribution of drawn weather look not so normal/lognormal, etc.
> (truncating makes some of dist. look bimodal). In short, negative
> draws do not make sense and should not be included.
Well, if you employ the approach which I suggest above of first
constructing log(P_TTL) and log(P_STD), finding the mean and
covariance structure for these two variables along with HDD and
CDD, generating four normally distributed responses and
exponentiating the columns which correspond to log(P_TTL) and
log(P_STD), then you will not obtain any nonpositive values for
P_TTL and P_STD. You could obtain some nonpositive values for HDD
and CDD. But that would rather depend on the means and variances
of these two variables. Would it be a problem for your simulations
if there were negative values for HDD and CDD?
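
To see how much the means and variances matter here: for a normal variable with mean m and standard deviation s, the chance that a single draw is zero or negative is Phi(-m/s), where Phi is the standard normal CDF. A small Python sketch (the function name is made up for the illustration):

```python
import math

def prob_nonpositive(mean, sd):
    # P(X <= 0) for X ~ Normal(mean, sd**2), i.e. Phi(-mean / sd),
    # written with the error function from the standard library.
    return 0.5 * (1.0 + math.erf(-mean / (sd * math.sqrt(2.0))))

# Multiply by the number of draws for the expected count of
# nonpositive values in a simulation run.
expected_bad = 10000 * prob_nonpositive(3.0, 1.0)
```

When the mean is three standard deviations above zero, only about 0.1% of draws are nonpositive. When the mean is close to zero relative to the standard deviation, close to half of them are.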
IF you must generate positive values for all responses AND IF the
variances of HDD and CDD are sufficiently large that you cannot
guarantee nonnegative responses for these two variables in your
simulations, THEN I would push you even further toward consideration
of a resampling study rather than a simulation study.
> Thanks for your help. Keep on keeping on.
I haven't heard that phrase for 25 years! Can't say that I have
missed it. Are you old enough to remember when it was in vogue
back in the 70's? Or is it a phrase that has been recycled and
is back in use now?
Fred Hutchinson Cancer Research Center
Ph: (206) 667-2926
Fax: (206) 667-5977