Date: Tue, 15 Oct 2002 09:30:28 -0400
Reply-To: Magnus Mengelbier <magnus.mengelbier@FERRING.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Magnus Mengelbier <magnus.mengelbier@FERRING.COM>
Subject: Re: Random Selection
Hello Richard
Welcome ... I have found the need to simulate data on almost a monthly
basis to test this or that.
I would set it up in a couple of steps...
* First I would set up one or a few formats. One format you might want is
the housing type, i.e. flat, etc. Another might be your age categories, just
for readability.
* Then I would simulate your ages, that is, simulate your population by age
using whatever distribution you would like or need. We have ranuni, ranexp,
ran-whatever... so it all depends on what your population "should" look like
and how much variation from that you want to look at. Might be a good idea to
get this done as a separate step before we jump on the domicile.
* Next step, using categories on the age, I would use rantbl(), since you
wanted the age category to be a little factor in the probability of each
domicile. rantbl(seed, p1, ..., pn) returns i with probability pi.
I would start by doing your age categories as if-statements,
since it can get quite messy, and then move on to some funny algorithms if
you feel the need.
HTH
Magnus
/*------- some very untested code ---------------*/
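proc format library=work;
  value domicile 1 = 'flat'
                 2 = 'house'
                 3 = 'bungalow';
  /* -- "-<" excludes the upper endpoint, so 35 and 55 each fall in
        exactly one range, matching the if-statements further down -- */
  value agecat low -< 35 = 'say we are young'
               35 -< 55  = 'two BMWs and a house with 1 dog and 2 kids'
               55 - high = 'we can get back at our parents for spoiling the kids';
run;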
data ages;
  do subject = 1 to 30000;
    /* -- RANUNI() -- all ages have the same
          probability of occurring; 74 integer ages, 17 to 90 -- */
    age = floor(17 + 74*ranuni(date()));
    output;
  end;
run;
data work.domicile;
  set work.ages;
  /* -- fixed seed instead of date(): reusing the seed from the ages step
        would replay the same uniform stream and tie domicile to age -- */
  /* -- say we are young until we are 35 -- */
  if (17 <= age < 35) then domicile = rantbl(4321, 0.5, 0.25, 0.25);
  /* -- two BMWs and a house with 1 dog and 2 kids -- */
  else if (35 <= age < 55) then domicile = rantbl(4321, 0.25, 0.5, 0.25);
  /* -- the golden years... we can get back at our parents for spoiling
        the kids -- */
  else if (55 <= age <= 90) then domicile = rantbl(4321, 0.25, 0.25, 0.5);
run;
data work.readable;
set work.domicile;
format domicile domicile. age agecat.;
run;
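/* -- optional check, not one of the steps above: cross-tabulate age
      category by domicile. PROC FREQ groups on the formatted values,
      and the row percents should sit near the rantbl() probabilities
      chosen for each age band -- */
proc freq data=work.readable;
  tables age*domicile / nocol nopercent;
run;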
/* -- end code -- */
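PS. You asked for five house types and I only used three above. A rough
sketch of the same rantbl() idea with all five might look like the
following. The format name dom5f, the data set name domicile5, the seed and
every probability are made up for illustration, so replace them with
whatever skew you actually want. Each row of probabilities has to sum to 1.
/*------- equally untested sketch ---------------*/
proc format library=work;
  value dom5f 1 = 'flat'
              2 = 'semi-detached'
              3 = 'detached'
              4 = 'bungalow'
              5 = 'chalet';
run;
data work.domicile5;
  set work.ages;
  /* -- assumed probabilities: the young lean to flats and semis,
        the older to detached houses and bungalows -- */
  if age < 35 then
    domicile = rantbl(2718, 0.40, 0.30, 0.10, 0.10, 0.10);
  else if age < 55 then
    domicile = rantbl(2718, 0.20, 0.30, 0.30, 0.10, 0.10);
  else
    domicile = rantbl(2718, 0.10, 0.15, 0.35, 0.30, 0.10);
  format domicile dom5f.;
run;
/* -- end sketch -- */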
On Tue, 15 Oct 2002 13:22:50 +0100, Richard Simhon
<Richard.Simhon@CORNHILL.CO.UK> wrote:
>SAS-lers
>
>Not sure if this got sent out yesterday if not please read on.
>
>I am looking to create a data set with randomly selected criteria.
>
>For example, if I am looking to create a data set with 30,000 observations
>containing a variable of people's ages ranging from 17 to 90 and using those
>ages I would like to create a second variable containing a value 1 to 5
>(representing the type of house that they might live in) i.e. 1 represents
>flats, 2 represents a semi-detached house, 3 represents a detached house, 4
>represents a bungalow, and 5 represents a chalet.
>
>In addition I would like to skew the data set from the point of view that a
>greater proportion of young people will live in a flat or semi detached
>property rather than a detached house or a bungalow. Equally, the older
>generation are more likely to live in a detached house or a bungalow.
>
>Would anyone have any code or pointers that might generate this.
>
>TIA
>
>Richard
>
>
>Richard Simhon
>Business Analyst
>Allianz Cornhill
>Tel 01483 55 2628