Hello Richard,

Welcome ... I find I need to simulate data on almost a monthly basis to test this or that. I would set it up in a few steps:

* First I would set up one or more formats. One format you might want is the housing type (flat, etc.). Another might be your age categories, just for readability.

* Next I would simulate the ages, that is, simulate your population by age using whatever distribution you like or need. We have ranuni, ranexp, ran-whatever... so it all depends on what your population "should" look like and what variance from that you want to look at. It is probably a good idea to get this done as a separate step before we jump on the domicile.

* Then, using categories on the age, I would use rantbl(), since you wanted the age category to act as a little factor on the probability of domicile. rantbl(seed, p1, ..., pn) returns an integer from 1 to n with the probabilities you give it. I would start by coding your age categories as if-statements, since it can get quite messy, and then move on to some funny algorithms if you feel the need.

HTH
Magnus

/*------- some very untested code ---------------*/

proc format library=work;
  value domicile
    1 = 'flat'
    2 = 'house'
    3 = 'bungalow';
  /* -- '-<' excludes the upper bound, so 35 and 55 land in the same groups as in the data step below -- */
  value agecat
    low -< 35 = 'say we are young'
    35 -< 55  = 'two BMWs and a house with 1 dog and 2 kids'
    55 - high = 'we can get back at our parents for spoiling the kids';
run;

data work.ages;
  do subject = 1 to 30000;
    /* -- RANUNI() -- all ages from 17 to 90 have the same probability of occurring -- */
    /* -- 74 rather than 73, otherwise floor() can never reach 90 -- */
    age = floor(17 + 74*ranuni(date()));
    output;
  end;
run;

data work.domicile;
  set work.ages;
  /* -- say we are young until we are 35 -- */
  if (17 <= age < 35) then domicile = rantbl(date(), 0.5, 0.25, 0.25);
  /* -- two BMWs and a house with 1 dog and 2 kids -- */
  else if (35 <= age < 55) then domicile = rantbl(date(), 0.25, 0.5, 0.25);
  /* -- the golden years... we can get back at our parents for spoiling the kids -- */
  else if (55 <= age <= 90) then domicile = rantbl(date(), 0.25, 0.25, 0.5);
run;

data work.readable;
  set work.domicile;
  format domicile domicile. age agecat.;
run;

/* -- end code -- */

On Tue, 15 Oct 2002 13:22:50 +0100, Richard Simhon wrote:

>SAS-lers
>
>Not sure if this got sent out yesterday if not please read on.
>
>I am looking to create a data set with randomly selected criteria.
>
>For example, if I am looking to create a data set with 30,000 observations
>containing a variable of peoples ages ranging from 17 to 90 and using those
>ages I would like to create a second variable containing a value 1 to 5
>(representing the type of house that they might live in) i.e 1 represents
>flats, 2 represents a semi-detached house, 3 represents a detached house, 4
>represents a bungalow, and 5 represents a chalet.
>
>In addition I would like to skew the data set from the point of view that a
>greater proportion of young people will live in a flat or semi detached
>property rather than a detached house or a bungalow. Equally, the older
>generation are more likely to live in a detached house or a bungalow.
>
>Would anyone have any code or pointers that might generate this.
>
>TIA
>
>Richard
>
>
>Richard Simhon
>Business Analyst
>Allianz Cornhill
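
The code in the reply uses only three housing types, while the original question asks for five (flat, semi-detached, detached, bungalow, chalet). A minimal sketch of how the same rantbl() idea might stretch to five codes follows. The names htype, agegrp and housing5, the fixed seed 20021015 and every probability are assumptions made up for illustration, not real housing figures; each probability row sums to 1. The proc freq at the end is just one way to eyeball whether the skew came out as intended.

```sas
/* Sketch only: names, seed and probabilities are illustrative assumptions */
proc format library=work;
  value htype
    1 = 'flat'
    2 = 'semi-detached house'
    3 = 'detached house'
    4 = 'bungalow'
    5 = 'chalet';
  value agegrp
    low -< 35 = 'under 35'
    35 -< 55  = '35 to 54'
    55 - high = '55 and over';
run;

data work.housing5;
  do subject = 1 to 30000;
    /* uniform integer ages 17..90; a fixed seed makes the run repeatable */
    age = floor(17 + 74*ranuni(20021015));
    /* younger people: mostly flats and semi-detached houses */
    if age < 35 then htype = rantbl(20021015, 0.40, 0.35, 0.10, 0.10, 0.05);
    /* middle group: spread across semi-detached and detached houses */
    else if age < 55 then htype = rantbl(20021015, 0.20, 0.30, 0.30, 0.15, 0.05);
    /* older people: mostly detached houses and bungalows */
    else htype = rantbl(20021015, 0.10, 0.15, 0.35, 0.30, 0.10);
    output;
  end;
  format htype htype. age agegrp.;
run;

/* row percentages per age group should sit close to the rantbl() probabilities */
proc freq data=work.housing5;
  tables age*htype / nocol nopercent;
run;
```

With 30,000 observations the row percentages should land within a point or so of the table you feed to rantbl(); if they do not, check the range boundaries first.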

