Date: Fri, 19 Dec 2008 15:52:34 -0500
Reply-To: Peter Flom <email@example.com>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject: Re: Aggregate and individual-level data analysis
> Yet another question from me.
> Suppose I have 2 datasets:
> 1. Dataset1--Contains individual-level data on who bought food at a
> concession stand during a football game
> 2. Dataset2--Contains aggregate data on prevalence of obesity (BMI >= 30)
> and overweight (BMI >= 25) by zip code
> Dataset1 looks roughly like this:
> name    zip code
> John    78530
> Jane    78531
> Angie   78532
> Eileen  78530
> Tim     78530
> Bob     78532
> et cetera. Let's say there are 3000 people in this dataset, all of whom
> bought food.
> Dataset2 looks roughly like this:
> zip code  overwt  obese
> 78530     500     200
> 78531     600     500
> 78532     100     50
> Supposing I wanted to know if there was a correlation between buying food
> and obesity, what procedure can I run? Notice that overweight and obese
> are BMI classifications, so really, Dataset2 represents data from 1950
> respondents. I get a feeling that I need to disaggregate Dataset2, because
> I was kicking myself in the head when I tried to turn Dataset1 into an
> aggregate dataset, and finding it impossible (and stupid) to try to plot the
> As always, I welcome and appreciate any suggestions on how to tackle this.
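I don't think there is much you can do here .... because everyone in dataset 1 bought food. With no non-buyers to compare against, "bought food" is a constant, and a constant can't be correlated with anything.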
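A minimal sketch of why that is, in Python rather than SAS. The helper name pearson and the individual-level obese flags are made up for illustration (neither dataset actually has obesity at the person level). The point is only that when every value of one variable is the same, its variance is zero and the correlation coefficient comes out as 0/0:

```python
from math import sqrt

def pearson(x, y):
    """Return Pearson's r, or None when either variable has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)  # sum of squares for x
    syy = sum((b - mean_y) ** 2 for b in y)  # sum of squares for y
    if sxx == 0 or syy == 0:
        return None  # r would be 0/0, i.e. undefined
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Everyone in dataset 1 bought food, so the indicator is 1 for every row.
bought = [1, 1, 1, 1, 1, 1]
# HYPOTHETICAL person-level obesity flags, invented for this sketch only.
obese = [0, 1, 0, 0, 1, 0]

result = pearson(bought, obese)  # None: 'bought' never varies
```

The same thing would happen with all 3000 rows; more data does not help when the outcome never varies.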
But maybe ....
Are the people in data set 1 a random sample from each ZIP code?
Do you know the total population of each ZIP code?
Do you know what percentage of people buy food? (I'm thinking it's close to everyone, one way or another .... How does someone never buy food?)
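If the answers to those questions turn out to be yes, one possible route is to compare ZIP codes rather than people. The sketch below (Python, not SAS) uses only the rows shown in the question; the population figures and the function name zip_level_rates are hypothetical placeholders, and it assumes the buyers are a random sample and that the same population denominator is appropriate for both counts. Any relationship found this way would describe ZIP codes, not individuals.

```python
from collections import Counter

# Rows of Dataset1 as shown in the question: (name, ZIP code).
dataset1 = [("John", "78530"), ("Jane", "78531"), ("Angie", "78532"),
            ("Eileen", "78530"), ("Tim", "78530"), ("Bob", "78532")]

# Dataset2 as shown in the question: ZIP -> (overweight count, obese count).
dataset2 = {"78530": (500, 200), "78531": (600, 500), "78532": (100, 50)}

# HYPOTHETICAL ZIP populations; the real figures are not in the post.
population = {"78530": 10000, "78531": 12000, "78532": 4000}

def zip_level_rates(buyers, bmi_counts, pop):
    """Collapse buyers to one row per ZIP and put both measures on a rate scale."""
    buyer_counts = Counter(zip_code for _, zip_code in buyers)
    rates = {}
    for zip_code, (overwt, obese) in bmi_counts.items():
        n_buyers = buyer_counts.get(zip_code, 0)
        rates[zip_code] = {
            "buyers": n_buyers,
            "buyer_rate": n_buyers / pop[zip_code],
            "overwt_rate": overwt / pop[zip_code],
            "obese_rate": obese / pop[zip_code],
        }
    return rates

rates = zip_level_rates(dataset1, dataset2, population)
for zip_code, row in sorted(rates.items()):
    print(zip_code, row)
```

With only three ZIP codes there is nothing to estimate, of course; this just shows the shape the data would need to take before any ZIP-level comparison makes sense.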
Peter L. Flom, PhD
www DOT peterflom DOT com