LISTSERV at the University of Georgia
Date:   Fri, 4 Mar 2011 12:21:48 -0500
Reply-To:   Suzanne McCoy <Suzanne.McCoy@CATALINAMARKETING.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Suzanne McCoy <Suzanne.McCoy@CATALINAMARKETING.COM>
Subject:   Re: std deviation question
Comments:   To: Peter Flom <peterflomconsulting@mindspring.com>
In-Reply-To:   <009301cbda8f$6e121b50$4a3651f0$@mindspring.com>
Content-Type:   text/plain; charset="us-ascii"

Thanks, Peter. That's exactly what I needed for the conversation with the business user.

-----Original Message-----
From: Peter Flom [mailto:peterflomconsulting@mindspring.com]
Sent: Friday, March 04, 2011 12:13 PM
To: Suzanne McCoy; SAS-L@LISTSERV.UGA.EDU
Subject: RE: std deviation question

Suzanne McCoy wrote

>>>The business user spec requests the value be calculated as std(median_nbr_trips). Do I question the spec or just calc the number and not worry about it? I know it is mathematically incorrect to average averages so was just curious if taking the standard deviation of an average was okay. This is only for estimation purposes across a group of similar shoppers.

Well, you could say "that's what they are paying for, that's what I'll give them," but it does seem an odd thing. I'm guessing nbr_trips is the number of trips to, say, a supermarket.

Now, suppose we have Joe, Bill and Sue. Over a period of 4 weeks, Joe takes 0, 0, 3 and 2 trips. Bill takes 5, 6, 6 and 7. Sue takes 1, 1, 1 and 1. This gives medians of 1, 6 and 1. Then you find the SD of those and .... what? How is this useful?

USUALLY, if you want to take a median rather than a mean, you want an interquartile range, or maybe a median absolute deviation, rather than an SD.

I *think* what the user might want is the overall median, and then the overall IQR.
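
To make the comparison concrete, here is a rough sketch in Python (not SAS, and not anything from the spec) using the Joe, Bill and Sue counts above. The variable names are made up for illustration. Joe's counts sort to 0, 0, 2, 3, so his median is 1. The quartiles use the default method of Python's statistics.quantiles, which will not necessarily match whatever percentile definition your SAS procedure uses.

```python
from statistics import median, stdev, quantiles

# Weekly trip counts from the example above (4 weeks per shopper).
trips = {
    "Joe": [0, 0, 3, 2],
    "Bill": [5, 6, 6, 7],
    "Sue": [1, 1, 1, 1],
}

# What the spec asks for: one median per shopper, then the SD of those medians.
per_shopper_median = {name: median(weeks) for name, weeks in trips.items()}
sd_of_medians = stdev(list(per_shopper_median.values()))

# The alternative: pool every shopper-week, then take the median and the IQR.
pooled = [count for weeks in trips.values() for count in weeks]
overall_median = median(pooled)
q1, _, q3 = quantiles(pooled, n=4)  # default "exclusive" method
overall_iqr = q3 - q1

print("per-shopper medians:", per_shopper_median)
print("SD of medians:", round(sd_of_medians, 3))
print("overall median:", overall_median)
print("overall IQR:", overall_iqr)
```

The SD of the three medians describes how much the shoppers' typical weeks differ from one another. The pooled median and IQR describe the spread of individual shopper-weeks. Those are answers to different questions, so it is worth asking the user which one they mean.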

Even worse, suppose each person has data for a different number of weeks ....

Also, it's not mathematically wrong to take the average of averages; it's just that the result is not what most people doing such a thing would want.
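
A small sketch of that point, again in Python with made-up data: suppose (hypothetically) Joe has 4 weeks of data, Bill has 2 and Sue has 1. The plain average of the three per-shopper means weights each shopper equally, while the pooled mean weights each shopper-week equally. Both are legitimate numbers; they just answer different questions, and weighting the per-shopper means by weeks observed recovers the pooled mean.

```python
from statistics import fmean

# Hypothetical data: each shopper observed for a different number of weeks.
trips = {
    "Joe": [0, 0, 3, 2],
    "Bill": [5, 6],
    "Sue": [1],
}

shopper_means = {name: fmean(weeks) for name, weeks in trips.items()}

# Average of averages: every shopper counts once.
mean_of_means = fmean(shopper_means.values())

# Pooled mean: every shopper-week counts once.
pooled_mean = fmean(count for weeks in trips.values() for count in weeks)

# Weighting each shopper's mean by weeks observed gives the pooled mean back.
total_weeks = sum(len(weeks) for weeks in trips.values())
weighted_mean = sum(
    shopper_means[name] * len(weeks) for name, weeks in trips.items()
) / total_weeks

print("mean of shopper means:", round(mean_of_means, 4))
print("pooled mean:", round(pooled_mean, 4))
print("weighted mean of shopper means:", round(weighted_mean, 4))
```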

Peter

