LISTSERV at the University of Georgia
Date:   Mon, 10 Nov 2008 21:34:35 -0500
Reply-To:   "Howard Schreier <hs AT dc-sug DOT org>" <schreier.junk.mail@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Howard Schreier <hs AT dc-sug DOT org>" <schreier.junk.mail@GMAIL.COM>
Subject:   Re: temporal cluster

On Mon, 10 Nov 2008 10:48:25 +0000, Tracy Clegg <tracy.clegg@UCD.IE> wrote:

>Thanks to Peter and Bruce for your thoughts on this.
>
>To shed a little more light on what we are trying to do: The measurements
>are hourly concentration of a gas detected on a farm. The only thing it
>would correlate with would be weather patterns which we don't have
>information on. As Peter suggested identifying a 'high cluster' could be
>done by identifying where a number of points occurred in a row above/below a
>given value (eg. 2 standard deviations above the mean). Something like this
>might work, but it would have to somehow allow for low values occurring
>within the cluster. Some periods of time show very high values interspersed
>with very low values. That's the whole problem. We'd like to know if a (eg)
>7 day period had significantly high overall values even though it contained
>low values. We thought of using a temporal sliding window that would move
>through time summing at given time-lengths. Then identify windows of high
>concentrations of the gas in total. The problem is how to do this with
>varying sizes of the window and does anyone know of any software that we can
>use to do this?

Do you have SAS/ETS? If so, look at PROC EXPAND. It's good for computing moving averages and the like.
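
For anyone without SAS/ETS, the sliding-window idea can also be sketched outside SAS. What follows is a minimal Python illustration of moving sums over several window lengths, not a rendering of PROC EXPAND. The function names `moving_sums` and `high_windows`, the sample readings, and the threshold rule are all made up for the example; a real analysis would need a defensible threshold.

```python
from itertools import accumulate

def moving_sums(values, window):
    """Return the sum of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        return []
    # Prefix sums let each window total be computed with one subtraction.
    prefix = [0] + list(accumulate(values))
    return [prefix[i + window] - prefix[i]
            for i in range(len(values) - window + 1)]

def high_windows(values, window, threshold):
    """Return (start_index, total) for windows whose total exceeds threshold."""
    return [(i, s) for i, s in enumerate(moving_sums(values, window))
            if s > threshold]

# Made-up hourly readings: a high spell with low values mixed in.
readings = [1, 2, 1, 9, 0, 8, 10, 1, 9, 2, 1, 1]

for window in (3, 6):
    # Hypothetical rule: flag windows totalling more than 1.5 times
    # what a window of average readings would total.
    threshold = 1.5 * window * sum(readings) / len(readings)
    print(window, high_windows(readings, window, threshold))
```

Because the window total is what gets compared, a window can be flagged even when it contains a few low readings, which is the behaviour Tracy asked for. Varying the window size is just a matter of looping over the lengths of interest (for hourly data, a 7 day window would be 168 observations).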

>
>Thanks again for your help
>
>Tracy
>
>-----Original Message-----
>From: Peter Flom [mailto:peterflomconsulting@mindspring.com]
>Sent: 06 November 2008 17:35
>To: Tracy Clegg; SAS-L@LISTSERV.UGA.EDU
>Subject: Re: temporal cluster
>
>Tracy Clegg <tracy.clegg@UCD.IE> wrote
>>
>>I have hourly measurements of a continuous variable (concentration of a
>>gas) measured over 5 years. Could anyone tell me how to identify
>>clusters where the continuous variable was unusually high over a certain
>>period of time?
>
>
>I think cluster is the wrong term here, and may put people in mind of
>cluster analysis, which isn't what you want.
>
>Some questions
>
>What do you mean by "unusually high"?
>Does the value change over time, other than randomly, and if so, how?
>How are the values distributed over time?
>Is there autocorrelation?
>
>Some ways that you *might* want to do this:
> Find the mean value over the 5 years. Define 'unusual' as more than 2 sd
>above that.
> Fit a loess curve, define 'unusual' as some distance above the predicted
>value
> Fit some other curve, define 'unusual' as some distance above that.
>
>etc.
>
>but you also want 'clusters'. If there is a lot of autocorrelation, then
>identifying one very high point will likely identify others that are almost
>as high.
>
>Or you could say that a 'high cluster' is XXXX points in a row that are XXXX
>above a curve based on XXX
>
>
>If the values, over time, fit some distribution well, then you could find
>some definition of outlier based on the number of points and the nature of
>the distribution.
>
>HTH or at least gives you some ideas
>
>Peter
>
>
>
>Peter L. Flom, PhD
>Statistical Consultant
>www DOT peterflom DOT com
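
Peter's "XXXX points in a row that are XXXX above" rule can be sketched in a few lines. This is only an illustration under simple assumptions: the cutoff is the overall mean plus a multiple of the sample standard deviation (his first option, not the loess variant), and the function name `high_runs` and its parameters `n_sd` and `min_len` are hypothetical.

```python
from itertools import groupby
from statistics import mean, stdev

def high_runs(values, n_sd=2.0, min_len=2):
    """Return (start_index, length) for each run of at least `min_len`
    consecutive values above mean + n_sd * sample standard deviation."""
    cutoff = mean(values) + n_sd * stdev(values)
    runs = []
    pos = 0
    # groupby splits the series into alternating high / not-high stretches.
    for is_high, group in groupby(values, key=lambda v: v > cutoff):
        length = len(list(group))
        if is_high and length >= min_len:
            runs.append((pos, length))
        pos += length
    return runs

# Made-up series: a flat baseline with one three-point spike.
series = [1] * 20 + [10, 10, 10] + [1] * 20
print(high_runs(series))
```

Note that a strict run rule like this breaks as soon as one low value appears inside a high spell, which is the limitation Tracy describes; a windowed total or average is more forgiving in that case.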

