LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 2002, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 23 Jul 2002 09:50:41 -0400
Reply-To:     Steve Albert <salbert@AOL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Steve Albert <salbert@AOL.COM>
Subject:      Re: How to sift data to find Time Gaps

All the responses so far have sought gaps between observations that exceed 60 seconds; however, I don't think that's what the original poster requested:

>I now need to create a process that will show where the time gaps are. > All data should be consecutive for the day. I need a routine that will > validate that there is at least one entry per minute per hour per day > per month. Then produce a report/graph that only shows me where the > data is missing.

So it seems we need to find minutes during which there is no observation, not observations that are more than 60 seconds after the previous one. (So if we have an observation at 12:00:01 and another at 12:01:59, we have observations in both the 12:00 and the 12:01 minutes, even though the time between them exceeded 60 seconds.)

One possible approach: 1. Truncate all times to the beginning of the minute involved. 2. Search for gaps using the LAG approach others have suggested. If the difference exceeds 60 seconds, we know that we have at least one minute with no observation; keeping the difference would let us infer the length of the gap. (Note number of missing minutes is difference in minutes minus one -- a gap of one is OK.) If need be, we could generate a dataset the included every minute that was part of the gap, from the end time of the gap together with the gap length; alternatively, using LAG we could add the last good time onto the record, so we'd generate a set of records with a "previous good minute" and "next good minute" for each gap.

Another possible approach: 1. Truncate all times to the beginning of the minute involved. 2. Generate a dataset consisting only of date times, one observation for each minute over the relevant time period. 3. Merge that dataset against the real dataset; anything that doesn't have a match will then be part of a gap, so keeping the non-match cases produces precisely the set of gap minutes that the poster seemed to originally request. Not as elegant, and probably requiring more CPU time, but perhaps a conceptually easier way to directly produce the required dataset with every missing minute included, for the report and graph requested.

Steve Albert Director of Biostatistics Spectrum Pharmaceutical Research Corp. San Antonio, TX SAlbert at SpectrumCRO dot com


Back to: Top of message | Previous page | Main SAS-L page