LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (December 2010, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 23 Dec 2010 14:47:02 -0500
Reply-To:     Jerry <i89rt5@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jerry <i89rt5@GMAIL.COM>
Subject:      Re: Collapsing date records with no gaps
Content-Type: text/plain; charset=ISO-8859-1

Gee, this weird thing is not going away. As shown in the log of EG 4.2, "want" data still have 9 records, which should not occur. Anyone has run into this problem before?

1 The SAS System 14:33 Thursday, December 23, 2010

1 ;*';*";*/;quit;run; 2 OPTIONS PAGENO=MIN; 3 %LET _CLIENTTASKLABEL='coverage_DOW-LookAhead'; 4 %LET _CLIENTPROJECTPATH='Z:\mmr_retroactivity\mmr_retro.egp'; 5 %LET _CLIENTPROJECTNAME='mmr_retro.egp'; 6 %LET _SASPROGRAMFILE=; 7 8 ODS _ALL_ CLOSE; 9 OPTIONS DEV=ACTXIMG; 10 GOPTIONS XPIXELS=0 YPIXELS=0; 11 FILENAME EGHTML TEMP; 12 ODS HTML(ID=EGHTML) FILE=EGHTML ENCODING='utf-8' STYLE=Analysis 12 ! STYLESHEET=(URL="file:///C:/Program%20Files/SAS/SharedFiles/BIClientStyles/4.2/ 12 ! Analysis.css") NOGTITLE NOGFOOTNOTE GPATH=&sasworklocation; NOTE: Writing HTML(EGHTML) Body file: EGHTML 13 FILENAME EGSR TEMP; 14 ODS tagsets.sasreport12(ID=EGSR) FILE=EGSR STYLE=Statistical 14 ! STYLESHEET=(URL="file:///C:/Program%20Files/SAS/SharedFiles/BIClientStyles/4.2/ 14 ! Statistical.css") NOGTITLE NOGFOOTNOTE GPATH=&sasworklocation ENCODING=UTF8 14 ! options(rolap="on"); NOTE: Writing TAGSETS.SASREPORT12(EGSR) Body file: EGSR 15 16 GOPTIONS ACCESSIBLE; 17 data want; 18 do until(last.id); 19 set have(rename=(start_date=_start stop_date=_stop)); 20 by id; 21 22 if first.id then do; 23 start_date=_start; 24 stop_date=_stop; 25 end; 26 27 else if _start<=stop_date+1 then do; 28 if _stop > stop_date then stop_date = _stop; 29 end; 30 31 else do; 32 output; 33 start_date=_start; 34 stop_date=_stop; 35 end; 36 37 if last.id then output; 38 39 end; /*end do until*/ 40 41 format start_date stop_date yymmdd10.; 42 drop _: ; 43 44 run;

NOTE: There were 9 observations read from the data set WORK.HAVE. NOTE: The data set WORK.WANT has 9 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds

45 46 47 GOPTIONS NOACCESSIBLE; 48 %LET _CLIENTTASKLABEL=; 49 %LET _CLIENTPROJECTPATH=; 50 %LET _CLIENTPROJECTNAME=; 51 %LET _SASPROGRAMFILE=; 52 53 ;*';*";*/;quit;run; 54 ODS _ALL_ CLOSE; 55 56 57 QUIT; RUN; 58

On Thu, 23 Dec 2010 13:58:08 -0500, Jerry <i89rt5@GMAIL.COM> wrote:

>By "marked", do you mean the code which is selected and thus highlighted in >dark blue? > >I did highlighted Dan's code in my EG project and choose "Run Selection" >from the menu to run it. I think this is the way to selectively run his >code, right? > >On Thu, 23 Dec 2010 12:50:34 -0500, Gerhard Hellriegel ><gerhard.hellriegel@T-ONLINE.DE> wrote: > >>do you have something marked in a editor in EG? In that case only the >>marked code might be executed. >> >>Gerhard >> >> >> >> >>On Thu, 23 Dec 2010 12:40:25 -0500, Jerry <i89rt5@GMAIL.COM> wrote: >> >>>Art, >>> >>>Thanks for testing it. >>> >>>I don't think I was hallucinating. But when I run Dan's code below on >>>Enterprise Miner 4.2, sometimes the output data "want" is exactly same as >>>the input data "have", as if the code was not run, But sometime this weird >>>thing did not occur when I ran the code. I could not consistently >>replicate >>>this weird thing. >>> >>>I have never seen this happen, but I only started using EG a bit over one >>>month ago. In any event, this makes a me a bit wary about EG. >>> >>>Maybe I need to start a new project file or a new program file for testing >>>the code? >>> >>>/******Dan's code**********/ >>>data have; >>> input id start_date :mmddyy10. stop_date :mmddyy10.; >>> format start_date stop_date yymmdd10.; >>>cards; >>>1 1/1/10 1/30/10 >>>1 1/30/10 2/15/10 >>>1 2/13/10 3/15/10 >>>1 3/16/10 3/30/10 >>>2 5/1/10 5/30/10 >>>2 7/1/10 7/15/10 >>>3 7/15/10 7/30/10 >>>4 5/1/10 7/30/10 >>>4 7/1/10 7/15/10 >>>; >>>run; >>>proc print; >>>run; >>> >>>data want1; >>> do until(last.id); >>> set have(rename=(start_date=_start stop_date=_stop)); >>> by id; >>> >>> if first.id then do; >>> start_date=_start; >>> stop_date=_stop; >>> end; >>> else if _start<=stop_date+1 then do; >>> if _stop > stop_date then stop_date = _stop; >>> end; >>> else do; >>> output; >>> start_date=_start; >>> stop_date=_stop; >>> end; >>> if last.id then output; >>> end; /*end do until*/ >>> >>>format start_date stop_date yymmdd10.; >>>drop _: ; >>>run; >>> >>> >>> >>> >>>On Thu, 23 Dec 2010 11:30:24 -0500, Arthur Tabachneck <art297@ROGERS.COM> >>wrote: >>> >>>>Jerry, >>>> >>>>I haven't looked back to see what problem the code is trying to solve, >>but >>>>Dan's code takes 9 records and outputs 5 records. That is, file want is >>>>definitely different than file have. >>>> >>>>Art >>>>-------- >>>>On Thu, 23 Dec 2010 10:38:07 -0500, Jerry <i89rt5@GMAIL.COM> wrote: >>>> >>>>>Dan, >>>>> >>>>>I'm not sure what I am missing. But running your code of "corrected >>>>version" >>>>>(see bottom part). The input data set "have" is the exactly same >>as "want". >>>>>I'm lost.....Can you/anyone help? >>>>> >>>>>Sorry for digging out this old thread , but it's very valuable! Sterling >>>>>Paramore pointed me to this thread. Sterling, thank you! >>>>> >>>>>Thanks. >>>>> >>>>>On Fri, 29 Oct 2010 13:33:57 -0700, Nordlund, Dan (DSHS/RDA) >>>>><NordlDJ@DSHS.WA.GOV> wrote: >>>>> >>>>>>> -----Original Message----- >>>>>>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of >>>>>>> Søren Lassen >>>>>>> Sent: Friday, October 29, 2010 1:31 AM >>>>>>> To: SAS-L@LISTSERV.UGA.EDU >>>>>>> Subject: Re: Collapsing date records with no gaps >>>>>>> >>>>>>> David, >>>>>>> A large number of solutions have been proposed. Most of them >>>>>>> are variations on a DOW loop. I did not study the other solutions >>>>>>> in detail, but it seems to me that all the DOW solutions proposed >>>>>>> will output nothing if there is only one record for an ID. >>>>>>> >>>>>>> My suggested solution is this: >>>>>>> >>>>>>> proc sort data=have; >>>>>>> by id start_date stop_date; >>>>>>> /* Get rid of zero/negative length intervals */ >>>>>>> where stop_date ge start_date or stop_date is null; >>>>>>> run; >>>>>>> >>>>>>> data want (keep=id start_date stop_date); >>>>>>> set have(rename=(start_date=_start stop_date=_stop)); >>>>>>> by id; >>>>>>> format start_date stop_date mmddyy8.; >>>>>>> retain start_date stop_date; >>>>>>> select; >>>>>>> when(first.id) do; >>>>>>> start_date=_start; >>>>>>> stop_date=_stop; >>>>>>> end; >>>>>>> when(missing(stop_date)); /* never stops, "infinite" stop date */ >>>>>>> when(stop_date ge _stop); /* nested intervals */ >>>>>>> when (stop_date ge _start-1) do; /* adjacent or overlappping */ >>>>>>> if missing(_stop) then >>>>>>> stop_date=.; >>>>>>> else >>>>>>> stop_date = max(_stop,stop_date); >>>>>>> end; >>>>>>> otherwise do; >>>>>>> output; >>>>>>> start_date=_start; >>>>>>> stop_date=_stop; >>>>>>> end; >>>>>>> end; >>>>>>> if last.id then output; >>>>>>> drop _:; >>>>>>> run; >>>>>>> >>>>>>> As you can see, I left out the DOW loop and put in a retain statement >>>>>>> instead. This is mostly a matter of personal taste, likewise the >>>>>>> use of SELECT instead of nested IF..THEN..ELSE statements. >>>>>>> >>>>>>> I added an extra twist: if stop_date is missing, it probably means >>>>>>> that the record is still valid (supposedly into eternity). >>>>>>> >>>>>>> Regards, >>>>>>> Søren >>>>>>> >>>>>> >>>>>>Sören, >>>>>> >>>>>>You are correct that my original solution did not handle the case of a >>>>>single record for an id. I offer a corrected version which also handles >>>>>overlapping and nested spans in addition to the case where a span begins >>>>the >>>>>following day. It could easily be modified to handle the case where a >>stop >>>>>_date was missing, but at first blush I would probably handle it >>>>differently >>>>>than you if there were a following record for the id with a non-missing >>>>>stop_date. But that brings me back to a comment that I made in >>response to >>>>>Sterling. And that is, what to do with spans that overlap in various >>ways >>>>>(or have missing values) depends on local business rules. It is >>>>unfortunate >>>>>that the OP hasn't responded with clarification or told us how the >>problem >>>>>was solved (if I missed that response, I apologize). But anyway, here >>is >>>>my >>>>>modification. >>>>>> >>>>>>data have; >>>>>> input id start_date :mmddyy10. stop_date :mmddyy10.; >>>>>> format start_date stop_date yymmdd10.; >>>>>>cards; >>>>>>1 1/1/10 1/30/10 >>>>>>1 1/30/10 2/15/10 >>>>>>1 2/13/10 3/15/10 >>>>>>1 3/16/10 3/30/10 >>>>>>2 5/1/10 5/30/10 >>>>>>2 7/1/10 7/15/10 >>>>>>3 7/15/10 7/30/10 >>>>>>4 5/1/10 7/30/10 >>>>>>4 7/1/10 7/15/10 >>>>>>; >>>>>>run; >>>>>>proc print; >>>>>>run; >>>>>> >>>>>>data want; >>>>>> do until(last.id); >>>>>> set have(rename=(start_date=_start stop_date=_stop)); >>>>>> by id; >>>>>> >>>>>> if first.id then do; >>>>>> start_date=_start; >>>>>> stop_date=_stop; >>>>>> end; >>>>>> else if _start<=stop_date+1 then do; >>>>>> if _stop > stop_date then stop_date = _stop; >>>>>> end; >>>>>> else do; >>>>>> output; >>>>>> start_date=_start; >>>>>> stop_date=_stop; >>>>>> end; >>>>>> if last.id then output; >>>>>> end; /*end do until*/ >>>>>> >>>>>>format start_date stop_date yymmdd10.; >>>>>>drop _: ; >>>>>>run; >>>>>> >>>>>>proc print; >>>>>>run; >>>>>> >>>>>>Hope this is helpful, >>>>>> >>>>>>Dan >>>>>> >>>>>>Daniel J. Nordlund >>>>>>Washington State Department of Social and Health Services >>>>>>Planning, Performance, and Accountability >>>>>>Research and Data Analysis Division >>>>>>Olympia, WA 98504-5204


Back to: Top of message | Previous page | Main SAS-L page