From: Richard A. DeVenezia on
On Mar 9, 5:18 pm, need_sas_h...(a)YAHOO.COM (Tom Smith) wrote:
> I have a following dataset with four variable (SID1A, TRTEND1O, CYCLE,
> SMDDOS1D)
>
> SID1A           TRTEND1O    CYCLE     SMDDOS1D
> ----------      ---------   -----    ----------
> 0012_00002      20DEC2006      1      08NOV2006
> 0012_00002      20DEC2006      2      29NOV2006
> 0012_00002      20DEC2006      3      20DEC2006
> 0194_00001      20DEC2006      1      03MAY2007
> 0194_00001      21OCT2007      2      28MAY2007
> 0194_00001      21OCT2007      3      18JUN2007
> 0194_00001      21OCT2007      4      09JUL2007
> 0194_00001      21OCT2007      5      01AUG2007
> 0194_00001      21OCT2007      6      22AUG2007
> 0194_00001      21OCT2007      7      10SEP2007
>
> Need to have a final result dataset where start_dt and stop_dt is going to
> come from SMDDOS1D ( BY 'CYCLE' variable), stop_dt is going to be always
> 1 day less from the next observation's SMDDOS1D variable's value, and the
> last stop_dt of each SID1A should come from the value of TRT_END variable..
> The Result dataset should be as below:
>
> SID1A           TRT_END    CYCLE     SMDDOS1D       start_dt    stop_dt
> ----------      ---------   -----    ----------      --------    -------
> 0012_00002      20DEC2006      1      08NOV2006     08NOV2006    28NOV2006
> 0012_00002      20DEC2006      2      29NOV2006     29NOV2006    19DEC2006
> 0012_00002      20DEC2006      3      20DEC2006     20DEC2006    20DEC2006
> 0194_00001      20DEC2006      1      03MAY2007     03MAY2007    27MAY2007
> 0194_00001      21OCT2007      2      28MAY2007     28MAY2007    17JUN2007
> 0194_00001      21OCT2007      3      18JUN2007     18JUN2007    08JUL2007
> 0194_00001      21OCT2007      4      09JUL2007     09JUL2007    31JUL2007
> 0194_00001      21OCT2007      5      01AUG2007     01AUG2007    21AUG2007
> 0194_00001      21OCT2007      6      22AUG2007     22AUG2007    09SEP2007
> 0194_00001      21OCT2007      7      10SEP2007     10SEP2007    20OCT2007
>
> Thanks you so much, without your help I wouldn't be solve this whole
> problem.

Tom:

Here are two ways. There are many other ways as well.

data have;
length SID1A $10;
length TRTEND1O CYCLE SMDDOS1D 4 ;
informat TRTEND1O SMDDOS1D date9.;
format TRTEND1O SMDDOS1D date9.;
input SID1A TRTEND1O CYCLE SMDDOS1D ;
datalines;
0012_00002 20DEC2006 1 08NOV2006
0012_00002 20DEC2006 2 29NOV2006
0012_00002 20DEC2006 3 20DEC2006
0194_00001 20DEC2006 1 03MAY2007
0194_00001 21OCT2007 2 28MAY2007
0194_00001 21OCT2007 3 18JUN2007
0194_00001 21OCT2007 4 09JUL2007
0194_00001 21OCT2007 5 01AUG2007
0194_00001 21OCT2007 6 22AUG2007
0194_00001 21OCT2007 7 10SEP2007
run;


* works, but not robust enough to handle disordered data or cycle
gaps;

data final;

merge
have
have (firstobs=2 keep=SID1A smddos1d rename=smddos1d=endperiod);
;

endperiod=max(smddos1d,endperiod-1);

by SID1A;
run;


* works, more robust when disordered data or cycle gaps;

proc sql;
create table final2 as
select
one.*,
max (one.smddos1d, two.smddos1d-1) as endperiod format=date9.
from
have as one
left join
have as two
on
one.SID1A = two.SID1A
& one.cycle = two.cycle-1
order by
one.SID1A
, one.cycle
;

--
Richard A. DeVenezia
http://www.devenezia.com