From: Sierra Information Services on 11 Nov 2009 14:01 Could someone please help me with the following LOCF problem? I am sure it is something simple, but it eludes me at present. I have a time series (see sample below) where some values of the series are missing. What I want to do is replace the missing values in the series with the last NON-MISSING Value Here is what I HAVE: 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968 with 37 and the missing values for 4/12 through 4/15 with 38 The data set from which I am working has about 30 years worth of daily data in it. Any suggestions? Thanks! Andrew Karp Sierra Information Services www.SierraInformation.com
From: Joe Matise on 11 Nov 2009 14:14 Maybe I'm reading it wrong, but if I'm not missing something, this should work: data have; input @1 dateval MMDDYY10. @17 value BEST12.; datalines; 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 ;;;; run; data want; set have; retain _val; if not missing(value) then _val=value; else value=_val; drop _val; run; -Joe On Wed, Nov 11, 2009 at 1:01 PM, Sierra Information Services < sfbay0001(a)aol.com> wrote: > Could someone please help me with the following LOCF problem? I am > sure it is something simple, but it eludes me at present. > > I have a time series (see sample below) where some values of the > series are missing. > > What I want to do is replace the missing values in the series with the > last NON-MISSING Value > > Here is what I HAVE: > 4/1/1968 37.7 > 4/2/1968 37.3 > 4/3/1968 37.6 > 4/4/1968 36.95 > 4/5/1968 37 > 4/6/1968 . > 4/7/1968 . > 4/8/1968 37.05 > 4/9/1968 37.5 > 4/10/1968 37.7 > 4/11/1968 38 > 4/12/1968 . > 4/13/1968 . > 4/14/1968 . > 4/15/1968 . > 4/16/1968 38 > > What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968 > with 37 and the missing values > for 4/12 through 4/15 with 38 > > The data set from which I am working has about 30 years worth of daily > data in it. > > Any suggestions? > > Thanks! > > Andrew Karp > Sierra Information Services > www.SierraInformation.com >
From: Robin R High on 11 Nov 2009 15:17 Though DATA step code works well with these data, it could also be helpful to explore the capabilities of some ETS procedures (if available) to do the same task: proc timeseries data=infile2 out=outfile2a; id date interval=day setmissing=prev; var yy; run; proc print data=outfile2a; run; proc expand data=infile2 out=outfile2b from=day; id date; convert yy / method=step; run; * The STEP method fits a discontinuous piecewise constant curve. For point-in-time input data, the resulting step function is equal to the most recent input value. ; PROC PRINT data=outfile2b NOObs; FORMAT date mmddyy10. ; RUN; The "extra nice" feature these two PROCs have (that could be a bit difficult with a DATA step) is they work the same way if the missing data are not there at all, i.e., if the input file looks like this: data infile2; format date mmddyy10.; informat date mmddyy10.; input date yy; datalines; 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/16/1968 38 ; Robin High UNMC From: Sierra Information Services <sfbay0001(a)AOL.COM> To: SAS-L(a)LISTSERV.UGA.EDU Date: 11/11/2009 01:17 PM Subject: LOCF (Last Obs Carried Forward) Help, Please Sent by: "SAS(r) Discussion" <SAS-L(a)LISTSERV.UGA.EDU> Could someone please help me with the following LOCF problem? I am sure it is something simple, but it eludes me at present. I have a time series (see sample below) where some values of the series are missing. What I want to do is replace the missing values in the series with the last NON-MISSING Value Here is what I HAVE: 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968 with 37 and the missing values for 4/12 through 4/15 with 38 The data set from which I am working has about 30 years worth of daily data in it. Any suggestions? Thanks! Andrew Karp Sierra Information Services www.SierraInformation.com
From: Sigurd Hermansen on 11 Nov 2009 17:11 Andrew: D�j� vu all over again ... I had a similar request come to me a few weeks ago. While I share Robin's preference for using the SAS/ETS methods, some situations don't allow for that. Some do not have SAS/ETS licensed. In my case, the requestor specified carrying over the prior value rather than an estimate of the next value in the series. Even though I assumed that a Data step solution would work best for a physical sequencing problem, my stubborn attempts to try a SQL solution first progressively reduced complexity of the problem to something simple (that is, in SAS SQL with the REMERGE method): data test; input date: mmddyy10. ob best. ; datalines; 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 ; run; /* Reflexive query solution. SWH */ proc sql; create table result as select distinct r1.date,coalesce(r1.ob,r2.ob) as ob from test as r1 left join (select * from test where NOT ob IS NULL) as r2 on r1.date>=r2.date group by r1.date having (r1.date - r2.date) = min(r1.date - r2.date) order by date ; quit; The first constraint limits selections of r1.ob or r2.ob to tuples r1,r2 that have r1.date >= r2.date; the second constraint limits selections of r2.ob to the tuple with r2.date closest to r1.date; r2.ob does not have missing values in the yield of the subquery; the GROUP BY clause looks only at groups of r1,r2 tuples with the same r1.date; the COALESCE() function selects a value from r2.ob instead of a missing value in r1.ob, else r1.ob. Call me Spock if you must: I can't resist the logical solution. S -----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Sierra Information Services Sent: Wednesday, November 11, 2009 2:02 PM To: SAS-L(a)LISTSERV.UGA.EDU Subject: LOCF (Last Obs Carried Forward) Help, Please Could someone please help me with the following LOCF problem? I am sure it is something simple, but it eludes me at present. I have a time series (see sample below) where some values of the series are missing. What I want to do is replace the missing values in the series with the last NON-MISSING Value Here is what I HAVE: 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968 with 37 and the missing values for 4/12 through 4/15 with 38 The data set from which I am working has about 30 years worth of daily data in it. Any suggestions? Thanks! Andrew Karp Sierra Information Services www.SierraInformation.com
From: Fernández Rodríguez, on 12 Nov 2009 05:24 Other fast methos using automatic _iorc_ varname, But I am not really sure if it could bring conflict: * automatic _iorc_ variable, in this way we avoid to remember to drop the flag variable at the end; data need; set have; if value NE (.) then _iorc_=value; else value=_iorc_; run; Daniel Fernandez Barcelona. -----Mensaje original----- De: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] En nombre de Sigurd Hermansen Enviado el: mi�rcoles, 11 de noviembre de 2009 23:12 Para: SAS-L(a)LISTSERV.UGA.EDU Asunto: Re: LOCF (Last Obs Carried Forward) Help, Please Andrew: D�j� vu all over again ... I had a similar request come to me a few weeks ago. While I share Robin's preference for using the SAS/ETS methods, some situations don't allow for that. Some do not have SAS/ETS licensed. In my case, the requestor specified carrying over the prior value rather than an estimate of the next value in the series. Even though I assumed that a Data step solution would work best for a physical sequencing problem, my stubborn attempts to try a SQL solution first progressively reduced complexity of the problem to something simple (that is, in SAS SQL with the REMERGE method): data test; input date: mmddyy10. ob best. ; datalines; 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 ; run; /* Reflexive query solution. SWH */ proc sql; create table result as select distinct r1.date,coalesce(r1.ob,r2.ob) as ob from test as r1 left join (select * from test where NOT ob IS NULL) as r2 on r1.date>=r2.date group by r1.date having (r1.date - r2.date) = min(r1.date - r2.date) order by date ; quit; The first constraint limits selections of r1.ob or r2.ob to tuples r1,r2 that have r1.date >= r2.date; the second constraint limits selections of r2.ob to the tuple with r2.date closest to r1.date; r2.ob does not have missing values in the yield of the subquery; the GROUP BY clause looks only at groups of r1,r2 tuples with the same r1.date; the COALESCE() function selects a value from r2.ob instead of a missing value in r1.ob, else r1.ob. Call me Spock if you must: I can't resist the logical solution. S -----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Sierra Information Services Sent: Wednesday, November 11, 2009 2:02 PM To: SAS-L(a)LISTSERV.UGA.EDU Subject: LOCF (Last Obs Carried Forward) Help, Please Could someone please help me with the following LOCF problem? I am sure it is something simple, but it eludes me at present. I have a time series (see sample below) where some values of the series are missing. What I want to do is replace the missing values in the series with the last NON-MISSING Value Here is what I HAVE: 4/1/1968 37.7 4/2/1968 37.3 4/3/1968 37.6 4/4/1968 36.95 4/5/1968 37 4/6/1968 . 4/7/1968 . 4/8/1968 37.05 4/9/1968 37.5 4/10/1968 37.7 4/11/1968 38 4/12/1968 . 4/13/1968 . 4/14/1968 . 4/15/1968 . 4/16/1968 38 What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968 with 37 and the missing values for 4/12 through 4/15 with 38 The data set from which I am working has about 30 years worth of daily data in it. Any suggestions? Thanks! Andrew Karp Sierra Information Services www.SierraInformation.com
|
Pages: 1 Prev: PDF Reports from Stored Processes Next: %Squeeze Change 9.1.3 to 9.2 |