From: Sierra Information Services on
Could someone please help me with the following LOCF problem? I am
sure it is something simple, but it eludes me at present.

I have a time series (see sample below) where some values of the
series are missing.

What I want to do is replace the missing values in the series with the
last NON-MISSING Value

Here is what I HAVE:
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38

What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968
with 37 and the missing values
for 4/12 through 4/15 with 38

The data set from which I am working has about 30 years worth of daily
data in it.

Any suggestions?

Thanks!

Andrew Karp
Sierra Information Services
www.SierraInformation.com
From: Joe Matise on
Maybe I'm reading it wrong, but if I'm not missing something, this should
work:

data have;
input
@1 dateval MMDDYY10.
@17 value BEST12.;
datalines;
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38
;;;;
run;

data want;
set have;
retain _val;
if not missing(value) then _val=value;
else value=_val;
drop _val;
run;

-Joe

On Wed, Nov 11, 2009 at 1:01 PM, Sierra Information Services <
sfbay0001(a)aol.com> wrote:

> Could someone please help me with the following LOCF problem? I am
> sure it is something simple, but it eludes me at present.
>
> I have a time series (see sample below) where some values of the
> series are missing.
>
> What I want to do is replace the missing values in the series with the
> last NON-MISSING Value
>
> Here is what I HAVE:
> 4/1/1968 37.7
> 4/2/1968 37.3
> 4/3/1968 37.6
> 4/4/1968 36.95
> 4/5/1968 37
> 4/6/1968 .
> 4/7/1968 .
> 4/8/1968 37.05
> 4/9/1968 37.5
> 4/10/1968 37.7
> 4/11/1968 38
> 4/12/1968 .
> 4/13/1968 .
> 4/14/1968 .
> 4/15/1968 .
> 4/16/1968 38
>
> What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968
> with 37 and the missing values
> for 4/12 through 4/15 with 38
>
> The data set from which I am working has about 30 years worth of daily
> data in it.
>
> Any suggestions?
>
> Thanks!
>
> Andrew Karp
> Sierra Information Services
> www.SierraInformation.com
>
From: Robin R High on
Though DATA step code works well with these data, it could also be helpful
to explore the capabilities of some ETS procedures (if available) to do
the same task:

proc timeseries data=infile2 out=outfile2a;
id date interval=day setmissing=prev;
var yy;
run;

proc print data=outfile2a; run;


proc expand data=infile2 out=outfile2b from=day;
id date;
convert yy / method=step;
run;

* The STEP method fits a discontinuous piecewise constant curve.
For point-in-time input data, the resulting step function is equal to
the most recent input value. ;

PROC PRINT data=outfile2b NOObs; FORMAT date mmddyy10. ; RUN;


The "extra nice" feature these two PROCs have (that could be a bit
difficult with a DATA step) is they work the same way if the missing data
are not there at all, i.e., if the input file looks like this:


data infile2;
format date mmddyy10.;
informat date mmddyy10.;
input date yy;
datalines;
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/16/1968 38
;


Robin High
UNMC





From:
Sierra Information Services <sfbay0001(a)AOL.COM>
To:
SAS-L(a)LISTSERV.UGA.EDU
Date:
11/11/2009 01:17 PM
Subject:
LOCF (Last Obs Carried Forward) Help, Please
Sent by:
"SAS(r) Discussion" <SAS-L(a)LISTSERV.UGA.EDU>



Could someone please help me with the following LOCF problem? I am
sure it is something simple, but it eludes me at present.

I have a time series (see sample below) where some values of the
series are missing.

What I want to do is replace the missing values in the series with the
last NON-MISSING Value

Here is what I HAVE:
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38

What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968
with 37 and the missing values
for 4/12 through 4/15 with 38

The data set from which I am working has about 30 years worth of daily
data in it.

Any suggestions?

Thanks!

Andrew Karp
Sierra Information Services
www.SierraInformation.com
From: Sigurd Hermansen on
Andrew:
D�j� vu all over again ... I had a similar request come to me a few weeks ago.

While I share Robin's preference for using the SAS/ETS methods, some situations don't allow for that. Some do not have SAS/ETS licensed. In my case, the requestor specified carrying over the prior value rather than an estimate of the next value in the series.

Even though I assumed that a Data step solution would work best for a physical sequencing problem, my stubborn attempts to try a SQL solution first progressively reduced complexity of the problem to something simple (that is, in SAS SQL with the REMERGE method):

data test;
input date: mmddyy10. ob best.
;
datalines;
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38
;
run;

/* Reflexive query solution. SWH */
proc sql;
create table result as
select distinct r1.date,coalesce(r1.ob,r2.ob) as ob
from test as r1 left join (select * from test where NOT ob IS NULL) as r2
on r1.date>=r2.date
group by r1.date
having (r1.date - r2.date) = min(r1.date - r2.date)
order by date
;
quit;

The first constraint limits selections of r1.ob or r2.ob to tuples r1,r2 that have r1.date >= r2.date;
the second constraint limits selections of r2.ob to the tuple with r2.date closest to r1.date;
r2.ob does not have missing values in the yield of the subquery;
the GROUP BY clause looks only at groups of r1,r2 tuples with the same r1.date;
the COALESCE() function selects a value from r2.ob instead of a missing value in r1.ob, else r1.ob.

Call me Spock if you must: I can't resist the logical solution.
S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Sierra Information Services
Sent: Wednesday, November 11, 2009 2:02 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: LOCF (Last Obs Carried Forward) Help, Please

Could someone please help me with the following LOCF problem? I am
sure it is something simple, but it eludes me at present.

I have a time series (see sample below) where some values of the
series are missing.

What I want to do is replace the missing values in the series with the
last NON-MISSING Value

Here is what I HAVE:
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38

What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968
with 37 and the missing values
for 4/12 through 4/15 with 38

The data set from which I am working has about 30 years worth of daily
data in it.

Any suggestions?

Thanks!

Andrew Karp
Sierra Information Services
www.SierraInformation.com
From: Fernández Rodríguez, on
Other fast methos using automatic _iorc_ varname,
But I am not really sure if it could bring conflict:


* automatic _iorc_ variable, in this way
we avoid to remember to drop the flag variable at the end;

data need;
set have;
if value NE (.) then _iorc_=value;
else value=_iorc_;
run;

Daniel Fernandez
Barcelona.






-----Mensaje original-----
De: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] En nombre de Sigurd Hermansen
Enviado el: mi�rcoles, 11 de noviembre de 2009 23:12
Para: SAS-L(a)LISTSERV.UGA.EDU
Asunto: Re: LOCF (Last Obs Carried Forward) Help, Please

Andrew:
D�j� vu all over again ... I had a similar request come to me a few weeks ago.

While I share Robin's preference for using the SAS/ETS methods, some situations don't allow for that. Some do not have SAS/ETS licensed. In my case, the requestor specified carrying over the prior value rather than an estimate of the next value in the series.

Even though I assumed that a Data step solution would work best for a physical sequencing problem, my stubborn attempts to try a SQL solution first progressively reduced complexity of the problem to something simple (that is, in SAS SQL with the REMERGE method):

data test;
input date: mmddyy10. ob best.
;
datalines;
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38
;
run;

/* Reflexive query solution. SWH */
proc sql;
create table result as
select distinct r1.date,coalesce(r1.ob,r2.ob) as ob
from test as r1 left join (select * from test where NOT ob IS NULL) as r2
on r1.date>=r2.date
group by r1.date
having (r1.date - r2.date) = min(r1.date - r2.date)
order by date
;
quit;

The first constraint limits selections of r1.ob or r2.ob to tuples r1,r2 that have r1.date >= r2.date;
the second constraint limits selections of r2.ob to the tuple with r2.date closest to r1.date;
r2.ob does not have missing values in the yield of the subquery;
the GROUP BY clause looks only at groups of r1,r2 tuples with the same r1.date;
the COALESCE() function selects a value from r2.ob instead of a missing value in r1.ob, else r1.ob.

Call me Spock if you must: I can't resist the logical solution.
S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Sierra Information Services
Sent: Wednesday, November 11, 2009 2:02 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: LOCF (Last Obs Carried Forward) Help, Please

Could someone please help me with the following LOCF problem? I am
sure it is something simple, but it eludes me at present.

I have a time series (see sample below) where some values of the
series are missing.

What I want to do is replace the missing values in the series with the
last NON-MISSING Value

Here is what I HAVE:
4/1/1968 37.7
4/2/1968 37.3
4/3/1968 37.6
4/4/1968 36.95
4/5/1968 37
4/6/1968 .
4/7/1968 .
4/8/1968 37.05
4/9/1968 37.5
4/10/1968 37.7
4/11/1968 38
4/12/1968 .
4/13/1968 .
4/14/1968 .
4/15/1968 .
4/16/1968 38

What I WANT is to replace the missing values for 4/6/1968 and 4/7/1968
with 37 and the missing values
for 4/12 through 4/15 with 38

The data set from which I am working has about 30 years worth of daily
data in it.

Any suggestions?

Thanks!

Andrew Karp
Sierra Information Services
www.SierraInformation.com