How to align two time series fast? [Matlab]


From: Steve Amphlett on 22 Jul 2010 09:52

"Steve Amphlett" <Firstname.Lastname(a)Where-I-Work.com> wrote in message <i29gpk$20e$1(a)fred.mathworks.com>...
> Luna Moon <lunamoonmoon(a)gmail.com> wrote in message <be8c1193-d5a2-445f-8c88-02248117568b(a)e5g2000yqn.googlegroups.com>...
> >
> > There must be a "filter" way of doing "backfill" fast?
>
> A MEX would be trivial. Here is a more traditional ML approach. It doesn't do the ends properly though, this is left as an exercise for the OP.
>
> x=[1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];
>
> y=zeros(size(x));
> idx=isnan(x);
>
> idx1=find(diff(idx)>0);
> idx2=find(diff(idx)<0);
>
> y(idx1+1)=x(idx1);
> y(idx2+1)=-x(idx1);
> y=cumsum(y);
> y(~idx)=x(~idx);
>
> [x z]

Last line is, of course:

[x y]

From: Fred Marshall on 22 Jul 2010 14:45

Luna Moon wrote:
> Post now also to comp.dsp to see if experts can help us.
>
> The bottleneck is the "backfill" part.
>
> There must be a "filter" way of doing "backfill" fast?
>
> Thanks a lot!
>
> On Jul 21, 3:53 pm, Luna Moon <lunamoonm...(a)gmail.com> wrote:
>> How to align two time series fast?
>>
>> Hi all,
>>
>> I have two time series, both are in the following format:
>>
>> Date Data
>> 1/1/2010 5.3
>> 1/2/2010 4.4
>> ...
>>
>> Let's label the first time series MyDates1, MyData1 and the second
>> time series MyDates2, MyData2,
>>
>> where MyDates1 and MyData1 have the same number of rows and MyDates2
>> and MyData2 have the same number of rows,
>>
>> and where MyDates1 and MyDates2 are in fact in datenum format.
>>
>> The sets MyDates1 and MyDates2 are very different.
>>
>> How can I align time series two so that it lines up with time series
>> one?
>>
>> That's to say, we want to modify MyDates2 and MyData2 to make them in
>> line with MyDates1 and MyData1.
>>
>> Actions:
>>
>> (1) If a date is in MyDates1 but not in MyDates2, then insert that
>> date into MyDates2 and put an "NaN" into corresponding location in
>> MyData2.
>>
>> (2) If a date is in MyDates2 but not in MyDates1, then delete that
>> date from MyDates2 and delete the data in the corresponding location
>> in MyData2.
>>
>> (3) The 2nd time series now may look like the following:
>>
>> Date Data
>> 1/1/2010 NaN
>> 1/2/2010 NaN
>> 1/5/2010 2.3
>> 1/6/2010 NaN
>> 1/7/2010 NaN
>> 1/8/2010 3.1
>> ...
>>
>> Then we need to backfill the holes ("NaN"s) in this 2nd time series.
>>
>> For example, the above data, after backfill, become:
>>
>> Date Data
>> 1/1/2010 NaN
>> 1/2/2010 NaN
>> 1/5/2010 2.3
>> 1/6/2010 2.3
>> 1/7/2010 2.3
>> 1/8/2010 3.1
>> ...
>>
>> Note that the first few missing values ("NaN"s) cannot be
>> backfilled...
>>
>> The output is the modified MyData2, because the modified MyDates2
>> should be exactly the same as MyDates1, which is used as the reference.
>>
>> MyData2 should now have the same number of rows as MyDates1, MyData1,
>> and MyDates2 (modified).
>>
>> I currently do this using Matlab Financial toolbox,
>>
>> but it's very slow,
>>
>> Any thought how I can do it fast?
>>
>> Thanks a lot!
>

Your rules are obscured by the fact that you don't provide 3 or 4 time
series.
- Input 1
- Input 2
- Output 3
- Output 4

I have no idea what you are trying to do with Action 2...
I'd suggest a better naming convention similar to the set above:

Series 1
Series 2
Series 3 from Action 1 on Series 1 and Series 2
Series 4 from Action 2 on [what?] Series 2 or Series 3?

I don't see the data that ends up at 1/6 and 1/7 so the thing is just a
bit fuzzy yet.

>> Date Data
>> 1/1/2010 NaN
>> 1/2/2010 NaN
>> 1/5/2010 2.3
>> 1/6/2010 2.3
>> 1/7/2010 2.3
>> 1/8/2010 3.1

Fred
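
One possible reading of actions (1) and (2) in the quoted question, sketched with ismember. The sample dates and values below are made up for illustration, and the name Aligned2 is hypothetical; only MyDates1, MyDates2 and MyData2 come from the original post. The sketch assumes the datenum values match exactly (whole days), which may not hold if the series carry time-of-day fractions.

```matlab
% Hypothetical sample data standing in for the poster's series.
MyDates1 = datenum(2010, 1, [1 2 5 6 7 8])';   % reference dates
MyDates2 = datenum(2010, 1, [3 5 8])';         % dates of the second series
MyData2  = [9.9; 2.3; 3.1];                    % values of the second series

% For every reference date, find where (if anywhere) it sits in MyDates2.
[tf, loc] = ismember(MyDates1, MyDates2);

% Action (1): dates missing from MyDates2 become NaN.
% Action (2): dates only in MyDates2 (here 1/3/2010) are simply never copied.
Aligned2 = NaN(size(MyDates1));
Aligned2(tf) = MyData2(loc(tf));
```

With this sample input, Aligned2 has one row per entry of MyDates1 and holds NaN wherever the second series had no observation, which is the intermediate state shown in step (3) of the question before any filling.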

From: Luna Moon on 22 Jul 2010 16:02

On Jul 22, 9:22 am, dpb <n...(a)non.net> wrote:
> dpb wrote:
> > Luna Moon wrote:
> >> On Jul 21, 6:36 pm, dpb <n...(a)non.net> wrote:
> >>> Luna Moon wrote:
>
> >>> ...
>
> >>>> The key part is how to do back-fill fast!
> >>> How big of a series is this and what's the typical sparseness?
>
> >>> Wouldn't seem it should be particularly time consuming but an example
> >>> might help visualize.
>
> >>> --
>
> >> Very large, millions of rows... and I have lots of such time series...
>
> >> So let's just focus on how to write such backfill function better...
>
> > How about revising the algorithm?
>
> > Or, perhaps mex the function you have.
>
> > I'll consider it overnight; nothing pops to mind automagic...
>
> OK, what about
>
> idx = ~isnan(d) & isnan([d(2:end) -1])
>
> is a logical array of those locations with a value followed by NaN
>
> The next location in the array is to be replaced with the value at this
> location.
>
> Iterate this until idx==0
>
> Still iterative but perhaps different than you're currently doing...

No, I am already doing it exactly this way...

but I need to get rid of the for loop.
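
A loop-free fill can be sketched with cumsum used as an index rather than as an accumulator. This is not code from any poster in the thread; it is a minimal sketch that assumes x is a column vector and that "backfill" means carrying the most recent value forward, as in the question's example. The variable names valid, k, vals and has are illustrative.

```matlab
% Test vector from the thread, including the two leading NaNs.
x = [NaN; NaN; 1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];

valid = ~isnan(x);        % true where a real value exists
k     = cumsum(valid);    % per row: count of real values seen so far
vals  = x(valid);         % the real values, in their original order

y      = NaN(size(x));    % rows before the first real value stay NaN
has    = k > 0;           % rows with at least one real value at or before them
y(has) = vals(k(has));    % k(i) picks the most recent real value for row i
```

Because k only increases at real values, every row inside a NaN run shares the index of the value just before the run, so leading NaNs are left alone and trailing NaNs are filled without special cases. Newer MATLAB releases also provide fillmissing(x,'previous'), which was not available when this thread was written.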

From: Luna Moon on 22 Jul 2010 16:07

On Jul 22, 9:24 am, "Steve Amphlett" <Firstname.Lastn...(a)Where-I-Work.com> wrote:
> Luna Moon <lunamoonm...(a)gmail.com> wrote in message <be8c1193-d5a2-445f-8c88-022481175...(a)e5g2000yqn.googlegroups.com>...
>
> > There must be a "filter" way of doing "backfill" fast?
>
> A MEX would be trivial. Here is a more traditional ML approach. It doesn't do the ends properly though, this is left as an exercise for the OP.
>
> x=[1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];
>
> y=zeros(size(x));
> idx=isnan(x);
>
> idx1=find(diff(idx)>0);
> idx2=find(diff(idx)<0);
>
> y(idx1+1)=x(idx1);
> y(idx2+1)=-x(idx1);
> y=cumsum(y);
> y(~idx)=x(~idx);
>
> [x z]

Very cool, but it doesn't work on:

x=[NaN; NaN; 1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];

Please note that I added two initial "NaN"s to test your program,
and it broke.

The initial two "NaN"s need to remain there because there is no way to
backfill them...

Thank you!

From: Doug Weathers on 22 Jul 2010 16:33

Luna Moon <lunamoonmoon(a)gmail.com> wrote in message <485e09eb-631f-4835-b91c-07549fea431d(a)u26g2000yqu.googlegroups.com>...

> the initial two "NaN"s need to remain there because there is no way to
> backfill these initial "NaN"s...

Could you lop them off, run the given routine, then put them back?

>
> Thank you!
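
The "lop them off, run the routine, put them back" idea might look like the sketch below. It reuses the diff/cumsum routine quoted earlier in the thread; the trimming logic, the variables first, t, d and n, and the handling of a trailing NaN run are additions made here for illustration. The quoted routine fails on leading NaNs because the NaN-to-value transition at the start adds an entry to idx2 with no partner in idx1, so the two index lists differ in length.

```matlab
% Test vector from the thread, with the two leading NaNs.
x = [NaN; NaN; 1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];

first = find(~isnan(x), 1);       % position of the first real value
y = x;                            % leading NaNs are kept as they are
if ~isempty(first)
    t    = x(first:end);          % trimmed series now starts with a value
    idx  = isnan(t);
    idx1 = find(diff(idx) > 0);   % last real value before each NaN run
    idx2 = find(diff(idx) < 0);   % last NaN of each run that ends
    d = zeros(size(t));
    d(idx1+1) = t(idx1);          % step up to the held value at run start
    n = numel(idx2);              % a trailing NaN run never ends, so n may be short
    d(idx2+1) = -t(idx1(1:n));    % step back down after each finished run
    d = cumsum(d);                % held value spreads across each run
    d(~idx) = t(~idx);            % real values overwrite the scaffold
    y(first:end) = d;             % put the leading NaNs back in front
end
```

After trimming, every NaN run is preceded by a real value, so idx1 has either the same number of entries as idx2 or one more when the series ends in NaNs; pairing only the first n entries covers both cases.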
