From: Luna Moon on 21 Jul 2010 20:46

On Jul 21, 6:36 pm, dpb <n...(a)non.net> wrote:
> Luna Moon wrote:
>
> ...
>
> > The key part is how to do back-fill fast!
>
> How big of a series is this and what's the typical sparseness?
>
> Wouldn't seem it should be particularly time consuming but an example
> might help visualize.
>
> --

Very large, millions of rows... and I have lots of such time series...

So let's just focus on how to write such backfill function better...

From: dpb on 22 Jul 2010 00:24

Luna Moon wrote:
> On Jul 21, 6:36 pm, dpb <n...(a)non.net> wrote:
>> Luna Moon wrote:
>>
>> ...
>>
>>> The key part is how to do back-fill fast!
>> How big of a series is this and what's the typical sparseness?
>>
>> Wouldn't seem it should be particularly time consuming but an example
>> might help visualize.
>>
>> --
>
> Very large, millions of rows... and I have lots of such time series...
>
> So let's just focus on how to write such backfill function better...

How about revising the algorithm?

Or, perhaps mex the function you have.

I'll consider it overnight; nothing pops to mind automagic...

--

From: Luna Moon on 22 Jul 2010 08:34

Post now also to comp.dsp to see if experts can help us.

The bottleneck is the "backfill" part.

There must be a "filter" way of doing "backfill" fast?

Thanks a lot!

On Jul 21, 3:53 pm, Luna Moon <lunamoonm...(a)gmail.com> wrote:
> How to align two time series fast?
>
> Hi all,
>
> I have two time series, both are in the following format:
>
> Date       Data
> 1/1/2010   5.3
> 1/2/2010   4.4
> ...
>
> Let's label the first time series: MyDates1, MyData1 and the second
> time series: MyDates2, MyData2,
>
> where MyDates1 and MyData1 have the same number of rows and MyDates2
> and MyData2 have the same number of rows,
>
> and where MyDates1 and MyDates2 are in fact in datenum format.
>
> The sets MyDates1 and MyDates2 are very different.
>
> How can I align time series two to be in line with time series
> one?
>
> That's to say, we want to modify MyDates2 and MyData2 to make them in
> line with MyDates1 and MyData1.
>
> Actions:
>
> (1) If a date is in MyDates1 but not in MyDates2, then insert that
> date into MyDates2 and put a "NaN" into the corresponding location in
> MyData2.
>
> (2) If a date is in MyDates2 but not in MyDates1, then delete that
> date from MyDates2 and delete the data in the corresponding location
> in MyData2.
>
> (3) The 2nd time series now may look like the following:
>
> Date       Data
> 1/1/2010   NaN
> 1/2/2010   NaN
> 1/5/2010   2.3
> 1/6/2010   NaN
> 1/7/2010   NaN
> 1/8/2010   3.1
> ...
>
> Then we need to backfill the holes ("NaN"s) in this 2nd time series.
>
> For example, the above data, after backfill, become:
>
> Date       Data
> 1/1/2010   NaN
> 1/2/2010   NaN
> 1/5/2010   2.3
> 1/6/2010   2.3
> 1/7/2010   2.3
> 1/8/2010   3.1
> ...
>
> Note that the first few missing values ("NaN"s) cannot be
> backfilled...
>
> The output is the modified MyData2, because the modified MyDates2
> should be exactly the same as MyDates1, which is used as reference.
>
> MyData2 should now have the same number of rows as MyDates1, MyData1,
> and MyDates2 (modified).
>
> I currently do this using Matlab Financial toolbox,
>
> but it's very slow,
>
> Any thoughts how I can do it fast?
>
> Thanks a lot!

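Steps (1) to (3) in the quoted post can be sketched without a loop as follows. This is only a minimal illustration, not the Financial Toolbox method mentioned in the post: the sample values are made up, the names `aligned`, `valid`, `last`, `vals` and `filled` are hypothetical, and it assumes the dates within each series are unique. The "backfill" here carries the last valid value forward, as in the post's example.

```matlab
% Hypothetical sample data; variable names follow the original post.
MyDates1 = datenum(2010, 1, [1 2 5 6 7 8])';
MyData1  = [5.3 4.4 4.0 4.1 4.2 4.3]';
MyDates2 = datenum(2010, 1, [3 5 8 9])';
MyData2  = [9.9 2.3 3.1 7.7]';

% Steps (1) and (2): keep only the dates of series 1, NaN where series 2
% has no observation. Dates found only in MyDates2 are dropped.
[tf, loc] = ismember(MyDates1, MyDates2);
aligned = NaN(size(MyDates1));
aligned(tf) = MyData2(loc(tf));

% Step (3): carry the last valid value forward into each NaN run.
valid = ~isnan(aligned);
last  = cumsum(valid);      % number of valid values seen so far (0 = none yet)
vals  = aligned(valid);     % the valid values, in order
filled = aligned;
filled(last > 0) = vals(last(last > 0));   % leading NaNs stay NaN
```

With this sample input, `filled` reproduces the table from the post: two leading NaNs, then 2.3 carried through 1/7/2010, then 3.1.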
From: Steve Amphlett on 22 Jul 2010 09:24

Luna Moon <lunamoonmoon(a)gmail.com> wrote in message <be8c1193-d5a2-445f-8c88-02248117568b(a)e5g2000yqn.googlegroups.com>...
>
> There must be a "filter" way of doing "backfill" fast?

A MEX would be trivial. Here is a more traditional ML approach. It doesn't do the ends properly though, this is left as an exercise for the OP.

x=[1;1;2;1;2;3;NaN;NaN;3;2;NaN;NaN;NaN;4;1;5;2;NaN;3];
y=zeros(size(x));
idx=isnan(x);
idx1=find(diff(idx)>0);   % last valid sample before each NaN run
idx2=find(diff(idx)<0);   % last NaN of each run
y(idx1+1)=x(idx1);        % step up by the fill value at the start of a run
y(idx2+1)=-x(idx1);       % step back down just after the run
y=cumsum(y);
y(~idx)=x(~idx);          % restore the original valid samples
[x y]

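One possible way to treat the ends in the cumsum approach above, offered as a sketch rather than the poster's solution: drop the unmatched run at the start, close an unmatched run at the end, and reset the leading entries to NaN. It assumes `x` is a column vector with at least one non-NaN value and no Inf entries; the test vector is made up.

```matlab
% Hypothetical test vector with NaNs at both ends.
x = [NaN; 1; 2; NaN; NaN; 3; NaN];

idx  = isnan(x);
d    = diff(idx);
idx1 = find(d > 0);                 % last valid sample before each NaN run
idx2 = find(d < 0);                 % last NaN of each run

if idx(1)
    idx2(1) = [];                   % leading run has no value to fill from
end
if idx(end)
    idx2(end+1) = numel(x);         % trailing run ends at the last sample
end

y = zeros(numel(x) + 1, 1);         % one spare slot for the final step down
y(idx1 + 1) = x(idx1);              % step up at the start of each run
y(idx2 + 1) = -x(idx1);             % step down just after each run
y = cumsum(y(1:end-1));
y(~idx) = x(~idx);                  % restore the original valid samples
y(1:find(~idx, 1) - 1) = NaN;       % leading NaNs cannot be filled
```

Because each step up is cancelled by the matching step down before the next run begins, the running sum returns to exactly zero between runs, so the filled values are exact copies.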
From: dpb on 22 Jul 2010 09:22

dpb wrote:
> Luna Moon wrote:
>> On Jul 21, 6:36 pm, dpb <n...(a)non.net> wrote:
>>> Luna Moon wrote:
>>>
>>> ...
>>>
>>>> The key part is how to do back-fill fast!
>>> How big of a series is this and what's the typical sparseness?
>>>
>>> Wouldn't seem it should be particularly time consuming but an example
>>> might help visualize.
>>>
>>> --
>>
>> Very large, millions of rows... and I have lots of such time series...
>>
>> So let's just focus on how to write such backfill function better...
>
> How about revising the algorithm?
>
> Or, perhaps mex the function you have.
>
> I'll consider it overnight; nothing pops to mind automagic...

OK, what about

idx = ~isnan(d) & isnan([d(2:end) -1])

is a logical array of those locations with a value followed by NaN.

The next location in the array is to be replaced with the value at this location.

Iterate this until idx is all zeros.

Still iterative but perhaps different than you're currently doing...

--

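The iteration described above might look like the following sketch. It assumes `d` is a row vector (as the `[d(2:end) -1]` concatenation implies), and the sample values are made up. Each pass pushes every value one step into the NaN run that follows it, so the number of passes equals the length of the longest NaN run.

```matlab
% Hypothetical row vector with a leading NaN and two NaN runs.
d = [NaN 1 NaN NaN 4 NaN 2];

% Locations holding a value that is followed by a NaN. The -1 sentinel
% keeps the last element from ever being flagged.
idx = ~isnan(d) & isnan([d(2:end) -1]);
while any(idx)
    d(find(idx) + 1) = d(idx);      % copy each value into the next slot
    idx = ~isnan(d) & isnan([d(2:end) -1]);
end
```

Leading NaNs are never preceded by a value, so they are left untouched, which matches the behaviour requested in the original question.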