From: Luna Moon on 22 Jul 2010 08:34

Post now also to comp.dsp to see if experts can help us.

The bottleneck is the "backfill" part.

There must be a "filter" way of doing "backfill" fast?

Thanks a lot!

On Jul 21, 3:53 pm, Luna Moon <lunamoonm...(a)gmail.com> wrote:
> How to align two time series fast?
>
> Hi all,
>
> I have two time series, both in the following format:
>
> Date      Data
> 1/1/2010  5.3
> 1/2/2010  4.4
> ...
>
> Let's label the first time series MyDates1, MyData1 and the second
> time series MyDates2, MyData2,
>
> where MyDates1 and MyData1 have the same number of rows and MyDates2
> and MyData2 have the same number of rows,
>
> and where MyDates1 and MyDates2 are in fact in datenum format.
>
> The sets MyDates1 and MyDates2 are very different.
>
> How can I align the second time series with the first?
>
> That's to say, we want to modify MyDates2 and MyData2 to make them
> line up with MyDates1 and MyData1.
>
> Actions:
>
> (1) If a date is in MyDates1 but not in MyDates2, then insert that
> date into MyDates2 and put a "NaN" into the corresponding location in
> MyData2.
>
> (2) If a date is in MyDates2 but not in MyDates1, then delete that
> date from MyDates2 and delete the data in the corresponding location
> in MyData2.
>
> (3) The 2nd time series now may look like the following:
>
> Date      Data
> 1/1/2010  NaN
> 1/2/2010  NaN
> 1/5/2010  2.3
> 1/6/2010  NaN
> 1/7/2010  NaN
> 1/8/2010  3.1
> ...
>
> Then we need to backfill the holes ("NaN"s) in this 2nd time series.
>
> For example, the above data, after backfill, become:
>
> Date      Data
> 1/1/2010  NaN
> 1/2/2010  NaN
> 1/5/2010  2.3
> 1/6/2010  2.3
> 1/7/2010  2.3
> 1/8/2010  3.1
> ...
>
> Note that the first few missing values ("NaN"s) cannot be
> backfilled...
>
> The output is the modified MyData2, because the modified MyDates2
> should be exactly the same as MyDates1, which is used as the reference.
>
> MyData2 should now have the same number of rows as MyDates1, MyData1,
> and MyDates2 (modified).
>
> I currently do this using the Matlab Financial toolbox,
>
> but it's very slow.
>
> Any thoughts on how I can do it fast?
>
> Thanks a lot!
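
Actions (1) and (2) in the post above can be sketched without the Financial toolbox by looking up each reference date in the second series. The following is only a minimal sketch under assumptions that are not in the original post: the sample values are made up for illustration, the dates in MyDates2 are unique, and all vectors are columns. The name `aligned` is hypothetical.

```matlab
% Hypothetical sample data; dates are datenum values as in the post.
MyDates1 = datenum(2010, 1, [1; 2; 5; 6; 7; 8]);   % reference dates
MyDates2 = datenum(2010, 1, [3; 5; 8]);            % 1/3 is not in MyDates1
MyData2  = [9.9; 2.3; 3.1];

% For each reference date, find its position in MyDates2 (0 if absent).
[tf, loc] = ismember(MyDates1, MyDates2);

% Action (1): dates missing from series 2 stay NaN.
% Action (2): dates only in series 2 (here 1/3) are never copied.
aligned = NaN(size(MyDates1));
aligned(tf) = MyData2(loc(tf));
```

With the sample data this leaves values only at 1/5 and 1/8, matching the table in step (3); the fill step is a separate operation.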
From: Fred Marshall on 22 Jul 2010 14:45

Luna Moon wrote:
> Post now also to comp.dsp to see if experts can help us.
>
> The bottleneck is the "backfill" part.
>
> There must be a "filter" way of doing "backfill" fast?
>
> Thanks a lot!

Your rules are obscured by the fact that you don't provide 3 or 4 time series:

- Input 1
- Input 2
- Output 3
- Output 4

I have no idea what you are trying to do with Action 2...

I'd suggest a better naming convention similar to the set above:

Series 1
Series 2
Series 3 from Action 1 on Series 1 and Series 2
Series 4 from Action 2 on [what?] Series 2 or Series 3?

I don't see the data that ends up at 1/6 and 1/7 so the thing is just a bit fuzzy yet.

>> Date      Data
>> 1/1/2010  NaN
>> 1/2/2010  NaN
>> 1/5/2010  2.3
>> 1/6/2010  2.3
>> 1/7/2010  2.3
>> 1/8/2010  3.1

Fred
From: Luna Moon on 22 Jul 2010 16:02
On Jul 22, 9:22 am, dpb <n...(a)non.net> wrote:
> dpb wrote:
> > Luna Moon wrote:
> >> On Jul 21, 6:36 pm, dpb <n...(a)non.net> wrote:
> >>> Luna Moon wrote:
> >>> ...
> >>>> The key part is how to do back-fill fast!
> >>> How big of a series is this and what's the typical sparseness?
> >>>
> >>> Wouldn't seem it should be particularly time consuming but an example
> >>> might help visualize.
> >>>
> >>> --
> >>
> >> Very large, millions of rows... and I have lots of such time series...
> >>
> >> So let's just focus on how to write such backfill function better...
> >
> > How about revising the algorithm?
> >
> > Or, perhaps mex the function you have.
> >
> > I'll consider it overnight; nothing pops to mind automagic...
>
> OK, what about
>
> idx = ~isnan(d) & isnan([d(2:end) -1])
>
> is logical array of those locations w/ a value followed by NaN
>
> The next location in the array is to be replaced with the value at this
> location.
>
> Iterate this until idx==0
>
> Still iterative but perhaps different than you're currently doing...
>
> --

No, I am doing this exactly the same way... but I need to get rid of the for loop.
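
One possible loop-free alternative to the iteration discussed above is a cumulative-sum index trick: count the valid samples seen so far and use that count to look up the most recent one. This is a sketch under assumptions, not a method proposed by anyone in the thread: `d` is taken to be a column vector, and the names `valid`, `vals` and `filled` are hypothetical. Note that what the thread calls "backfill" carries the last known value forward.

```matlab
d = [NaN; NaN; 2.3; NaN; NaN; 3.1];   % aligned series from the example

valid = ~isnan(d);                    % true where a real sample exists
idx   = cumsum(valid);                % number of valid samples seen so far
vals  = d(valid);                     % the valid samples, in order

% idx == 0 means no valid sample has been seen yet: those stay NaN.
filled = NaN(size(d));
filled(idx > 0) = vals(idx(idx > 0));
```

The whole fill is a few vectorized passes over the data, so the cost grows linearly with the number of rows regardless of how long the NaN runs are. Newer MATLAB releases also provide `fillmissing(d, 'previous')`, which may be worth comparing if it is available.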