From: Luna Moon on 21 Jul 2010 15:53

How to align two time series fast?

Hi all,

I have two time series, both in the following format:

Date        Data
1/1/2010    5.3
1/2/2010    4.4
....

Let's label the first time series MyDates1, MyData1 and the second time series MyDates2, MyData2. MyDates1 and MyData1 have the same number of rows, and MyDates2 and MyData2 have the same number of rows. MyDates1 and MyDates2 are in fact in datenum format.

The sets MyDates1 and MyDates2 are very different.

How can I align the second time series with the first one? That is to say, we want to modify MyDates2 and MyData2 to bring them in line with MyDates1 and MyData1.

Actions:

(1) If a date is in MyDates1 but not in MyDates2, then insert that date into MyDates2 and put a NaN into the corresponding location in MyData2.

(2) If a date is in MyDates2 but not in MyDates1, then delete that date from MyDates2 and delete the data in the corresponding location in MyData2.

(3) The second time series may now look like the following:

Date        Data
1/1/2010    NaN
1/2/2010    NaN
1/5/2010    2.3
1/6/2010    NaN
1/7/2010    NaN
1/8/2010    3.1
....

Then we need to backfill the holes (NaNs) in this second time series. For example, the above data, after backfill, become:

Date        Data
1/1/2010    NaN
1/2/2010    NaN
1/5/2010    2.3
1/6/2010    2.3
1/7/2010    2.3
1/8/2010    3.1
....

Note that the first few missing values (NaNs) cannot be backfilled, because there is no earlier value to copy.

The output is the modified MyData2, because the modified MyDates2 should be exactly the same as MyDates1, which is used as the reference. MyData2 should now have the same number of rows as MyDates1, MyData1, and the modified MyDates2.

I currently do this using the Matlab Financial Toolbox, but it's very slow. Any thoughts on how I can do it fast?

Thanks a lot!
From: dpb on 21 Jul 2010 16:09

Luna Moon wrote:
....
> Lets label the first time series: MyDates1, MyData1 and the second
> time series: MyDates2, MyData2,
....
> and where MyDates1 and MyDates2 are in fact in datenum format.
....
> That's to say, we want to modify MyDates2 and MyData2 to make them in
> line with MyDates1 and MyData1.
>
> Actions:
>
> (1) If a date is in MyDates1 but not in MyDates2, then insert that
> date into MyDates2 and put an "NaN" into corresponding location in
> MyData2.
>
> (2) If a date is in MyDates2 but not in MyDates1, then delete that
> date from MyDates2 and delete the data in the corresponding location
> in MyData2.

So, the final time vector is simply T1, and the data are the values of D2 at those dates of T2 that also exist in T1.

    % T1 = MyDates1, T2 = MyDates2, D2 = MyData2
    T3 = T1;                        % new time vector
    D3 = nan(size(T1));             % a vector of NaNs
    [tf, loc] = ismember(T1, T2);   % which T1 dates occur in T2, and where
    D3(tf) = D2(loc(tf));           % the matching values

OTOMH, it seems like you would then have to iterate through the gaps that isnan() will leave and fill them in. There is probably something clever in that regard, but it doesn't strike me just now...
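To make the alignment step concrete, here is a minimal sketch on the six reference dates from the original post. The contents of the second series are not given in the thread, so the 1/3/2010 entry and its value 7.0 are made up purely to show a date being dropped; the sketch also assumes the dates in each series are unique.

```matlab
% Reference dates from the example in the original post (Jan 1, 2, 5, 6, 7, 8 of 2010).
T1 = datenum(2010, 1, [1; 2; 5; 6; 7; 8]);

% Second series. Jan 3 and its value 7.0 are invented for illustration;
% Jan 5 -> 2.3 and Jan 8 -> 3.1 match the example in the post.
T2 = datenum(2010, 1, [3; 5; 8]);
D2 = [7.0; 2.3; 3.1];

% tf(k) is true when T1(k) occurs in T2; loc(k) is its position in T2 (0 if absent).
[tf, loc] = ismember(T1, T2);

D3 = nan(size(T1));      % dates missing from T2 stay NaN   (action 1)
D3(tf) = D2(loc(tf));    % dates only in T2 are never copied (action 2)

disp([T1 - datenum(2010, 1, 0), D3]);   % day of month next to the aligned value
```

Because the result is built on T1 directly, no insertion or deletion is ever performed on MyDates2 itself, which is usually where the time goes in a row-by-row implementation.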
From: Luna Moon on 21 Jul 2010 16:20

On Jul 21, 4:09 pm, dpb <n...(a)non.net> wrote:
> OTOMH, seems like then would have to iterate thru the gaps isnan() will
> leave and fill in. Probably something clever in that regard but doesn't
> strike me just now...

Yeah, that's why I found it extremely slow... My current implementation does have a for loop, which I hate...
From: Luna Moon on 21 Jul 2010 16:20

On Jul 21, 4:09 pm, dpb <n...(a)non.net> wrote:
> OTOMH, seems like then would have to iterate thru the gaps isnan() will
> leave and fill in. Probably something clever in that regard but doesn't
> strike me just now...

The key part is how to do back-fill fast!
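One loop-free way to do the fill is a cumulative-sum index trick. This is a sketch of one possible approach, not something proposed in the thread. Note that the "back-fill" described in the original post carries the last known value forward, so that is what the code does. The input vector is the aligned example from the post.

```matlab
% Aligned series with holes, as in the example from the original post.
D3 = [NaN; NaN; 2.3; NaN; NaN; 3.1];

valid = ~isnan(D3);      % true where a real observation exists
idx   = cumsum(valid);   % for each row, how many observations seen so far
                         % (0 means nothing has been observed yet)
vals  = D3(valid);       % the observations, in order

filled = nan(size(D3));                  % leading holes remain NaN
filled(idx > 0) = vals(idx(idx > 0));    % row k gets the idx(k)-th observation

disp(filled);
```

The trick is that `idx(k)` is exactly the position, within `vals`, of the most recent observation at or before row `k`, so a single indexing operation replaces the loop. Newer MATLAB releases also provide `fillmissing(D3, 'previous')`, which does the same job where it is available.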
From: dpb on 21 Jul 2010 18:36
Luna Moon wrote:
....
> The key part is how to do back-fill fast!

How big of a series is this and what's the typical sparseness? Wouldn't seem it should be particularly time consuming, but an example might help visualize.
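A reproducible test case along the lines asked for could look like the sketch below. The series length `n` and the fraction `keepFrac` of reference dates present in the second series are hypothetical placeholders, since the thread gives neither figure. The script times a plain for loop against a cumulative-sum fill and checks that both give the same result.

```matlab
n        = 1e5;    % hypothetical series length
keepFrac = 0.3;    % hypothetical share of reference dates present in series 2

T1 = datenum(2000, 1, 1) + (0:n-1)';   % reference dates, one per day
present = rand(n, 1) < keepFrac;       % which reference dates series 2 has
T2 = T1(present);
D2 = randn(nnz(present), 1);           % random data, never NaN

% Align series 2 to the reference dates.
[tf, loc] = ismember(T1, T2);
D3 = nan(n, 1);
D3(tf) = D2(loc(tf));

% Fill with a for loop (carry the previous value forward).
tic
fillLoop = D3;
for k = 2:n
    if isnan(fillLoop(k))
        fillLoop(k) = fillLoop(k-1);
    end
end
tLoop = toc;

% Fill without a loop, using cumsum to index the latest observation.
tic
valid = ~isnan(D3);
idx   = cumsum(valid);
vals  = D3(valid);
fillVec = nan(n, 1);
fillVec(idx > 0) = vals(idx(idx > 0));
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```

Changing `n` and `keepFrac` to match the real data shows whether the fill itself is the bottleneck or whether the time is being spent elsewhere, for example in the toolbox calls.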