From: Luna Moon on
Hi all,

Here are the data I am working on:

dDataTmp1 and dDataTmp2 are two Nx2 matrices, with the first column
being the timestamps and the second column the data. For example,

K>> datestr(dDataTmp2(1, 1))

ans =

01-Apr-2010 00:19:00

They have two properties:

1. They are gigantic.
2. They are not equally spaced in time, i.e. they are event driven
data, not synchronized in time.

I would like to align them in time.

Any good ideas about how to align them in time efficiently? I have
gigabytes of such data to process.

On the other hand, I am not clear about the notion of "align in time".
Is that just a "union" of the time-stamps?

Any better notion?

Thank you!
From: dpb on
Luna Moon wrote:
....

> On the other hand, I am not clear about the notion of "align in time".
> Is that just a "union" of the time-stamps?
>
> Any better notion?

Well, if you don't have an idea of what you want/need, how are we to
know w/ nary a hint of what you need/want to do... :(

An alternative would be to interpolate one (or both) time series to a
common base. In the base product, interp1() would be a choice; if you
happen to have the Signal Processing Toolbox, resample() would be another...
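
A minimal sketch of the interp1() route, not a definitive implementation. The sample values, the choice of 8 grid points, and the names tCommon, d1, d2 and dAligned are made up for illustration; only dDataTmp1 and dDataTmp2 come from the question. It assumes the timestamps within each stream are unique and sorted, and it restricts the common base to the span covered by both streams so that nothing is extrapolated.

```matlab
% Hypothetical sample data: column 1 = datenum timestamps, column 2 = values.
t0 = datenum(2010, 4, 1);
dDataTmp1 = [t0 + [0; 2; 5; 9]/1440, [1; 3; 6; 10]];
dDataTmp2 = [t0 + [1; 4; 8]/1440, [10; 40; 80]];

% Common base: evenly spaced points over the interval both streams cover.
tStart = max(dDataTmp1(1, 1), dDataTmp2(1, 1));
tEnd = min(dDataTmp1(end, 1), dDataTmp2(end, 1));
nPoints = 8;                                  % assumed grid size
tCommon = linspace(tStart, tEnd, nPoints)';

% Linear interpolation of each stream onto the common base.
d1 = interp1(dDataTmp1(:, 1), dDataTmp1(:, 2), tCommon, 'linear');
d2 = interp1(dDataTmp2(:, 1), dDataTmp2(:, 2), tCommon, 'linear');

% One row per common timestamp: [time, stream 1 value, stream 2 value].
dAligned = [tCommon, d1, d2];
```

Whether linear interpolation is meaningful depends on what the event-driven data represent; interp1() also accepts 'nearest' if holding the closest sample is more appropriate.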

--
From: Walter Roberson on
Luna Moon wrote:

> On the other hand, I am not clear about the notion of "align in time".
> Is that just a "union" of the time-stamps?

Possibly, if for each timestamp that appears in only one of the two, you
interpolate data for the other.
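
As a rough sketch of that "union plus interpolation" idea (sample values and the names tUnion, d1, d2 and dAligned are hypothetical; unique timestamps within each stream are assumed):

```matlab
% Hypothetical sample data: column 1 = datenum timestamps, column 2 = values.
t0 = datenum(2010, 4, 1);
dDataTmp1 = [t0 + [0; 2; 5; 9]/1440, [1; 3; 6; 10]];
dDataTmp2 = [t0 + [1; 4; 8]/1440, [10; 40; 80]];

t1 = dDataTmp1(:, 1);
t2 = dDataTmp2(:, 1);

% Every timestamp that occurs in either stream, sorted, duplicates removed.
tUnion = union(t1, t2);

% Fill in each stream at the timestamps it lacks. With the default
% linear method, points outside a stream's own time range come back as NaN.
d1 = interp1(t1, dDataTmp1(:, 2), tUnion);
d2 = interp1(t2, dDataTmp2(:, 2), tUnion);

dAligned = [tUnion, d1, d2];
```

The NaN entries mark timestamps where one stream has no surrounding samples; they can be dropped or left in place depending on the analysis.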

Alternately, a single time-base could be used, such as at regular
intervals, interpolating for that time-base from the data in both.

Or sometimes it is more appropriate for the interpolation to be based
only on data from the matrix being interpolated, if they represent
different things.
From: Luna Moon on
On Jun 1, 8:49 am, Walter Roberson <rober...(a)hushmail.com> wrote:
> Luna Moon wrote:
> > On the other hand, I am not clear about the notion of "align in time".
> > Is that just a "union" of the time-stamps?
>
> Possibly, if for each timestamp that appears in only one of the two, you
> interpolate data for the other.
>
> Alternately, a single time-base could be used, such as at regular
> intervals, interpolating for that time-base from the data in both.
>
> Or sometimes it is more appropriate for the interpolation to be based
> only on data from the matrix being interpolated, if they represent
> different things.

How do I do the following:

1. Take the intersection of the two timestamps arrays
2. Figure out the index of each intersected timestamp in data
stream 1
3. Figure out the index of each intersected timestamp in data
stream 2
4. Only sift out the data corresponding to those timestamps

How can I do these steps extremely fast?
Is there a vectorized way of doing 2 and 3?
Thank you!
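
One possible vectorized sketch for steps 2 and 3 uses ismember(), which reports membership and position in a single call. The sample values and the names tf, loc, nIdx1, nIdx2 and dData are hypothetical, and the sketch assumes each timestamp occurs at most once in stream 2.

```matlab
% Hypothetical sample data: column 1 = datenum timestamps, column 2 = values.
t0 = datenum(2010, 4, 1);
dDataTmp1 = [t0 + [0; 2; 5; 9]/1440, [1; 3; 6; 10]];
dDataTmp2 = [t0 + [2; 4; 9]/1440, [20; 40; 90]];

t1 = dDataTmp1(:, 1);
t2 = dDataTmp2(:, 1);

% tf(k) is true when t1(k) also occurs in t2; loc(k) is its row in t2
% (0 where there is no match).
[tf, loc] = ismember(t1, t2);

nIdx1 = find(tf);    % rows of stream 1 with a shared timestamp
nIdx2 = loc(tf);     % matching rows of stream 2

% Step 4: keep only the data at the shared timestamps.
dData = [t1(nIdx1), dDataTmp1(nIdx1, 2), dDataTmp2(nIdx2, 2)];
```

The result follows the row order of stream 1, so it stays time-ordered as long as stream 1 is.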
From: Luna Moon on
On Jun 1, 10:34 am, Luna Moon <lunamoonm...(a)gmail.com> wrote:
> On Jun 1, 8:49 am, Walter Roberson <rober...(a)hushmail.com> wrote:
>
> > Luna Moon wrote:
> > > On the other hand, I am not clear about the notion of "align in time".
> > > Is that just a "union" of the time-stamps?
>
> > Possibly, if for each timestamp that appears in only one of the two, you
> > interpolate data for the other.
>
> > Alternately, a single time-base could be used, such as at regular
> > intervals, interpolating for that time-base from the data in both.
>
> > Or sometimes it is more appropriate for the interpolation to be based
> > only on data from the matrix being interpolated, if they represent
> > different things.
>
> How do I do the following:
>
> 1. Take the intersection of the two timestamps arrays
> 2. Figure out the index of each intersected timestamp in data
> stream 1
> 3. Figure out the index of each intersected timestamp in data
> stream 2
> 4. Only sift out the data corresponding to those timestamps
>
> How to do these steps extremely fast?
> There ought to be a vectorized way of doing 2 and 3?
> Thank you!

Currently I am doing the above using a "for" loop:

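% Timestamps present in both streams (intersect returns them sorted and unique)
nTimeTmp=intersect(dDataTmp1(:, 1), dDataTmp2(:, 1));

dData1=zeros(size(nTimeTmp));
dData2=zeros(size(nTimeTmp));

for i=1:length(nTimeTmp)

    % Row of stream 1 holding this timestamp, then its data value
    nIdx=find(nTimeTmp(i)==dDataTmp1(:, 1));
    dData1(i)=dDataTmp1(nIdx, 2);

    % Same lookup for stream 2
    nIdx=find(nTimeTmp(i)==dDataTmp2(:, 1));
    dData2(i)=dDataTmp2(nIdx, 2);

end;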

---------------------

Any good way of speeding it up using vectorized operations?

Thank you!
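
A sketch of how the loop above might be replaced, assuming the timestamps are unique within each stream: intersect() can return the row indices into both inputs as its second and third outputs, so no find() is needed. The sample values and the names nIdx1 and nIdx2 are hypothetical.

```matlab
% Hypothetical sample data: column 1 = datenum timestamps, column 2 = values.
t0 = datenum(2010, 4, 1);
dDataTmp1 = [t0 + [0; 2; 5; 9]/1440, [1; 3; 6; 10]];
dDataTmp2 = [t0 + [2; 4; 9]/1440, [20; 40; 90]];

% nTimeTmp = shared timestamps (sorted);
% nIdx1 / nIdx2 = rows where they occur in stream 1 / stream 2.
[nTimeTmp, nIdx1, nIdx2] = intersect(dDataTmp1(:, 1), dDataTmp2(:, 1));

% Pick out the data at the shared timestamps in one indexing step each.
dData1 = dDataTmp1(nIdx1, 2);
dData2 = dDataTmp2(nIdx2, 2);
```

Both the loop and this version compare timestamps for exact floating-point equality, so two datenum values that differ only by rounding will not match. If that turns out to be a problem, rounding the timestamps to a fixed resolution before calling intersect() is one option.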