From: Luna Moon on
Hi all,

In my project I need to run queries on tabulated data. The common
requirements are:

1. The data is of gigantic size (GB), so I split it into chunks and
process one chunk at a time.
2. The data (after sorting) should be indexed and ordered by date/time
stamps (with millisecond precision).
3. There can be multiple data rows corresponding to the same date/time
stamp.
4. I need to run lots of queries using date/time stamps as the index,
e.g. time-windowed operations, and they need to be super fast...

How do I do these in Matlab?

Thanks a lot!
From: Oleg Komarov on

1. Use textscan or dlmread to import the data.
2. Use serial date numbers: datenum.
3. Not much more to say until we see what you need to do.
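
As a rough illustration of points 1 and 2, here is a minimal sketch that reads a text file in chunks with textscan and converts the time stamps to serial date numbers with datenum. The file layout (one "timestamp,value" pair per row), the time stamp format, the chunk size and all variable names are assumptions made up for the example, not something taken from the original question.

```matlab
% Hypothetical sample file with "timestamp,value" rows, written only for this demo.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '2010-06-01 09:30:00.125,1.5\n');
fprintf(fid, '2010-06-01 09:30:00.125,2.5\n');   % duplicate time stamp is allowed
fprintf(fid, '2010-06-01 09:30:01.000,3.0\n');
fclose(fid);

chunkRows = 2;          % rows per chunk (assumed; tune to available memory)
t = [];                 % serial date numbers
v = [];                 % data values
fid = fopen(fname, 'r');
while ~feof(fid)
    % Read at most chunkRows rows: one string field and one numeric field.
    C = textscan(fid, '%s %f', chunkRows, 'Delimiter', ',');
    if isempty(C{1})
        break;          % nothing left to read
    end
    % Convert the time stamp strings, including milliseconds (FFF).
    t = [t; datenum(C{1}, 'yyyy-mm-dd HH:MM:SS.FFF')]; %#ok<AGROW>
    v = [v; C{2}];                                      %#ok<AGROW>
end
fclose(fid);
delete(fname);
```

The demo concatenates the chunks only to keep it short; with truly large files each chunk would presumably be processed and discarded inside the loop.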

Oleg
From: dpb on

Seems like we've been here before...

Bruno (I believe it was?) demonstrated not long ago how to generate
begin:end datenums from date stamps and various manipulations therewith.

It would seem that writing a set of functions to return the rows in
question would be the logical interface to build for your purposes.
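
One possible shape for such a function is sketched below. It assumes the time stamps are already held as a sorted (ascending) vector of serial date numbers and uses a binary search, so duplicate time stamps are returned together. The names rowsInWindow and firstIndex are hypothetical, and this is only one way to build the interface.

```matlab
function idx = rowsInWindow(t, t0, t1)
% ROWSINWINDOW  Row indices whose time stamp lies in the closed window [t0, t1].
%   t must be sorted in ascending order; repeated time stamps are allowed.
first = firstIndex(t, t0, false);      % first row with t >= t0
last  = firstIndex(t, t1, true) - 1;   % last row with t <= t1
idx = first:last;                      % empty when no row falls in the window
end

function k = firstIndex(t, x, strict)
% Binary search: smallest k with t(k) >= x, or t(k) > x when strict is true.
% Returns numel(t)+1 when no such element exists.
lo = 1;
hi = numel(t) + 1;
while lo < hi
    mid = floor((lo + hi) / 2);
    if t(mid) < x || (strict && t(mid) == x)
        lo = mid + 1;
    else
        hi = mid;
    end
end
k = lo;
end
```

For example, rowsInWindow([1 2 2 3 5], 2, 3) returns 2:4. Each query costs O(log n) comparisons, instead of the full scan that a logical mask such as t >= t0 & t <= t1 would need.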

If the files are truly huge, perhaps incorporating hashing keys or other
techniques might help in segregating the data into sections, so that for
a given query it's known which sections can't contain the data and which
must. Alternatively, perhaps some of the operations such as sorting could
be performed outside MATLAB and the data loaded as presorted, keyed, etc.
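
A very simple form of that segregation, sketched here under the assumption that the data is presorted and split into chunks, is to keep a small table with the first and last time stamp of each chunk and to load only the chunks whose range overlaps the query window. The chunk ranges and the window below are invented for the example.

```matlab
% Hypothetical per-chunk index: one row per chunk, [firstTime lastTime],
% stored as serial date numbers.
chunkRange = [datenum(2010,6,1) datenum(2010,6,2);
              datenum(2010,6,2) datenum(2010,6,3);
              datenum(2010,6,3) datenum(2010,6,4)];

% Query window (assumed): noon to 6 pm on 2 June 2010.
t0 = datenum(2010,6,2,12,0,0);
t1 = datenum(2010,6,2,18,0,0);

% A chunk can hold matching rows only if its range overlaps [t0, t1].
needed = find(chunkRange(:,1) <= t1 & chunkRange(:,2) >= t0);
```

Here only chunk 2 needs to be loaded; the other chunks are skipped without touching the disk.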

If the queries are more "database-y" than the simple date lookups outlined
above, then something more might be worthwhile, but no details were
provided...
