From: Matt on
Hello,
I'm working with large time series data input files (.txt, tab delimited) and so far have tried a couple of approaches: dataset and textscan.

Using

dataset('file',dataset_file,'delimiter','\t');

is a bit too slow with larger files. Alternatively I can quickly make a cell array of the dataset using textscan for each row and then parsing that with

data_row_parts=regexp(data_row,'\t','split');

My problem is that this creates a cell array of strings, and I'm having a heck of a time figuring out how to convert it to numeric so I can perform calculations such as column averages, scaled columns, etc. I have removed the header and the timestamp column from my original data set, so the data is all numeric (or NaN). What I'd like to do is create a numeric matrix of double values from my cell array, but I haven't found a way to make it work.

Thanks for your help
From: Walter Roberson on
Matt wrote:

> data_row_parts=regexp(data_row,'\t','split');
>
> My problem is that creates a cell array that I'm having a heck of a time
> figuring out how to convert to numeric to perform calculations on such
> as column averages, scaled columns, etc. I have removed the header and
> the timestamp column in my original data set, so the data is all numeric
> (or NaN). What I'd like to do is create a numeric matrix of double
> values from my cell array but I haven't found a way to make it work.

str2double() works on a cell array of strings, one number per string.

If you have cell arrays of cell arrays, then you can use cellfun(@str2double....)
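
For example, here is a minimal sketch of both cases, assuming every row has the same number of tab-separated fields. The sample rows and the names row_values, rows, converted, data_matrix and col_means are made up for illustration; only data_row and data_row_parts come from the post above.

```matlab
% Hypothetical sample row; in the original post this comes from textscan.
data_row = sprintf('1.5\t2\tNaN\t-4.25');

% Split on tabs: gives a 1x4 cell array of strings.
data_row_parts = regexp(data_row, '\t', 'split');

% str2double converts each string to a double (text it cannot parse becomes NaN).
row_values = str2double(data_row_parts);

% Nested case: a cell array whose cells are themselves cell arrays of strings.
rows = {data_row_parts, {'10', '20', '30', '40'}};
converted = cellfun(@str2double, rows, 'UniformOutput', false);

% Stack the converted rows into one numeric matrix (assumes equal column counts).
data_matrix = vertcat(converted{:});

% Column averages, computed down dimension 1.
col_means = mean(data_matrix, 1);
```

Note that mean() returns NaN for any column that contains a NaN, so the NaN entries may need to be handled separately before averaging.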
From: Matt on
Ah ha...that does work. Unfortunately performing that on my cell array takes about the same amount of time as

dataset('file',dataset_file,'delimiter','\t');

Thanks for helping out a beginner.