From: Catalin Eberhardt on
Hi everyone,

I have a bunch of tab-separated data files produced by a script that acquires data in a Psychology experiment. I would like to be able to access the numerical data in each file easily, and the first step is probably to extract a big matrix from each file.

Unfortunately, the script created rather sloppy data files (lots of text descriptions, etc.), so a bit of "filtering" has to be done before arriving at a "clean" matrix (i.e. only numerical data, with maybe a header for each column). A sample data file is located here: http://www.2shared.com/file/12478975/d6307cec/data.html (use the "Save file to your PC" link).

Can anyone suggest what would be an easy way of doing this?

I guess that, in order to automate the process of data retrieval, the function 'dlmread' would have to be used in a script. To test how it would work for a single file, I used the Import Data wizard; the wizard automatically (and correctly) detects 5 lines of text header, but then proceeds to create a Data variable (matrix) that contains only 1 row.

Any help would be appreciated, many thanks in advance.
From: us on
due to firewall restrictions we cannot download the file from the site...

us
From: Catalin Eberhardt on
I uploaded it here too: http://longtalker.20x.cc/data.csv

If that doesn't work either, could you please suggest another way of uploading it that will work even with a firewall? Thanks.
From: Catalin Eberhardt on
OK, I've made some progress on my own in the meantime. I tried using the built-in function textscan, but that function only allows one use of the CommentStyle parameter, so I cannot define several text lines for it to ignore.

I also tried using txt2mat, from the File Exchange, but that doesn't produce the expected result: besides the text lines, it also ignores numerical lines in the data file.
From: Andres on
Hi Catalin,
I've had a look at your file in the meantime. It is only 5 kB and has a quite regular structure of mostly alternating descriptive lines and numerical data lines. If you are really interested in the numerical data only, see my reply on the txt2mat File Exchange page.

I could imagine, however, that you'd like to get more information out of the file. In that case I'd recommend parsing the file with some customized code relying e.g. on fgetl and textscan, and storing the data in a more appropriate way, e.g. using structs. Ideally that code would be applicable to each of your files...
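A minimal sketch of that idea, under several assumptions: the sample lines below are made up to mimic "descriptive line, then tab-separated numeric line" and are not taken from the actual file, the struct field names label and values are only illustrative, and the numeric test uses strsplit plus str2double instead of textscan because that makes non-numeric fields easy to spot (they come back as NaN).

```matlab
% Hypothetical sample standing in for one data file: descriptive lines
% alternate with tab-separated numeric lines (an assumed layout).
tab = sprintf('\t');
sampleLines = {'Experiment log', ...
               'Trial 1 reaction times', ['1' tab '0.52' tab '0.61'], ...
               'Trial 2 reaction times', ['2' tab '0.48' tab '0.57']};
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

% Walk the file line by line; each numeric line is stored together with
% the most recent descriptive line that preceded it.
blocks = struct('label', {}, 'values', {});
lastLabel = '';
fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline)                      % fgetl returns -1 at end of file
    fields = strsplit(strtrim(tline), tab);
    nums = str2double(fields);           % NaN for any non-numeric field
    if ~any(isnan(nums))
        blocks(end+1) = struct('label', lastLabel, 'values', nums); %#ok<SAGROW>
    elseif ~isempty(strtrim(tline))
        lastLabel = strtrim(tline);      % remember the description
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Stack all numeric rows into one matrix (assumes equal column counts).
data = vertcat(blocks.values);
```

With something like this, blocks(k).label keeps the description that belongs to row k of data, so the text is not thrown away. If the numeric lines in the real files have differing numbers of columns, the final vertcat would fail and the rows would need padding or separate handling.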
Good luck!