From: Ayotunde on
I've tried doing a search but not finding anything that can help me. I'm trying to load several csv files into matlab ranging from 20mb to 2 gb as they contain data from the S&P index for a project i have to do. The data is in this format but if i try to import into matlab using the import wizard, i get an m by 1 matrix which isnt really what i need. sample of 1 of the csv is below
SYMBOL DATE TIME PRICE SIZE G127 CORR COND EX
SPY 19980102 09:31:41 97.3125 53500 0 0 A
SPY 19980102 09:31:43 97.3125 100 0 0 0 M
SPY 19980102 09:31:43 97.3125 200 0 0 0 M
SPY 19980102 09:31:44 97.3125 4800 0 0 0 P
i guess i have to get rid of some of the columns i dont need as i need to create 5-minute observations for a year of data (each csv) but i am new to matlab so dont know if there is a way of doing this before the data is imported. My lecturer said something about tick-by-tick conversion but i have no idea what that is.
help would be really appreciated
From: AJ on
"Ayotunde " <rhymer2k(a)yahoo.co.uk> wrote in message <hsc7vc$cuh$1(a)fred.mathworks.com>...
> I've tried doing a search but not finding anything that can help me. I'm trying to load several csv files into matlab ranging from 20mb to 2 gb as they contain data from the S&P index for a project i have to do. The data is in this format but if i try to import into matlab using the import wizard, i get an m by 1 matrix which isnt really what i need. sample of 1 of the csv is below
> SYMBOL DATE TIME PRICE SIZE G127 CORR COND EX
> SPY 19980102 09:31:41 97.3125 53500 0 0 A
> SPY 19980102 09:31:43 97.3125 100 0 0 0 M
> SPY 19980102 09:31:43 97.3125 200 0 0 0 M
> SPY 19980102 09:31:44 97.3125 4800 0 0 0 P
> i guess i have to get rid of some of the columns i dont need as i need to create 5-minute observations for a year of data (each csv) but i am new to matlab so dont know if there is a way of doing this before the data is imported. My lecturer said something about tick-by-tick conversion but i have no idea what that is.
> help would be really appreciated


You can use the xlsread function to parse that matrix in to numeric and text data. I copied your example in to a CSV file and used the following command:

[num,txt] = xlsread('Book1.csv');

This resulted in two variables, num and txt, that allow you to work with them independently. This does take a little bit of keeping track because the sizes of the variables are different. num is a 4x7 double and txt is a 5x9 cell. It's up to you at that point to keep track of how things should line up between the two. In the past I've played games with the num matrix to add rows and columns of zeros or ones to make the dimension the same size as my original. It's considerably easier to keep track of that way.

A word of caution though ... xlsread isn't the fastest I/O function and is intended for Excel files. csvread would be faster but I've never tried that command with mixed character and numeric files. You could look in to that and see what happens.

Hope that helps. Good luck!
AJ
From: TideMan on
On May 12, 7:12 am, "AJ " <aj_ol...(a)hawkerbeechcraftREMOVE.THIScom>
wrote:
> "Ayotunde " <rhyme...(a)yahoo.co.uk> wrote in message <hsc7vc$cu...(a)fred.mathworks.com>...
> > I've tried doing a search but not finding anything that can help me. I'm trying to load several csv files into matlab ranging from 20mb to 2 gb as they contain data from the S&P index for a project i have to do. The data is in this format but if i try to import into matlab using the import wizard, i get an m by 1 matrix which isnt really what i need. sample of 1 of the csv is below
> > SYMBOL     DATE             TIME   PRICE   SIZE         G127       CORR   COND     EX
> > SPY            19980102    09:31:41        97.3125 53500       0      0                    A
> > SPY            19980102    09:31:43        97.3125 100     0           0     0                     M
> > SPY            19980102    09:31:43        97.3125 200     0           0     0                     M
> > SPY            19980102    09:31:44        97.3125 4800    0           0     0                     P
> >  i guess i have to get rid of some of the columns i dont need as i need to create 5-minute observations for a year of data (each csv) but i am new to matlab so dont know if there is a way of doing this before the data is imported. My lecturer said something about tick-by-tick conversion but i have no idea what that is.
> > help would be really appreciated
>
> You can use the xlsread function to parse that matrix in to numeric and text data.  I copied your example in to a CSV file and used the following command:
>
> [num,txt] = xlsread('Book1.csv');
>
> This resulted in two variables, num and txt, that allow you to work with them independently.  This does take a little bit of keeping track because the sizes of the variables are different.  num is a 4x7 double and txt is a 5x9 cell.  It's up to you at that point to keep track of how things should line up between the two.  In the past I've played games with the num matrix to add rows and columns of zeros or ones to make the dimension the same size as my original.  It's considerably easier to keep track of that way.
>
> A word of caution though ...  xlsread isn't the fastest I/O function and is intended for Excel files.  csvread would be faster but I've never tried that command with mixed character and numeric files.  You could look in to that and see what happens.
>
> Hope that helps.  Good luck!
> AJ

No, csvread won't work. It only works on numeric data.
And xlsread is a long-winded way to do it.

But I'm puzzled.....
What you showed is not a normal .csv file. It should look like this:
SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
SPY,19980102,09:31:41,97.3125,53500,0,0,A
In other words, there should be commas between each field - that's
what csv stands for: "comma separated values".

What are the delimiters in your case? Are they spaces or tabs?
In either case, you should invest time in finding out how textscan
works.
It will do the job for you and it will be very fast. There's a bit of
learning, but it will be worth it.
help textscan

From: Ayotunde on
TideMan <mulgor(a)gmail.com> wrote in message <173ffafa-b3ab-4a93-8774-919731a0f7d3(a)s13g2000prc.googlegroups.com>...
> On May 12, 7:12 am, "AJ " <aj_ol...(a)hawkerbeechcraftREMOVE.THIScom>
> wrote:
> > "Ayotunde " <rhyme...(a)yahoo.co.uk> wrote in message <hsc7vc$cu...(a)fred.mathworks.com>...
> > > I've tried doing a search but not finding anything that can help me. I'm trying to load several csv files into matlab ranging from 20mb to 2 gb as they contain data from the S&P index for a project i have to do. The data is in this format but if i try to import into matlab using the import wizard, i get an m by 1 matrix which isnt really what i need. sample of 1 of the csv is below
> > > SYMBOL     DATE             TIME   PRICE   SIZE         G127       CORR   COND     EX
> > > SPY            19980102    09:31:41        97.3125 53500       0      0                    A
> > > SPY            19980102    09:31:43        97.3125 100     0           0     0                     M
> > > SPY            19980102    09:31:43        97.3125 200     0           0     0                     M
> > > SPY            19980102    09:31:44        97.3125 4800    0           0     0                     P
> > >  i guess i have to get rid of some of the columns i dont need as i need to create 5-minute observations for a year of data (each csv) but i am new to matlab so dont know if there is a way of doing this before the data is imported. My lecturer said something about tick-by-tick conversion but i have no idea what that is.
> > > help would be really appreciated
> >
> > You can use the xlsread function to parse that matrix in to numeric and text data.  I copied your example in to a CSV file and used the following command:
> >
> > [num,txt] = xlsread('Book1.csv');
> >
> > This resulted in two variables, num and txt, that allow you to work with them independently.  This does take a little bit of keeping track because the sizes of the variables are different.  num is a 4x7 double and txt is a 5x9 cell.  It's up to you at that point to keep track of how things should line up between the two.  In the past I've played games with the num matrix to add rows and columns of zeros or ones to make the dimension the same size as my original.  It's considerably easier to keep track of that way.
> >
> > A word of caution though ...  xlsread isn't the fastest I/O function and is intended for Excel files.  csvread would be faster but I've never tried that command with mixed character and numeric files.  You could look in to that and see what happens.
> >
> > Hope that helps.  Good luck!
> > AJ
>
> No, csvread won't work. It only works on numeric data.
> And xlsread is a long-winded way to do it.
>
> But I'm puzzled.....
> What you showed is not a normal .csv file. It should look like this:
> SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
> SPY,19980102,09:31:41,97.3125,53500,0,0,A
> In other words, there should be commas between each field - that's
> what csv stands for: "comma separated values".
>
> What are the delimiters in your case? Are they spaces or tabs?
> In either case, you should invest time in finding out how textscan
> works.
> It will do the job for you and it will be very fast. There's a bit of
> learning, but it will be worth it.
> help textscan

Hello Tideman, you are correct it does look like that but i copied how it looked like in excel in my original post. yes my delimiters are commas but when i try importing on csv for example using the import wizard it gives me a 360579 by 1 matrix which i know is not how it should look like. ive tried removing the columns i dont need in excel before importing and this seems to be better but then i need a way of converting the date format to something matlab can read. Also the time too as i need 5 minute obvservations for my project.I also tried changing the csv to text but i need to get rid of the non-numerical parts like the DATE, TIME etc) will that work? I tried what the other person said (xlsread) but i am getting a failure error about not being able to open excel server or something like that.
Thanks for your continued help people and sorry in advance about my english, it is my first language but i just hate using matlab soo much it is distressing me
From: TideMan on
On May 12, 8:34 am, "Ayotunde " <rhyme...(a)yahoo.co.uk> wrote:
> TideMan <mul...(a)gmail.com> wrote in message <173ffafa-b3ab-4a93-8774-919731a0f...(a)s13g2000prc.googlegroups.com>...
> > On May 12, 7:12 am, "AJ " <aj_ol...(a)hawkerbeechcraftREMOVE.THIScom>
> > wrote:
> > > "Ayotunde " <rhyme...(a)yahoo.co.uk> wrote in message <hsc7vc$cu...(a)fred.mathworks.com>...
> > > > I've tried doing a search but not finding anything that can help me.. I'm trying to load several csv files into matlab ranging from 20mb to 2 gb as they contain data from the S&P index for a project i have to do. The data is in this format but if i try to import into matlab using the import wizard, i get an m by 1 matrix which isnt really what i need. sample of 1 of the csv is below
> > > > SYMBOL     DATE             TIME   PRICE   SIZE         G127       CORR   COND     EX
> > > > SPY            19980102    09:31:41        97.3125 53500       0      0                    A
> > > > SPY            19980102    09:31:43        97.3125 100     0           0     0                     M
> > > > SPY            19980102    09:31:43        97.3125 200     0           0     0                     M
> > > > SPY            19980102    09:31:44        97.3125 4800    0           0     0                     P
> > > >  i guess i have to get rid of some of the columns i dont need as i need to create 5-minute observations for a year of data (each csv) but i am new to matlab so dont know if there is a way of doing this before the data is imported. My lecturer said something about tick-by-tick conversion but i have no idea what that is.
> > > > help would be really appreciated
>
> > > You can use the xlsread function to parse that matrix in to numeric and text data.  I copied your example in to a CSV file and used the following command:
>
> > > [num,txt] = xlsread('Book1.csv');
>
> > > This resulted in two variables, num and txt, that allow you to work with them independently.  This does take a little bit of keeping track because the sizes of the variables are different.  num is a 4x7 double and txt is a 5x9 cell.  It's up to you at that point to keep track of how things should line up between the two.  In the past I've played games with the num matrix to add rows and columns of zeros or ones to make the dimension the same size as my original.  It's considerably easier to keep track of that way.
>
> > > A word of caution though ...  xlsread isn't the fastest I/O function and is intended for Excel files.  csvread would be faster but I've never tried that command with mixed character and numeric files.  You could look in to that and see what happens.
>
> > > Hope that helps.  Good luck!
> > > AJ
>
> > No, csvread won't work.  It only works on numeric data.
> > And xlsread is a long-winded way to do it.
>
> > But I'm puzzled.....
> > What you showed is not a normal .csv file.  It should look like this:
> > SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
> > SPY,19980102,09:31:41,97.3125,53500,0,0,A
> > In other words, there should be commas between each field - that's
> > what csv stands for: "comma separated values".
>
> > What are the delimiters in your case?  Are they spaces or tabs?
> > In either case, you should invest time in finding out how textscan
> > works.
> > It will do the job for you and it will be very fast.  There's a bit of
> > learning, but it will be worth it.
> > help textscan
>
> Hello Tideman, you are correct it does look like that but i copied how it looked like in excel in my original post. yes my delimiters are commas but when i try importing on csv for example using the import wizard it gives me a 360579 by 1 matrix which i know is not how it should look like. ive tried removing the columns i dont need in excel before importing and this seems to be better but then i need a way of converting the date format to something matlab can read. Also the time too as i need 5 minute obvservations for my project.I also tried changing the csv to text but i need to get rid of the non-numerical parts like the DATE, TIME etc) will that work? I tried what the other person said (xlsread) but i am getting a failure error about not being able to open excel server or something like that.
> Thanks for your continued help people and sorry in advance about my english, it is my first language but i just hate using matlab soo much it is distressing me

Don't worry about dates and times at this stage.
They are easily fixed once you get the data loaded.
Did you follow my advice and read help textscan?
Show us where you got to with textscan.
In other words, post your code - it should only be three lines:
Line 1: open the file using fopen
Line 2: load the data using textscan
Line 3: close the file using fclose