From: Ayotunde on
Basically, I've got data consisting of observations on the S&P index for 6 years. For example, for year 1999, the csv looks like:
> SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
> SPY,19980102,09:31:41,97.3125,53500,0,0,,A
> SPY,19980102,09:31:43,97.3125,100,0,0,,M
> SPY,19980102,09:31:43,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,4800,0,0,,P
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,500,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,500,0,0,,M
> SPY,19980102,09:31:44,97.3125,1000,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,1000,0,0,,M
> SPY,19980102,09:31:45,97.3125,900,0,0,,M
> SPY,19980102,09:31:45,97.3125,300,0,0,,M
> SPY,19980102,09:31:45,97.3125,200,0,0,,M
> SPY,19980102,09:31:45,97.3125,300,0,0,,M
> SPY,19980102,09:31:52,97.3125,200,0,0,,T
> SPY,19980102,09:32:03,97.4375,1000,0,0,,A
> SPY,19980102,09:32:11,97.4375,1000,0,0,,A
> SPY,19980102,09:32:20,97.3125,100,0,0,Z,P
> SPY,19980102,09:32:21,97.3125,200,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,200,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,300,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,500,0,0,,T
With help from another thread I posted, I have used textscan and datenum, and combined the matrix of serial date numbers from datenum with the matrix of prices to get something that looks like this:
729757.397002315 97.3125000000000
729757.397025463 97.3125000000000
729757.397025463 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397129630 97.3125000000000
729757.397256944 97.4375000000000
729757.397349537 97.4375000000000
729757.397453704 97.3125000000000
729757.397465278 97.3125000000000
729757.397465278 97.3125000000000
729757.397465278 97.3125000000000
My question: I am trying to replicate the results of a paper (article), which used 5-minute observations. However, as the csv example shows, my data is more frequent than every 5 minutes. How do I go about extracting an observation every 5 minutes, from about 9:30 am when observations start until 4:20 pm? Each csv I have covers a whole year, as mentioned above, so I'd like to do this for the whole year.
Thanks in advance.
From: us on

a hint:

help histc;

us
From: Ayotunde on

Thank you for your hint, but I still don't see how histc will help me. I haven't come across an example that uses time for the "edges". I've also tried, but I cannot work out how to use histc to create a new array with data from every 5 minutes.
From: ImageAnalyst on
I'd probably first remove duplicated rows. Then I'd need to figure
out what fraction corresponds to 5 minutes. Then I'd probably use
interp1() to resample at exactly 5 minute intervals. Sound reasonable?
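
The three steps above could be sketched roughly as follows. This is only an illustration on made-up toy data for a single day; the variable names (t, p, tu, pu, t5, p5) are hypothetical, and it assumes the timestamps are serial date numbers as in the original post. In datenum units one day is 1, so 5 minutes is 5/(24*60) = 1/288 of a day.

```matlab
% Toy data (assumed, not from the real file): serial date numbers and prices.
t = datenum(1998, 1, 2, 9, [31 31 31 32 34 36 41]', [41 43 43 3 50 10 5]');
p = [97.3125 97.3125 97.3125 97.4375 97.5 97.375 97.25]';

% Step 1: drop duplicated timestamps, keeping the last trade at each one.
% interp1 needs distinct sample points; unique also returns them sorted.
[tu, iu] = unique(t, 'last');
pu = p(iu);

% Step 2: 5 minutes expressed as a fraction of a day.
step = 5 / (24 * 60);

% Step 3: resample on an exact 5-minute grid (here 09:35 and 09:40).
% Building the grid as start + k*step avoids colon-operator rounding
% at the end point. A full session from 09:30 to 16:20 would use k = 0:82.
t5 = datenum(1998, 1, 2, 9, 35, 0) + (0:1)' * step;
p5 = interp1(tu, pu, t5);   % linear by default; NaN outside the data range

disp([t5 p5]);
```

Note that linear interpolation can produce prices that never actually traded, which may or may not match what the paper did. Taking the last observed price before each grid point is a common alternative.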
From: us on

well then... a small example...

one of the solutions

% the data
% - date strings...
d0=datenum(now);
ds=datestr(datenum(d0)+(0:10)/24); % <- D0 + 1hr * (0:10)
% the engine
t0=datenum(d0); % <- D0
ts=datenum(t0+(0:10)/(.5*24)); % <- D0 + 2hr * (0:10) [there will be slack!]
td=datenum(ds); % <- DS converted to DATENUMs
[tx,tn]=histc(td,ts);
% the result
disp([tx.';tn.']);
%{
2 2 2 2 2 0 0 0 0 0 0 % <- #obs of TD in TS
0 1 1 2 2 3 3 4 4 5 5 % <- index into TS
%}

us
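
Carried over to the original question, the same idea might look like the sketch below. It is not from the thread: the toy data and the names (t, p, edges, lastIdx, out) are hypothetical, it handles a single day only, and it assumes that "5-minute observations" means the last trade price inside each 5-minute bin. The second output of histc gives the bin index of every observation, which is then used to pick one row per bin.

```matlab
% Toy data (assumed): sorted serial date numbers and prices for one day.
t = datenum(1998, 1, 2, 9, [31 31 31 32 34 36 41]', [41 43 43 3 50 10 5]');
p = [97.3125 97.3125 97.3125 97.4375 97.5 97.375 97.25]';

% Bin edges every 5 minutes from 09:30 to 16:20 (410 minutes = 82 bins).
step  = 5 / (24 * 60);                          % 5 minutes in days
edges = datenum(1998, 1, 2, 9, 30, 0) + (0:82)' * step;

% bin(k) is the index of the edge interval containing t(k), 0 if outside.
[cnt, bin] = histc(t, edges);
ok = bin > 0;

% For each bin, the largest row index falling in it, i.e. the last trade
% because t is sorted. Empty bins get the default fill value 0.
lastIdx = accumarray(bin(ok), find(ok), [numel(edges) 1], @max);
has = lastIdx > 0;

% One row per non-empty bin: [bin start time, last price in that bin].
out = [edges(has) p(lastIdx(has))];
disp(datestr(out(:, 1)));
disp(out(:, 2));
```

For a whole year, one option would be to run this per day (grouping on floor(t)) and stack the results. Newer MATLAB releases mark histc as not recommended in favour of histcounts or discretize, so the call may need adapting there.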