5-minute observations from data [Matlab]


From: Ayotunde on 14 May 2010 14:43

Basically, I've got observations on the S&P index covering 6 years, one CSV per year. For 1998, for example, the CSV looks like this:
> SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
> SPY,19980102,09:31:41,97.3125,53500,0,0,,A
> SPY,19980102,09:31:43,97.3125,100,0,0,,M
> SPY,19980102,09:31:43,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,4800,0,0,,P
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,500,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,500,0,0,,M
> SPY,19980102,09:31:44,97.3125,1000,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,100,0,0,,M
> SPY,19980102,09:31:44,97.3125,200,0,0,,M
> SPY,19980102,09:31:44,97.3125,1000,0,0,,M
> SPY,19980102,09:31:45,97.3125,900,0,0,,M
> SPY,19980102,09:31:45,97.3125,300,0,0,,M
> SPY,19980102,09:31:45,97.3125,200,0,0,,M
> SPY,19980102,09:31:45,97.3125,300,0,0,,M
> SPY,19980102,09:31:52,97.3125,200,0,0,,T
> SPY,19980102,09:32:03,97.4375,1000,0,0,,A
> SPY,19980102,09:32:11,97.4375,1000,0,0,,A
> SPY,19980102,09:32:20,97.3125,100,0,0,Z,P
> SPY,19980102,09:32:21,97.3125,200,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,200,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,300,0,0,,T
> SPY,19980102,09:32:21,97.3125,100,0,0,,T
> SPY,19980102,09:32:21,97.3125,500,0,0,,T
With help from another thread I posted, I have used textscan and datenum, and combined the serial date numbers from datenum with the prices into a matrix that looks like this:
729757.397002315 97.3125000000000
729757.397025463 97.3125000000000
729757.397025463 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397037037 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397048611 97.3125000000000
729757.397129630 97.3125000000000
729757.397256944 97.4375000000000
729757.397349537 97.4375000000000
729757.397453704 97.3125000000000
729757.397465278 97.3125000000000
729757.397465278 97.3125000000000
729757.397465278 97.3125000000000
My question: I am trying to replicate the results of a paper (article) that uses 5-minute observations, but as the CSV sample shows, my data is far more frequent than every 5 minutes. How do I extract an observation every 5 minutes, from about 9:30 am, when observations start, until 4:20 pm? Each CSV covers a whole year, as mentioned above, so I'd like to do this for the whole year.
Thanks in advance.
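
For readers following along, here is a minimal sketch of the import step described in this post (textscan plus datenum). It is not the poster's actual code: the variable names (csvText, fmt, stamp, data) and the three-row excerpt are assumptions for illustration, and a real run would pass a file identifier from fopen to textscan instead of a string.

```matlab
% Hypothetical excerpt standing in for one of the yearly CSV files.
csvText = sprintf('%s\n', ...
    'SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX', ...
    'SPY,19980102,09:31:41,97.3125,53500,0,0,,A', ...
    'SPY,19980102,09:31:43,97.3125,100,0,0,,M', ...
    'SPY,19980102,09:32:03,97.4375,1000,0,0,,A');

% One format specifier per column; DATE and TIME are read as strings.
fmt = '%s %s %s %f %f %f %f %s %s';
c = textscan(csvText, fmt, 'Delimiter', ',', 'HeaderLines', 1);

% Join DATE and TIME with a space, then convert to serial date numbers.
stamp = strcat(c{2}, {' '}, c{3});
t = datenum(stamp, 'yyyymmdd HH:MM:SS');
p = c{4};

% Column 1: serial date number, column 2: price.
data = [t p];
```

The first row of data should correspond to the first row of the matrix shown in the post (09:31:41 on 2 January 1998).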

From: us on 14 May 2010 15:01

a hint:

help histc;

us

From: Ayotunde on 15 May 2010 18:30

Thank you for the hint, but I still don't see how histc will help me. I haven't come across an example that uses times for the "edges". I've also tried, but I cannot work out how to use histc to create a new array with data from every 5 minutes.

From: ImageAnalyst on 15 May 2010 18:38

I'd probably first remove duplicated rows. Then I'd need to figure
out what fraction corresponds to 5 minutes. Then I'd probably use
interp1() to resample at exactly 5 minute intervals. Sound reasonable?
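
The three steps above could be sketched roughly as follows. This is only one reading of the suggestion: the sample values, the variable names (day0, secs, tu, pu, tq, p5) and the choice of plain linear interpolation are assumptions, not something stated in the thread.

```matlab
% Hypothetical sample in the poster's layout: serial date numbers and prices.
day0 = datenum(1998, 1, 2);
secs = [34301; 34303; 34303; 34323; 34331; 34700; 35000];  % seconds after midnight
t = day0 + secs/86400;
p = [97.3125; 97.3125; 97.3125; 97.4375; 97.4375; 97.5; 97.25];

% 1) Keep one row per timestamp (here: the last trade at that second),
%    because interp1 needs distinct sample points.
[tu, iu] = unique(t, 'last');
pu = p(iu);

% 2) One day is 1 in datenum units, so 5 minutes is 5/1440 of a day.
%    Build the 09:30 to 16:20 grid from whole minutes to avoid drift.
mins = (9*60+30 : 5 : 16*60+20).';
tq = day0 + mins/1440;

% 3) Resample at exactly 5-minute marks (default method: linear).
%    Marks outside the observed time range come back as NaN.
p5 = interp1(tu, pu, tq);
```

Linear interpolation blends the trades on either side of each mark. If the paper instead uses the last traded price before each mark, newer MATLAB releases offer a 'previous' method in interp1, and a bin-index approach with histc is another option.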

From: us on 15 May 2010 18:50

well then... a small example...

one of the solutions

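% the data
% - date strings, one per hour...
d0=now; % <- D0 (NOW already returns a DATENUM)
ds=datestr(d0+(0:10)/24); % <- D0 + 1hr * (0:10)
% the engine
% - bin edges, one every two hours...
ts=d0+(0:10)/(.5*24); % <- D0 + 2hr * (0:10) [there will be slack!]
% - the round trip through DATESTR drops fractions of a second,
%   so a TD can land just short of an edge in TS (the slack)
td=datenum(ds); % <- DS converted back to DATENUMs
[tx,tn]=histc(td,ts); % <- TX: #obs per bin, TN: bin index per obs
% the result
disp([tx.';tn.']);
%{
2 2 2 2 2 0 0 0 0 0 0 % <- #obs of TD in TS
0 1 1 2 2 3 3 4 4 5 5 % <- index into TS
%}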

us
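
To connect this example to the original question, here is a hedged sketch that applies the same histc idea with 5-minute edges to data in the poster's layout and keeps the last price in each interval. The sample values and the names (day0, secs, edges, cnt, bin, t5, p5) are assumptions for illustration; keeping the last price per interval is one possible convention, not something the thread specifies.

```matlab
% Hypothetical sample in the poster's layout, sorted by time.
day0 = datenum(1998, 1, 2);
secs = [34301; 34303; 34303; 34323; 34331; 34700; 35000];  % seconds after midnight
t = day0 + secs/86400;
p = [97.3125; 97.3125; 97.3125; 97.4375; 97.4375; 97.5; 97.25];

% Edges every 5 minutes from 09:30 to 16:20, built from whole minutes.
mins = (9*60+30 : 5 : 16*60+20).';
edges = day0 + mins/1440;

% cnt(j): number of observations in interval j
% bin(k): interval holding observation k (0 if outside the edges)
[cnt, bin] = histc(t, edges);

% Pick the last observation inside each non-empty interval.
ok = bin > 0;
idx = find(ok);
[b, last] = unique(bin(ok), 'last');
t5 = edges(b);        % start time of each non-empty interval
p5 = p(idx(last));    % last price seen inside that interval
```

With this sample, the intervals starting at 09:30, 09:35 and 09:40 are non-empty. For a whole year, the same edges could be applied to the time of day, t - floor(t), with floor(t) used to separate trading days. Newer MATLAB releases recommend histcounts or discretize over histc, so the call may need adapting there.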
