From: Jean-Francois on 5 Mar 2010 12:05

Hi,

I'm stuck with the following problem:

I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. Hence, the output would be a 2000 by 1 vector.

So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is on the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60% of the run time); hence, even the slightest increase in efficiency would be very valuable.

Thanks for your suggestions.
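For reference, a minimal, self-contained version of the baseline described above (the data and threshold values here are illustrative, not from the post):

    R = rand(2000, 500);     % stand-in for one Monte Carlo draw
    k = 0.5;                 % illustrative threshold
    Q = sum(R < k, 2);       % 2000-by-1: per-row count of elements below k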
From: us on 5 Mar 2010 12:15

"Jean-Francois " <jkagy(a)princeton.edu> wrote in message <hmrdko$b6n$1(a)fred.mathworks.com>...
> I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. [...]

Before we get started, two important questions:
- does this run in a loop?
- does your k change, e.g., in the loop or according to some rule?

us
From: Walter Roberson on 5 Mar 2010 12:25

Jean-Francois wrote:
> So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of [...]

If you are looking for minor increases in efficiency, then consider building your R matrix sideways. Summing along columns is slightly faster than summing along rows, because MATLAB data is stored down columns rather than across rows.
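A minimal sketch of that idea, assuming the simulation can be arranged to fill the transposed matrix directly (the name Rt is not from the thread):

    Rt = R.';                % 500-by-2000; in practice, build it in this orientation
    Q = sum(Rt < k, 1).';    % count down the columns (MATLAB's storage order),
                             % then transpose back to a 2000-by-1 column vector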
From: Rune Allnor on 5 Mar 2010 13:57

On 5 Mar, 18:05, "Jean-Francois " <jk...(a)princeton.edu> wrote:
> I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. [...]

1) Make sure your vectors are stored sequentially in memory
2) MEX the routine
3) Make sure to optimize the compiler

Rune

*Very* rudimentary - no safeguards - C++ code:

#include "mex.h"
#include <stdint.h>

void mexFunction( int nlhs, mxArray *plhs[],
                  int nrhs, const mxArray *prhs[] )
{
    const size_t M(mxGetM(prhs[0]));          // number of rows
    const size_t N(mxGetN(prhs[0]));          // number of columns
    const double* const p(mxGetPr(prhs[0]));  // column-major data
    const double k((mxGetPr(prhs[1]))[0]);    // threshold

    // One count per row, returned as a uint32 column vector.
    plhs[0] = mxCreateNumericMatrix(M, 1, mxUINT32_CLASS, mxREAL);
    uint32_t* const d((uint32_t*)mxGetData(plhs[0]));

    // Walk the data in storage order (down each column) so that
    // memory accesses are sequential.
    for (size_t n = 0; n < N; ++n)
    {
        const double* const q(p + n*M);       // start of column n
        for (size_t m = 0; m < M; ++m)
        {
            if (q[m] < k) ++d[m];
        }
    }
}
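A hypothetical compile-and-call sequence for the routine above (the file name, and therefore the function name countbelow, is an assumption):

    mex countbelow.cpp               % compile the MEX file
    Q = double(countbelow(R, k));    % uint32 counts; cast back to double if needed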
From: James Tursa on 5 Mar 2010 14:13
"Jean-Francois " <jkagy(a)princeton.edu> wrote in message <hmrdko$b6n$1(a)fred.mathworks.com>... > Hi, > > I'm stuck with the following problem: > > I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I seek the numbers of elements that are below k. Hence, the output would be a 2000 by 1 vector. > > So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is in the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60%); hence, even the slightest increase in efficiency would be very valuable. > > Thanks for your suggestion. Here is a slightly faster version that does the summing as integers instead of floating point. My tests show a 20% - 25% speed improvement for this version. #include "mex.h" void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { mwSize i, j, m, n; mwSize *s; double *pr, *R; double k; m = mxGetM(prhs[0]); n = mxGetN(prhs[0]); R = mxGetPr(prhs[0]); k = mxGetScalar(prhs[1]); plhs[0] = mxCreateDoubleMatrix(m, 1, mxREAL); pr = mxGetPr(plhs[0]); s = mxCalloc(m, sizeof(*s)); for( j=0; j<n; j++ ) { for( i=0; i<m; i++ ) { if( (*R++) < k ) { ++s[i]; } } } for( i=0; i<m; i++ ) { pr[i] = s[i]; } mxFree(s); } James Tursa |