From: Jean-Francois on
Hi,

I'm stuck with the following problem:

I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. Hence, the output would be a 2000 by 1 vector.

So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is on the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60% of the total run time); hence, even the slightest increase in efficiency would be very valuable.
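
For concreteness, a minimal sketch of the setting (the data, threshold value, and iteration count below are made up for illustration):

R = rand(2000, 500);      % placeholder data
k = 0.5;                  % placeholder threshold
tic
for iter = 1:1000         % stand-in for the Monte Carlo loop
    Q = sum(R < k, 2);    % 2000-by-1 vector of per-row counts
end
toc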

Thanks for your suggestions.
From: us on
"Jean-Francois " <jkagy(a)princeton.edu> wrote in message <hmrdko$b6n$1(a)fred.mathworks.com>...
> Hi,
>
> I'm stuck with the following problem:
>
> I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. Hence, the output would be a 2000 by 1 vector.
>
> So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is on the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60% of the total run time); hence, even the slightest increase in efficiency would be very valuable.
>
> Thanks for your suggestions.

before we get started: two important questions...
- does this run in a loop?
- does your k change, e.g., in the loop or according to some rule?

us
From: Walter Roberson on
Jean-Francois wrote:

> I'm stuck with the following problem:
>
> I have a large matrix R, say of size 2000 by 500, and a threshold value
> k. For each row, I need the number of elements that are below k. Hence,
> the output would be a 2000 by 1 vector.
>
> So far, I've used "Q = SUM( R < k , 2 )", which proved to be much
> faster than everything else I could think of, e.g. HISTC or (obviously)
> SORT+DIFF. A single calculation is on the order of milliseconds, so it
> wouldn't seem like a big deal. However, in the context of a Monte Carlo
> simulation, that summation turns out to be a huge bottleneck (> 60% of
> the total run time); hence, even the slightest increase in efficiency
> would be very valuable.

If you are looking for minor increases in efficiency, then consider
building your R matrix sideways (transposed). Summing along columns is
slightly faster than summing along rows, because MATLAB stores data
column-major: the elements of a column sit next to each other in
memory, while the elements of a row do not.
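
For instance, a minimal sketch (Rt here is an explicit transposed copy;
if you can generate R transposed in the first place, that extra step
disappears):

Rt = R.';                 % 500-by-2000: rows of R become columns
Q  = sum(Rt < k, 1).';    % sum down the columns, flip back to 2000-by-1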
From: Rune Allnor on
On 5 Mar, 18:05, "Jean-Francois " <jk...(a)princeton.edu> wrote:
> Hi,
>
> I'm stuck with the following problem:
>
> I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. Hence, the output would be a 2000 by 1 vector.
>
> So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is on the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60% of the total run time); hence, even the slightest increase in efficiency would be very valuable.

1) Make sure your data is stored sequentially in memory
2) MEX the routine
3) Make sure compiler optimizations are enabled

Rune

*Very* rudimentary - no safeguards; not at all tested - C++ code:

#include "mex.h"

void mexFunction(
int nlhs,
mxArray *plhs[],
int nrhs,
const mxArray *prhs[]
)
{
const size_t M(mxGetM(prhs[0]));
const size_t N(mxGetN(prhs[0]));
const double* const p(mxGetPr(prhs[0]));
const double k((mxGetPr(prhs[1]))[0]);
plhs[0] = mxCreateNumericMatrix(N,1,mxUINT32_CLASS,mxREAL);
size_t* d((size_t*)mxGetData(plhs[0]));
for (size_t m = 0; m<M; ++m)
{
size_t n = 0;
double K = 0;
const double* const q(p+m*M);
while((n < N) && (K < k))
{
K+=q[n];
++n;
}
d[m]=--n;
}
}
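
A possible compile-and-call sequence (the file and function name
countbelow are my invention; -O asks MEX to enable compiler
optimization):

mex -O countbelow.cpp     % build with optimization enabled
Q = countbelow(R, k);     % drop-in for sum(R < k, 2); returns uint32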
From: James Tursa on
"Jean-Francois " <jkagy(a)princeton.edu> wrote in message <hmrdko$b6n$1(a)fred.mathworks.com>...
> Hi,
>
> I'm stuck with the following problem:
>
> I have a large matrix R, say of size 2000 by 500, and a threshold value k. For each row, I need the number of elements that are below k. Hence, the output would be a 2000 by 1 vector.
>
> So far, I've used "Q = SUM( R < k , 2 )", which proved to be much faster than everything else I could think of, e.g. HISTC or (obviously) SORT+DIFF. A single calculation is on the order of milliseconds, so it wouldn't seem like a big deal. However, in the context of a Monte Carlo simulation, that summation turns out to be a huge bottleneck (> 60% of the total run time); hence, even the slightest increase in efficiency would be very valuable.
>
> Thanks for your suggestions.

Here is a slightly faster version that does the summing as integers instead of floating point. My tests show a 20% - 25% speed improvement for this version.

#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize i, j, m, n;
mwSize *s;
double *pr, *R;
double k;

m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
R = mxGetPr(prhs[0]);
k = mxGetScalar(prhs[1]);
plhs[0] = mxCreateDoubleMatrix(m, 1, mxREAL);
pr = mxGetPr(plhs[0]);
s = mxCalloc(m, sizeof(*s));
for( j=0; j<n; j++ ) {
for( i=0; i<m; i++ ) {
if( (*R++) < k ) {
++s[i];
}
}
}
for( i=0; i<m; i++ ) {
pr[i] = s[i];
}
mxFree(s);
}
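
A quick sanity check against the original one-liner (assuming the code
above was compiled as countbelow_int.c; the name is arbitrary):

R = rand(2000, 500);
k = 0.5;
isequal(sum(R < k, 2), countbelow_int(R, k))   % should return 1 (true)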

James Tursa