From: Rune Allnor on
On 5 Mar, 20:13, "James Tursa"
<aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:


>     s = mxCalloc(m, sizeof(*s));
>     for( i=0; i<m; i++ ) {
>         pr[i] = s[i];
>     }
>     mxFree(s);

Those calloc/free calls are expensive. What about
letting s be a local scalar and move the result to
pr[i] after treating each row?

I wouldn't be surprised if you end up saving >> 50%
by doing that.

Rune
From: James Tursa on
Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <44ae8909-03ae-4219-a6b7-14d9329ebabf(a)z35g2000yqd.googlegroups.com>...
> On 5 Mar, 20:13, "James Tursa"
> <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
>
>
> >     s = mxCalloc(m, sizeof(*s));
> >     for( i=0; i<m; i++ ) {
> >         pr[i] = s[i];
> >     }
> >     mxFree(s);
>
> Those calloc/free calls are expensive. What about
> letting s be a local scalar and move the result to
> pr[i] after treating each row?
>
> I wouldn't be surprised if you end up saving >> 50%
> by doing that.
>
> Rune

I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...

James Tursa
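
For reference, the single-pass idea described above can be sketched in plain C. This is only an illustration, not the code from the earlier post: the function name count_below_rows_one_pass is hypothetical, standard calloc/free stand in for mxCalloc/mxFree so the sketch compiles outside MATLAB, and R is assumed to be an m-by-n matrix in column-major order (as MATLAB stores it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: for each row of a column-major m-by-n matrix R,
 * count the entries that are less than k.
 * R is read exactly once, in memory order; the per-row counters live in
 * a temporary array s of length m (the role the allocated s plays in
 * the quoted snippet).
 * Returns 0 on success, -1 if the temporary array cannot be allocated. */
int count_below_rows_one_pass(const double *R, size_t m, size_t n,
                              double k, double *out)
{
    size_t i, j;
    size_t *s;

    assert(R != NULL && out != NULL);

    s = calloc(m ? m : 1, sizeof *s);   /* zero-initialised counters */
    if (s == NULL) {
        return -1;
    }

    for (j = 0; j < n; j++) {           /* columns: outer loop */
        for (i = 0; i < m; i++) {       /* rows: contiguous in memory */
            if (R[j * m + i] < k) {
                ++s[i];
            }
        }
    }

    for (i = 0; i < m; i++) {           /* copy counts to the output */
        out[i] = (double)s[i];
    }
    free(s);
    return 0;
}
```

The trade-off is one allocation of m counters in exchange for touching each element of R once with unit stride, rather than making m strided passes over R.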
From: James Tursa on
"James Tursa" <aclassyguy_with_a_k_not_a_c(a)hotmail.com> wrote in message <hmrtu8$go7$1(a)fred.mathworks.com>...
> Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <44ae8909-03ae-4219-a6b7-14d9329ebabf(a)z35g2000yqd.googlegroups.com>...
> > On 5 Mar, 20:13, "James Tursa"
> > <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> >
> >
> > >     s = mxCalloc(m, sizeof(*s));
> > >     for( i=0; i<m; i++ ) {
> > >         pr[i] = s[i];
> > >     }
> > >     mxFree(s);
> >
> > Those calloc/free calls are expensive. What about
> > letting s be a local scalar and move the result to
> > pr[i] after treating each row?
> >
> > I wouldn't be surprised if you end up saving >> 50%
> > by doing that.
> >
> > Rune
>
> I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...
>
> James Tursa

Here is the result, about 70% slower than my previous post using an allocated s. This is about what I would have expected given the multiple traverses of R involved. All that redundant memory access just kills the running times.

#include "mex.h"

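/* For each row of prhs[0], count the elements less than the scalar prhs[1].
   A single scalar counter s is used, so R is walked once per row with
   stride m (m strided passes over the array in total). */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize i, j, m, n;
    mwSize s;               /* running count for the current row */
    double *pr, *R, *R0;
    double k;

    m = mxGetM(prhs[0]);
    n = mxGetN(prhs[0]);
    R0 = mxGetPr(prhs[0]);
    k = mxGetScalar(prhs[1]);
    plhs[0] = mxCreateDoubleMatrix(m, 1, mxREAL);
    pr = mxGetPr(plhs[0]);
    for( i=0; i<m; i++ ) {
        R = R0++;           /* first element of row i */
        s = 0;
        for( j=0; j<n; j++ ) {
            if( *R < k ) {
                ++s;
            }
            /* step to the next column, but never past the last element */
            if( j < n-1 ) R += m;
        }
        pr[i] = s;
    }
}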

James Tursa
From: Rune Allnor on
On 5 Mar, 22:56, "James Tursa"
<aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> "James Tursa" <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote in message <hmrtu8$go...(a)fred.mathworks.com>...
> > Rune Allnor <all...(a)tele.ntnu.no> wrote in message <44ae8909-03ae-4219-a6b7-14d9329eb...(a)z35g2000yqd.googlegroups.com>...
> > > On 5 Mar, 20:13, "James Tursa"
> > > <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
>
> > > >     s = mxCalloc(m, sizeof(*s));
> > > >     for( i=0; i<m; i++ ) {
> > > >         pr[i] = s[i];
> > > >     }
> > > >     mxFree(s);
>
> > > Those calloc/free calls are expensive. What about
> > > letting s be a local scalar and move the result to
> > > pr[i] after treating each row?
>
> > > I wouldn't be surprised if you end up saving >> 50%
> > > by doing that.
>
> > > Rune
>
> > I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...
>
> > James Tursa
>
> Here is the result, about 70% slower than my previous post using an allocated s. This is about what I would have expected given the multiple traverses of R involved. All that redundant memory access just kills the running times.

You traverse the rows? I didn't see that. I assumed
you traversed the columns. When traversing the columns,
one could use the local scalar variable, and there
would be no need to traverse the array more than
once.

Rune
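
The column-wise case described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the thread: the function name count_below_cols is hypothetical, and R is assumed to be an m-by-n matrix in column-major order, so each column is contiguous in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: for each column of a column-major m-by-n matrix R,
 * count the entries that are less than k.
 * Because a column is contiguous, one local scalar counter is enough and
 * R is still read only once, in memory order. No temporary array needed. */
void count_below_cols(const double *R, size_t m, size_t n,
                      double k, double *out)
{
    size_t i, j;

    assert(R != NULL && out != NULL);

    for (j = 0; j < n; j++) {
        size_t s = 0;                   /* local scalar counter */
        for (i = 0; i < m; i++) {
            if (R[j * m + i] < k) {
                ++s;
            }
        }
        out[j] = (double)s;             /* one result per column */
    }
}
```

Per-row counts of a matrix equal the per-column counts of its transpose, so this form only helps if the data can be supplied transposed.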
From: James Tursa on
Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <8f55e079-2b76-4694-842f-5f8d21da55af(a)q21g2000yqm.googlegroups.com>...
> On 5 Mar, 22:56, "James Tursa"
> <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> > "James Tursa" <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote in message <hmrtu8$go...(a)fred.mathworks.com>...
> > > Rune Allnor <all...(a)tele.ntnu.no> wrote in message <44ae8909-03ae-4219-a6b7-14d9329eb...(a)z35g2000yqd.googlegroups.com>...
> > > > On 5 Mar, 20:13, "James Tursa"
> > > > <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> >
> > > > >     s = mxCalloc(m, sizeof(*s));
> > > > >     for( i=0; i<m; i++ ) {
> > > > >         pr[i] = s[i];
> > > > >     }
> > > > >     mxFree(s);
> >
> > > > Those calloc/free calls are expensive. What about
> > > > letting s be a local scalar and move the result to
> > > > pr[i] after treating each row?
> >
> > > > I wouldn't be surprised if you end up saving >> 50%
> > > > by doing that.
> >
> > > > Rune
> >
> > > I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...
> >
> > > James Tursa
> >
> > Here is the result, about 70% slower than my previous post using an allocated s. This is about what I would have expected given the multiple traverses of R involved. All that redundant memory access just kills the running times.
>
> You traverse the rows? I didn't see that. I assumed
> you traversed the columns. When traversing the columns,
> one could use the local scalar variable, and there
> would be no need to traverse the array more than
> once.
>
> Rune

Yes. That issue was brought up earlier by another poster. If OP could rearrange his data as the transpose, then one could do what you suggest.

James Tursa