From: Rune Allnor on 5 Mar 2010 14:54

On 5 Mar, 20:13, "James Tursa" <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
>     s = mxCalloc(m, sizeof(*s));
>     for( i=0; i<m; i++ ) {
>         pr[i] = s[i];
>     }
>     mxFree(s);

Those calloc/free calls are expensive. What about letting s be a local scalar and moving the result to pr[i] after treating each row?

I wouldn't be surprised if you end up saving >> 50% by doing that.

Rune
From: James Tursa on 5 Mar 2010 16:43

Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <44ae8909-03ae-4219-a6b7-14d9329ebabf(a)z35g2000yqd.googlegroups.com>...
> On 5 Mar, 20:13, "James Tursa"
> <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> >     s = mxCalloc(m, sizeof(*s));
> >     for( i=0; i<m; i++ ) {
> >         pr[i] = s[i];
> >     }
> >     mxFree(s);
>
> Those calloc/free calls are expensive. What about
> letting s be a local scalar and move the result to
> pr[i] after treating each row?
>
> I wouldn't be surprised if you end up saving >> 50%
> by doing that.
>
> Rune

I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar, then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...

James Tursa
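For reference, the single-pass idea described above can be sketched in plain C, outside the MEX API. This is only a sketch under assumptions: the earlier post's full code is not quoted in this part of the thread, so the function name `count_below_single_pass`, its signature, and the use of `calloc` in place of `mxCalloc` are all hypothetical. It assumes R is an m-by-n matrix in column-major order (as MATLAB stores it) and that the goal is to count, per row, the elements below k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): for an m-by-n column-major
   matrix R, store in out[i] the number of elements of row i that are
   less than k. Returns 0 on success, -1 if the counters cannot be
   allocated. */
int count_below_single_pass(const double *R, size_t m, size_t n,
                            double k, double *out)
{
    size_t i, j;
    size_t *s;

    if (m == 0) return 0;               /* nothing to count */
    assert(R != NULL && out != NULL);

    /* One zero-initialised counter per row, like the s array in the thread. */
    s = (size_t *)calloc(m, sizeof(*s));
    if (s == NULL) return -1;

    /* Walk memory in storage order: column by column, contiguous reads. */
    for (j = 0; j < n; j++) {
        const double *col = R + j * m;  /* start of column j */
        for (i = 0; i < m; i++) {
            if (col[i] < k) ++s[i];
        }
    }

    /* Copy the integer counts into the double-valued output. */
    for (i = 0; i < m; i++) out[i] = (double)s[i];

    free(s);
    return 0;
}
```

The single allocation happens once per call, not once per row, so its cost is amortised over all m*n comparisons, while the inner loop reads R with unit stride.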
From: James Tursa on 5 Mar 2010 16:56

"James Tursa" <aclassyguy_with_a_k_not_a_c(a)hotmail.com> wrote in message <hmrtu8$go7$1(a)fred.mathworks.com>...
> I can try that. My thought was to traverse the R array only once, hence the s allocation. If I use only one scalar then I have to traverse the R array m times, which I expect to be slower than allocating s, but I haven't actually tried it yet. I will give it a shot ...
>
> James Tursa

Here is the result, about 70% slower than my previous post using an allocated s. This is about what I would have expected given the multiple traversals of R involved. All that redundant memory access just kills the running times.

    #include "mex.h"

    void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
    {
        mwSize i, j, m, n;
        mwSize s;                 /* scalar counter for the current row */
        double *pr, *R, *R0;
        double k;

        m = mxGetM(prhs[0]);
        n = mxGetN(prhs[0]);
        R0 = mxGetPr(prhs[0]);
        k = mxGetScalar(prhs[1]);
        plhs[0] = mxCreateDoubleMatrix(m, 1, mxREAL);
        pr = mxGetPr(plhs[0]);
        for( i=0; i<m; i++ ) {
            R = R0++;             /* first element of row i */
            s = 0;
            for( j=0; j<n; j++ ) {
                if( *R < k ) {
                    ++s;
                }
                /* stride m to the same row in the next column,
                   without stepping past the end of the array */
                if( j < n-1 ) R += m;
            }
            pr[i] = s;
        }
    }

James Tursa
From: Rune Allnor on 5 Mar 2010 17:07

On 5 Mar, 22:56, "James Tursa" <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:
> Here is the result, about 70% slower than my previous post using an allocated s. This is about what I would have expected given the multiple traverses of R involved. All that redundant memory access just kills the running times.

You traverse the rows? I didn't see that. I assumed you traversed the columns. When traversing the columns, one could use the local scalar variable, and there would be no need to traverse the array more than once.

Rune
From: James Tursa on 5 Mar 2010 18:17

Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <8f55e079-2b76-4694-842f-5f8d21da55af(a)q21g2000yqm.googlegroups.com>...
> You traverse the rows? I didn't see that. I assumed
> you traversed the columns. Traversing the columns one
> could use the scalar local variable at the same time
> there would be no need to traverse the array more than
> once.
>
> Rune

Yes. That issue was brought up earlier by another poster. If OP could rearrange his data as the transpose, then one could do what you suggest.

James Tursa
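To illustrate the transpose suggestion, here is a minimal sketch in plain C, again outside the MEX API. It assumes the data is already stored transposed, i.e. as an n-by-m column-major array Rt in which column i holds what was row i of the original matrix. The name `count_below_transposed` and its signature are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): Rt is the n-by-m column-major
   transpose of the original m-by-n matrix, so each original row is one
   contiguous column of Rt. Stores in out[i] the number of elements of
   original row i that are less than k. */
void count_below_transposed(const double *Rt, size_t n, size_t m,
                            double k, double *out)
{
    size_t i, j;

    if (m == 0) return;                    /* nothing to count */
    assert(Rt != NULL && out != NULL);

    for (i = 0; i < m; i++) {
        const double *col = Rt + i * n;    /* contiguous data for row i */
        size_t s = 0;                      /* local scalar counter */
        for (j = 0; j < n; j++) {
            if (col[j] < k) ++s;
        }
        out[i] = (double)s;
    }
}
```

With this layout the array is read exactly once with unit stride, and no temporary counter array is needed. Whether that beats the allocated-s version would have to be measured, and the cost of producing the transpose in the first place is not accounted for here.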