From: trave Traverso on
Hi All,
I need to create lots of files, which I subsequently need to open, do some calculations on, and save back. Essentially I would like to improve the speed of the following for loop:
clear all
tstart = tic;
N = 100000;
P = 10000;
for k = 1 : P
    vecID = k;
    % calc
    out = rand(N,1);
    % output files
    string = ['.\temp\out' int2str(vecID) '.bin'];
    fid = fopen(string, 'w');
    fwrite(fid,out,'float32');
    fclose(fid);
    clear out
end
toc(tstart)
It takes 390 sec overall and 244 sec for the fwrite alone. As I need to do this several times I really need to speed the fwrite somehow. Can you help anyhow?

Best wishes

Luca
From: Rune Allnor on
On 17 apr, 12:54, "trave Traverso" <trav...(a)gmail.com> wrote:
> It takes 390 sec overall and 244 sec for the fwrite alone. As I need to do this several times I really need to speed the fwrite somehow. Can you help anyhow?

Accessing the file system *is* slow. The only way to improve speed
is to reduce the number of file accesses. Instead of writing a lot
of small data batches, write only a few large ones. Reduce the number
of openings and closings of files. Make sure you write to a local
fast disk and not across slow networks or USB links.

Those kinds of things.
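
As a rough illustration of "a few large files instead of many small ones", here is a minimal sketch that writes all vectors back to back into one file and later reads a single vector by seeking to its byte offset. The file name out_all.bin, the use of tempdir, and the small N and P are assumptions chosen for a quick trial, not part of the original post, and whether this is faster on a given system has to be measured.

```matlab
% Sketch: one output file instead of P separate files.
N = 1000;   % vector length (reduced from the original 100000 for a quick trial)
P = 50;     % number of vectors (reduced from the original 10000)
fname = fullfile(tempdir, 'out_all.bin');   % hypothetical file name

fid = fopen(fname, 'w');
if fid == -1
    error('Could not open %s for writing', fname);
end
for k = 1:P
    out = rand(N, 1);              % calc
    fwrite(fid, out, 'float32');   % vectors are stored back to back
end
fclose(fid);                       % one open and one close in total

% Read vector number k back later by seeking to its byte offset.
k = 7;
fid = fopen(fname, 'r');
fseek(fid, (k - 1) * N * 4, 'bof');   % 4 bytes per float32 value
vec = fread(fid, N, 'float32');       % returned as double by default
fclose(fid);
```

Because every vector has the same length, the offset of vector k is simply (k-1)*N*4 bytes, so individual vectors can still be opened, processed, and written back in place without needing one file per vector.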

One thing that might have some effect is to convert the vector
'out' to single precision up front (single(out) in MATLAB; MATLAB's
own typecast function reinterprets bytes and is not what is meant
here). That way you halve the number of bytes to shuffle, and FWRITE
need not (or at least ought not have to) worry about converting the
data to the specified binary format.
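
A minimal sketch of that idea, assuming the computed data can live in single precision from the start. The file name out_single.bin and the use of tempdir are assumptions for illustration, and no timing gain is claimed here; it would need to be measured.

```matlab
% Sketch: hold the data as single before handing it to fwrite.
N = 100000;
out = single(rand(N, 1));   % 4 bytes per value instead of 8

fname = fullfile(tempdir, 'out_single.bin');   % hypothetical file name
fid = fopen(fname, 'w');
if fid == -1
    error('Could not open %s for writing', fname);
end
count = fwrite(fid, out, 'float32');   % data already matches the on-disk format
fclose(fid);
```

The file contents are the same as in the original loop (N float32 values); only the in-memory representation passed to fwrite changes.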

Rune
From: trave Traverso on

thanks Rune,
I will have a look into your suggestions. Just one thing, do you think that creating a fortran or C MEX function for opening and writing to file would improve performance? I was considering going down this path but I am not sure it is actually worth doing.
Thanks again for your help,
Luca
From: Rune Allnor on
On 17 apr, 17:58, "trave Traverso" <trav...(a)gmail.com> wrote:
> Just one thing, do you think that creating a fortran or C MEX function for opening and writing to file would improve performance?

Not as far as file IO is concerned, no. The problem is the hardware
bottlenecks, which are the same for every program on any given
system. MEX only affects what goes on inside the program.
With the example you posted, I doubt MEXing will have any
measurable effect.
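
One way to check where the time actually goes before investing in MEX is to time fopen, fwrite and fclose separately. The sketch below is only an illustration: the directory name fwrite_timing under tempdir and the reduced P are assumptions, and the numbers it prints will depend entirely on the disk and operating system.

```matlab
% Sketch: accumulate the time spent in each file operation.
N = 100000;
P = 20;                                         % reduced from 10000 for a quick trial
outDir = fullfile(tempdir, 'fwrite_timing');    % hypothetical output directory
if ~exist(outDir, 'dir')
    mkdir(outDir);
end

tOpen = 0; tWrite = 0; tClose = 0;
for k = 1:P
    out = rand(N, 1);
    fname = fullfile(outDir, ['out' int2str(k) '.bin']);

    t = tic; fid = fopen(fname, 'w');       tOpen  = tOpen  + toc(t);
    t = tic; fwrite(fid, out, 'float32');   tWrite = tWrite + toc(t);
    t = tic; fclose(fid);                   tClose = tClose + toc(t);
end
fprintf('fopen %.3f s, fwrite %.3f s, fclose %.3f s\n', tOpen, tWrite, tClose);
```

If most of the time is spent inside these three calls rather than in the surrounding MATLAB code, moving the loop into a MEX file would leave that part unchanged.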

Rune