From: Rune Allnor on 3 Feb 2010 09:38

On 3 Feb, 00:02, Luna Moon <lunamoonm...(a)gmail.com> wrote:
> Hi all,
>
> I have a big Excel file that I need to read into Matlab.
>
> It has 40 sheets, and iteratively for each sheet, I need to read in a
> bunch of stuff as the following:
>
> [tmp cellstrDates]=xlsread1(strDataFileName, strCurrentSheet,
> 'B23:B65535');
>
> Because the content on the sheets will keep growing, there is no
> way for me to know how many rows there are on the sheets (each sheet
> may contain a different number of rows). That's why I have to put 65535
> above.
>
> As you can see, I've also tried "xlsread1" from the MathWorks File
> Exchange; it didn't speed things up much for me.
>
> The whole process takes tens of hours for me...

Reduce the number of file accesses. Load everything up in memory *once*, do *all* the processing with the data in memory, and only then write the results back to file, *once*.

There are several things to consider:

1) File access is *slow*. A lot slower than RAM access.
2) Changing even the tiniest detail in a file causes *all* of the file to be copied (i.e. read and written). So if you keep changing individual details, and write the changed details back to file every time, you cause insane numbers of file copies to be made.
3) Even as files go, MSOffice / Excel file access is ridiculously slow. To the extent I have used it, handling an MSOffice file through the COM interface takes as many minutes as other programs would use seconds.

Rune
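
A minimal sketch of the "open once, read everything, close once" advice, not a tested drop-in replacement. It assumes Windows with Excel installed, and it talks to Excel directly through actxserver rather than through xlsread1. The file path and the cellData variable are hypothetical; the B23 start cell is taken from the question. The last row is taken from each sheet's UsedRange, so the fixed B65535 upper bound should not be needed.

```matlab
% Assumption: Windows + Excel installed. The path below is hypothetical.
strDataFileName = 'C:\data\mydata.xls';    % COM needs the full path

hExcel = actxserver('Excel.Application');  % start ONE Excel COM server
hWorkbook = hExcel.Workbooks.Open(strDataFileName);  % open the file ONCE

try
    nSheets  = hWorkbook.Sheets.Count;
    cellData = cell(nSheets, 1);           % one entry per sheet

    for k = 1:nSheets
        hSheet = hWorkbook.Sheets.Item(k);

        % Last row that Excel considers in use on this sheet
        hUsed   = hSheet.UsedRange;
        lastRow = hUsed.Row + hUsed.Rows.Count - 1;

        if lastRow >= 23
            % Read column B from row 23 down to the last used row.
            % Value is a cell array for a multi-cell range and a
            % plain value when the range is a single cell.
            hRange      = hSheet.Range(sprintf('B23:B%d', lastRow));
            cellData{k} = hRange.Value;
        end
    end
catch err
    % Make sure the Excel process does not linger if something fails
    hWorkbook.Close(false);
    hExcel.Quit;
    delete(hExcel);
    rethrow(err);
end

hWorkbook.Close(false);   % close without saving
hExcel.Quit;              % shut Excel down ONCE
delete(hExcel);
```

All further processing would then work on cellData in memory. If results have to go back into the workbook, the same idea applies: collect them first and write them in one pass.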