From: webmasterATflymagnetic.com on 24 Jun 2010 06:29 Hi, I use the following function to return a text file as a list: (defun read-file (file) (with-open-file (stream file) (do ((retval) (line (read-line stream nil) (read-line stream nil))) ((null line) retval) (setf retval (append retval (list line)))))) I have three files I've tested this on: Name Size Lines file1 9,223 214 file2 2,116,283 31,342 file3 5,648,093 105,311 [10]> (time (progn (read-file "file1") nil)) Real time: 0.010914 sec. Run time: 0.012001 sec. Space: 201712 Bytes GC: 1, GC time: 0.008001 sec. [11]> (time (progn (read-file "file2") nil)) Real time: 41.51742 sec. Run time: 41.062565 sec. Space: 3931805312 Bytes GC: 2268, GC time: 29.117828 sec. NIL [12]> (time (progn (read-file "file3") nil)) Real time: 610.77686 sec. Run time: 595.9412 sec. Space: 44368777352 Bytes GC: 23409, GC time: 443.23578 sec. NIL So the largest file, with 105,000 lines (5.6M) took nearly 10 minutes. I accept that there is some work to do to get it into a list format, but clearly the way I have approached this must be wrong. This sort of performance is abysmal. I would be interested in people's views on how I should be doing this, and comments as to why what I am seeing is so poor. Yes it works, and as I'm still relatively new to Lisp this is the way I would see Lisp interacting with the outside world -- ie get the data into a list and then it can be worked on using the normal Lisp functions. I strongly suspect I must be missing something, but can't see what it is. My code will need to work with files sometimes much larger than 5MB, but clearly this approach will not work. So any advice is very much appreciated. Phil
From: szabolcs.szucs on 24 Jun 2010 06:44 Append is the key. Because of you collecting lines into a list which is a simply linked one (afaik) when appending to the end, you have to iterate over the list every time. This makes the solution O(n^2).. A somewhat better solution is to (cons line retval) and return (reverse retval) which makes stuff linear (O(n)), although consider using arrays. Cheers, Sz On Jun 24, 12:29 pm, "webmasterATflymagnetic.com" <webmas...(a)flymagnetic.com> wrote: > Hi, > > I use the following function to return a text file as a list: > > (defun read-file (file) > (with-open-file (stream file) > (do ((retval) > (line (read-line stream nil) > (read-line stream nil))) > ((null line) retval) > (setf retval (append retval (list line)))))) > > I have three files I've tested this on: > > Name Size Lines > file1 9,223 214 > file2 2,116,283 31,342 > file3 5,648,093 105,311 > > [10]> (time (progn (read-file "file1") nil)) > Real time: 0.010914 sec. > Run time: 0.012001 sec. > Space: 201712 Bytes > GC: 1, GC time: 0.008001 sec. > [11]> (time (progn (read-file "file2") nil)) > Real time: 41.51742 sec. > Run time: 41.062565 sec. > Space: 3931805312 Bytes > GC: 2268, GC time: 29.117828 sec. > NIL > [12]> (time (progn (read-file "file3") nil)) > Real time: 610.77686 sec. > Run time: 595.9412 sec. > Space: 44368777352 Bytes > GC: 23409, GC time: 443.23578 sec. > NIL > > So the largest file, with 105,000 lines (5.6M) took nearly 10 minutes. > I accept that there is some work to do to get it into a list format, > but clearly the way I have approached this must be wrong. This sort of > performance is abysmal. > > I would be interested in people's views on how I should be doing this, > and comments as to why what I am seeing is so poor. > > Yes it works, and as I'm still relatively new to Lisp this is the way > I would see Lisp interacting with the outside world -- ie get the data > into a list and then it can be worked on using the normal Lisp > functions. I strongly suspect I must be missing something, but can't > see what it is. My code will need to work with files sometimes much > larger than 5MB, but clearly this approach will not work. > > So any advice is very much appreciated. > > Phil
From: Zach Beane on 24 Jun 2010 06:46 "webmasterATflymagnetic.com" <webmaster(a)flymagnetic.com> writes: > Hi, > > I use the following function to return a text file as a list: > > (defun read-file (file) > (with-open-file (stream file) > (do ((retval) > (line (read-line stream nil) > (read-line stream nil))) > ((null line) retval) > (setf retval (append retval (list line)))))) > > I have three files I've tested this on: > > Name Size Lines > file1 9,223 214 > file2 2,116,283 31,342 > file3 5,648,093 105,311 > > [10]> (time (progn (read-file "file1") nil)) > Real time: 0.010914 sec. > Run time: 0.012001 sec. > Space: 201712 Bytes > GC: 1, GC time: 0.008001 sec. > [11]> (time (progn (read-file "file2") nil)) > Real time: 41.51742 sec. > Run time: 41.062565 sec. > Space: 3931805312 Bytes > GC: 2268, GC time: 29.117828 sec. > NIL > [12]> (time (progn (read-file "file3") nil)) > Real time: 610.77686 sec. > Run time: 595.9412 sec. > Space: 44368777352 Bytes > GC: 23409, GC time: 443.23578 sec. > NIL > > So the largest file, with 105,000 lines (5.6M) took nearly 10 minutes. > I accept that there is some work to do to get it into a list format, > but clearly the way I have approached this must be wrong. This sort of > performance is abysmal. > > I would be interested in people's views on how I should be doing this, > and comments as to why what I am seeing is so poor. That output looks like GNU CLISP. Did you compile the function before timing it? Zach
From: Teemu Likonen on 24 Jun 2010 06:55 * 2010-06-24 03:29 (-0700), webmasterATflymagnetic com wrote: > (defun read-file (file) > (with-open-file (stream file) > (do ((retval) > (line (read-line stream nil) > (read-line stream nil))) > ((null line) retval) > (setf retval (append retval (list line)))))) It's _much_ faster to add items to the front of list. So, instead of (setf retval (append retval (list line))) You should do (push line retval) and then reverse the list when exiting the loop: (nreverse retval) The function could look like this: (defun read-file (file) (with-open-file (stream file) (do (retval (line (read-line stream nil) (read-line stream nil))) ((null line) (nreverse retval)) (push line retval)))) Although I usually prefer the LOOP macro - (defun read-file (file) (with-open-file (stream file) (loop for line = (read-line stream nil) while line collect line))) - which automatically uses fast methods for building lists.
From: Rainer Joswig on 24 Jun 2010 07:05 "webmasterATflymagnetic.com" <webmaster(a)flymagnetic.com> wrote: > Hi, .... > (setf retval (append retval (list line)))))) > one of the most common mistakes when working with singly linked lists made of cons cells is to append to the end of the list - in loops and recursive calls. The usual approaches to work around this: - use a data structure that does not have this problem - cons to the front of the list and at after the loop reverse the list once Or you can use also LOOP instead of DO: (loop for line = (read-line stream nil nil) while line collect line)
|
Next
|
Last
Pages: 1 2 3 4 Prev: hunchentoot doesn't show a picture Next: Emacs Form Feed (^L) Display Suggestion and Tips |