a potential lisp convert, and interpreting the shootout [Lisp]

Prev: Pocket Lisp Machine
Next: Next Generation of Language

From: Henry Bigelow on 29 Sep 2006 13:36

hi paul,
i took a look at SPEC--it seems definitive, indeed, but for comparing
computer systems, not languages. is that right? would you be able to
point me in a direction to compare languages? i have a hunch that
there is something very wrong with the lisp benchmark which is about
100 times more memory/ or slower than a c program, but it'd be very
difficult for me to rewrite it more efficiently.

it's possible that for various reasons, people don't care enough about
the shootout

http://shootout.alioth.debian.org/

to fix the benchmarks. or, it's possible that the lisp ones are indeed
written to be as memory/speed efficient as possible.

what do you think is going on here?

thanks,

henry

> This is the first time I've heard of this particular competition (and
> set of benchmarks) -- it looks rather cool! I doubt it has the same
> clout as SPEC though... :-)
>
> Hey, while you are at learning lisp, why not debug that 120x memory
> program to see why this happens? profile and time are your friends...
> ;-)

you're overestimating my skill here!

>
> CL-USER>CL-USER> (time (main 5000000))
> 10000000
> Evaluation took:
> 34.466 seconds of real time
> 12.708794 seconds of user run time
> 21.369335 seconds of system run time
> [Run times include 0.044 seconds GC run time.]
> 0 page faults and
> 79,975,616 bytes consed.
> NIL
> CL-USER>

From: Javier on 29 Sep 2006 17:14

Henry Bigelow ha escrito:

> hi paul,
> i took a look at SPEC--it seems definitive, indeed, but for comparing
> computer systems, not languages. is that right? would you be able to
> point me in a direction to compare languages? i have a hunch that
> there is something very wrong with the lisp benchmark which is about
> 100 times more memory/ or slower than a c program, but it'd be very
> difficult for me to rewrite it more efficiently.
>
> it's possible that for various reasons, people don't care enough about
> the shootout
>
> http://shootout.alioth.debian.org/
>
> to fix the benchmarks. or, it's possible that the lisp ones are indeed
> written to be as memory/speed efficient as possible.
>
> what do you think is going on here?

We have already said that:

1) Lisp loads the entire compiler/debugger/documentation among with the
program. As you can imagine, it is impossible to compite to C in memory
ussage in this situation (imagine that you would have to load GCC with
your C program, it would be much bigger, don't think so?). But please,
take also in count that, while in a program written in C it does not
count the memory shared with the C library (it is linked dinamically),
in Lisp you have the Lisp library loaded with you, so the C library is
already consuming memory from your system, but the program which you
use to see that memory doesn't take that in count.
2) It is not 100x slower. Please take a look at:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=sbcl&lang2=gcc
That demonstrates that, except for some bechmarks in which algorithms
differ a lot, SBCL is almost as fast as C.
3) SBCL is not the only implementation. There are other implementations
which even run faster, and which consumes less memory.

From: Henry Bigelow on 29 Sep 2006 20:51

Javier wrote:

>
> We have already said that:
>
> 1) Lisp loads the entire compiler/debugger/documentation among with the
> program.

i didn't know that. but, if so, why aren't all the other benchmarks
100x more memory? k-nucleotide is only 3.1x more memory.

As you can imagine, it is impossible to compite to C in memory
> ussage in this situation (imagine that you would have to load GCC with
> your C program, it would be much bigger, don't think so?). But please,
> take also in count that, while in a program written in C it does not
> count the memory shared with the C library (it is linked dinamically),
> in Lisp you have the Lisp library loaded with you, so the C library is
> already consuming memory from your system, but the program which you
> use to see that memory doesn't take that in count.

> 2) It is not 100x slower. Please take a look at:
> http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=sbcl&lang2=gcc
> That demonstrates that, except for some bechmarks in which algorithms
> differ a lot, SBCL is almost as fast as C.

agreed. fannkuch is 27x slower, and chameneos requires 120x more
memory. i'm not contesting anything besides 'fannkuch' for speed. i'm
not really concerned about 'startup' because it's amortized.

> 3) SBCL is not the only implementation. There are other implementations
> which even run faster, and which consumes less memory.

yes, but the use of SBCL here is obviously not why the fannkuch
benchmark is 26 times slower, instead of 3 or 5 times slower.

thanks in advance,

henry

From: Javier on 29 Sep 2006 21:26

Henry Bigelow ha escrito:

> Javier wrote:
>
> >
> > We have already said that:
> >
> > 1) Lisp loads the entire compiler/debugger/documentation among with the
> > program.
>
> i didn't know that. but, if so, why aren't all the other benchmarks
> 100x more memory? k-nucleotide is only 3.1x more memory.

Because that algorithm uses a lot of memory, even in C.
If you don't believe me, download and install SBCL. Start it. Only
starting it, requires about 6Mb.
Now define a function:

(defun jj () (format t "hello"))

Now it requires about 11Mb, which means that the compiler has been
loaded and compiled your function in real time. If you start to do
things, like using the debugger, the documentation, extra libraries,
etc, your application can grow in memory very fast.
A similar thing happens with Java.
In C, you are not compiling code in real time, and your program uses
the libraries on your system, so this makes the ilussion that your C
program uses very little memory. But it probably no. What is happening
is that your C program uses libraries that are already loaded in memory
(for example stdlib).
You can see that this is always happening even in C. Take for example
Mozilla, Open Office, the X server, GCC... all of them are using a lot
of memory because they need libraries which are not already dinamically
loaded.

> > 2) It is not 100x slower. Please take a look at:
> > http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=sbcl&lang2=gcc
> > That demonstrates that, except for some bechmarks in which algorithms
> > differ a lot, SBCL is almost as fast as C.
>
> agreed. fannkuch is 27x slower, and chameneos requires 120x more
> memory. i'm not contesting anything besides 'fannkuch' for speed. i'm
> not really concerned about 'startup' because it's amortized.

A good measure of how well is SBCL doing is disassembling:

(disassemble 'jj)

; 1160AD7A: BA27000008 MOV EDX, 134217767 ;
no-arg-parsing entry point
; 7F: 8B3D14AD6011 MOV EDI, [#x1160AD14] ; "hello"
; 85: 8B0518AD6011 MOV EAX, [#x1160AD18] ;
#<FDEFINITION object for FORMAT>
; 8B: B908000000 MOV ECX, 8
; 90: FF75F8 PUSH DWORD PTR [EBP-8]
; 93: FF6005 JMP DWORD PTR [EAX+5]
; 96: 90 NOP
; 97: 90 NOP
; 98: 0F0B0A BREAK 10 ; error
trap
; 9B: 02 BYTE #X02
; 9C: 18 BYTE #X18 ;
INVALID-ARG-COUNT-ERROR
; 9D: 4D BYTE #X4D ; ECX
; 9E: 0F0B0A BREAK 10 ; error
trap
; A1: 02 BYTE #X02
; A2: 18 BYTE #X18 ;
INVALID-ARG-COUNT-ERROR
; A3: 4D BYTE #X4D ; ECX

If you understand some asemssembler you can see that SBCL is actually
very clever compiling things. :-)
(For example, it is using a jump in line 93 instead of function call to
avoid some overhead. Not all C compilers are able to do so.)

From: Wade Humeniuk on 30 Sep 2006 03:12

Speaking of which. By modifying the original somewhat (I do not have SBCL)
I do not have the patience to code it like C, so....

(defvar *fannkuch-prints* nil)
(defvar *fannkuch-max-flips* 0)

(declaim (inline flip make-fannkuch init-fannkuch print-fannkuch
copy-fannkuch permute-fannkuch count-flips))

(defun make-fannkuch (n)
(make-array n :element-type '(unsigned-byte 8)))

(defun init-fannkuch (fannkuch end)
(loop for n from 1
for i from 0 below end
do (setf (aref fannkuch i) n))
fannkuch)

(defun copy-fannkuch (from to start end)
(declare (type (simple-array (unsigned-byte 8) (*)) from to)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(loop for i from start below end
do (setf (aref to i) (aref from i))))

(defun permute-fannkuch (fannkuch n)
(declare (type (simple-array (unsigned-byte 8) (*)) fannkuch)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(let ((first (aref fannkuch 0)))
(loop for i from 0 below (1- n)
do (setf (aref fannkuch i) (aref fannkuch (1+ i))))
(setf (aref fannkuch (1- n)) first)
fannkuch))

(defun print-fannkuch (permutation)
(declare (type (simple-array (unsigned-byte 8) (*)) permutation)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(when (< *fannkuch-prints* 30)
(loop for i across permutation do
(princ i))
(terpri)
(incf *fannkuch-prints*)))

(defun flip (permutation)
(declare (type (simple-array (unsigned-byte 8) (*)) permutation)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(let ((n (aref permutation 0)))
(when (> n 1)
(loop for start from 0
for end downfrom (1- n)
while (< start end) do
(rotatef (aref permutation start)
(aref permutation end))))))

(defun count-flips (permutation)
(declare (type (simple-array (unsigned-byte 8) (*)) permutation)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(let ((flips 0))
(loop until (= (aref permutation 0) 1)
do (flip permutation)
(incf flips))
(setf *fannkuch-max-flips* (max flips *fannkuch-max-flips*))
flips))

(defun fanndance (fannkuch len &aux (first (first fannkuch)) (next (second fannkuch)))
(declare (type (simple-array (unsigned-byte 8) (*)) first next)
(type fixnum len)
(optimize (speed 3) (safety 0) (debug 0) #+lispworks(fixnum-safety 0)))
(copy-fannkuch first next 0 (length first))
(if (= len 1)
(progn (count-flips next) (print-fannkuch first))
(progn
(dotimes (i (1- len))
(fanndance (cdr fannkuch) (1- len))
(permute-fannkuch next len))
(fanndance (cdr fannkuch) (1- len)))))

(defun fannkuch (n)
(let ((*fannkuch-prints* 0) (*fannkuch-max-flips* 0)
(fannkuch (loop repeat (1+ n) collect (make-fannkuch n))))
(init-fannkuch (car fannkuch) n)
(format t "~A"
(with-output-to-string (*standard-output*)
(fanndance fannkuch n)))
(format t "Pfannkuchen(~D) = ~D" n *fannkuch-max-flips*)))

#+ignore(defun main ()
(fannkuch (parse-integer (second *posix-argv*))))

CL-USER 49 > (time (fannkuch 10))
Timing the evaluation of (FANNKUCH 10)
12345678910
21345678910
23145678910
32145678910
31245678910
13245678910
23415678910
32415678910
34215678910
43215678910
42315678910
24315678910
34125678910
43125678910
41325678910
14325678910
13425678910
31425678910
41235678910
14235678910
12435678910
21435678910
24135678910
42135678910
23451678910
32451678910
34251678910
43251678910
42351678910
24351678910
Pfannkuchen(10) = 38
user time = 2.874
system time = 0.000
Elapsed time = 0:00:03
Allocation = 3792 bytes standard / 4554 bytes conses
0 Page faults
Calls to %EVAL 34
NIL

CL-USER 50 >

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Prev: Pocket Lisp Machine
Next: Next Generation of Language