From: Helmut Eller on 11 Mar 2010 06:43 * Nicolas Neuss [2010-03-11 11:02+0100] writes: > But before starting premature optimization we should really have > "test.txt" so that we can compare the output and the CPU time with what > was written before: I just put abcdefghijklmnopqrstuvwxyz in the file (no trailing newline) and got these timings: SBCL 1.0.35.8 : 81 seconds Allegro 8.2 : 194 seconds Gforth 0.7.0 : 29 seconds iForth 4.0.11 : 6 seconds Gforth is an interpreter and iForth is a native compiler. All running on a 32bit x86. Helmut
From: Nicolas Neuss on 11 Mar 2010 09:55 Helmut Eller <eller.helmut(a)gmail.com> writes: > * Nicolas Neuss [2010-03-11 11:02+0100] writes: > >> But before starting premature optimization we should really have >> "test.txt" so that we can compare the output and the CPU time with what >> was written before: > > I just put abcdefghijklmnopqrstuvwxyz in the file (no trailing newline) > and got these timings: > > SBCL 1.0.35.8 : 81 seconds > Allegro 8.2 : 194 seconds > > Gforth 0.7.0 : 29 seconds > iForth 4.0.11 : 6 seconds > > Gforth is an interpreter and iForth is a native compiler. All running > on a 32bit x86. > > Helmut OK, I have invested some time and inserted some type and optimization declarations. I have reduced the runtime by more than a factor 4 on my machine (from 67.5 seconds to about 15.4 seconds). The largest remaining problem is probably in (defconstant +rnd-unity+ 4294967291) (defconstant +rnd-mult+ 3961633963) (let ((seed 0)) (declare (type (unsigned-byte 32) seed)) (defun set-seed (s) (setf seed s)) (defun prng () (declare (values (integer 0 #.+rnd-unity+)) (optimize speed)) (setf seed (mod (* +rnd-mult+ seed) +rnd-unity+))) (defun rnd-byte () (declare (values (integer 0 255)) (optimize speed)) (setf seed (mod (* +rnd-mult+ seed) +rnd-unity+)) (ldb (byte 8 24) seed)) ) Here, I had to inline the call to (prng) in (rnd-byte) by hand because SBCL refused to inline here ("hairy lexical environment"), and also (rnd-byte) cannot be inlined because of the same reason. Also, there is still an optimization note for the multiplication. Attached is my current version. Nicolas (defpackage :lc53 (:use cl)) (in-package :lc53) (declaim (optimize debug)) (defconstant +rnd-unity+ 4294967291) (defconstant +rnd-mult+ 3961633963) (let ((seed 0)) (declare (type (unsigned-byte 32) seed)) (defun set-seed (s) (setf seed s)) (defun prng () (declare (values (integer 0 #.+rnd-unity+)) (optimize speed)) (setf seed (mod (* +rnd-mult+ seed) +rnd-unity+))) (defun rnd-byte () (declare (values (integer 0 255)) (optimize speed)) (setf seed (mod (* +rnd-mult+ seed) +rnd-unity+)) (ldb (byte 8 24) seed)) ) (defmacro with-seed ((s) &body body) `(let () (set-seed ,s) ,@body)) (assert (with-seed (10) (= (prng) 961634011))) (defun set-passkey () (set-seed 123456789)) (assert (with-seed (10) (= (rnd-byte)) 57)) (defun crypt (text) (loop for byte across text and i from 0 do (setf (aref text i) (logxor byte (rnd-byte)))) text) (assert (with-seed (10) (equal (map 'string #'code-char (crypt (map 'vector #'char-code "abc"))) "XSn"))) (defconstant +word+ (ash 1 31)) (defconstant +byte+ (ash 1 7)) (defun string-to-simple-array (string) (map '(simple-array (unsigned-byte 8) (*)) #'char-code string)) (defvar *sample* (string-to-simple-array "abc")) (defun how-far (seed) (declare (optimize speed)) (with-seed (seed) (let ((count 0) (text *sample*)) (declare (type (simple-array (unsigned-byte 8) (*)) text)) (declare (type (integer 0 #.(1- most-positive-fixnum)) count)) (loop for byte across text do (cond ((zerop (logand (logxor byte (rnd-byte)) +byte+)) (incf count)) (t (return count))))))) (assert (let ((*sample* (string-to-simple-array "abc"))) (null (how-far 10)))) (defun <find-seed> (upper-count seed mask) (declare (optimize speed) (type (integer 0 #.+word+) mask) (type (unsigned-byte 32) seed) (type fixnum upper-count)) (labels ((last-mask () mask) (next-mask () (ash mask -1)) (next-seed () (logior seed mask)) (good-seed () (next-seed)) (last-seed () seed)) (declare (inline last-mask next-mask next-seed good-seed last-seed)) (unless (zerop (last-mask)) (let ((lower-count (how-far (next-seed)))) (declare (type (or null fixnum) lower-count)) (cond ((not lower-count) (throw 'found (good-seed))) ((< upper-count lower-count) (<find-seed> lower-count (next-seed) (next-mask)) (<find-seed> upper-count (last-seed) (next-mask))) (t (<find-seed> upper-count (last-seed) (next-mask)) (<find-seed> lower-count (next-seed) (next-mask)))))))) (defun find-seed () (catch 'found (<find-seed> 0 0 +word+))) (defun test-file () "test.txt") (defun load-file () (with-open-file (in (test-file) :element-type '(unsigned-byte 8)) (let ((text (make-array (file-length in) :element-type '(unsigned-byte 8)))) (read-sequence text in) (setq *sample* text)))) (defun show-file () (loop for byte across *sample* do ;;(format t " ~2,'0x" 10) (write-char (code-char byte)))) (defun crypt-file () (set-passkey) (crypt *sample*)) (defun test () (load-file) (show-file) (terpri) (crypt-file) (set-seed 0) (format t "~%SAMPLE is encrypted and seed is zeroed; commencing cracking...") (let ((seed (time (find-seed)))) (cond (seed (set-seed seed) (crypt *sample*) (format t "~%The seed is: ~d~%The file is: ~%" seed) (show-file) (terpri)) (t (format t "~%No valid seed found~%"))))) ;; (test)
From: Nicolas Neuss on 11 Mar 2010 10:18 Helmut Eller <eller.helmut(a)gmail.com> writes: > * Nicolas Neuss [2010-03-11 11:02+0100] writes: > >> But before starting premature optimization we should really have >> "test.txt" so that we can compare the output and the CPU time with what >> was written before: > > I just put abcdefghijklmnopqrstuvwxyz in the file (no trailing newline) > and got these timings: > > SBCL 1.0.35.8 : 81 seconds > Allegro 8.2 : 194 seconds > > Gforth 0.7.0 : 29 seconds > iForth 4.0.11 : 6 seconds > > Gforth is an interpreter and iForth is a native compiler. All running > on a 32bit x86. > > Helmut Addendum: I have used a 64bit AMD machine. And another remark relating to your measurements above: I once had the pleasure to test 64-bit Allegro CL, and it turned out to be a lot faster than the 32-bit version - IIRC, it was about the same speed as SBCL on the for floating-point-intensive calculations. So, I would be interested in how the 64bit-Allegro performs for this test. Nicolas
From: Tamas K Papp on 11 Mar 2010 10:51 On Thu, 11 Mar 2010 16:18:44 +0100, Nicolas Neuss wrote: > Helmut Eller <eller.helmut(a)gmail.com> writes: > >> * Nicolas Neuss [2010-03-11 11:02+0100] writes: >> >>> But before starting premature optimization we should really have >>> "test.txt" so that we can compare the output and the CPU time with >>> what was written before: >> >> I just put abcdefghijklmnopqrstuvwxyz in the file (no trailing newline) >> and got these timings: >> >> SBCL 1.0.35.8 : 81 seconds >> Allegro 8.2 : 194 seconds >> >> Gforth 0.7.0 : 29 seconds >> iForth 4.0.11 : 6 seconds >> >> Gforth is an interpreter and iForth is a native compiler. All running >> on a 32bit x86. >> >> Helmut > > Addendum: I have used a 64bit AMD machine. > > And another remark relating to your measurements above: I once had the > pleasure to test 64-bit Allegro CL, and it turned out to be a lot faster > than the 32-bit version - IIRC, it was about the same speed as SBCL on > the for floating-point-intensive calculations. So, I would be > interested in how the 64bit-Allegro performs for this test. Hi Nicolas, If you are in the benchmarking mood, would you try this on Clozure CL? I keep hearing good things about it. Just throwing the code as it is (ie with declarations you used for SBCL) at it would be informative. Tamas
From: Slobodan Blazeski on 11 Mar 2010 14:49
On Mar 11, 4:18 pm, Nicolas Neuss <lastn...(a)kit.edu> wrote: > Helmut Eller <eller.hel...(a)gmail.com> writes: > > * Nicolas Neuss [2010-03-11 11:02+0100] writes: > > >> But before starting premature optimization we should really have > >> "test.txt" so that we can compare the output and the CPU time with what > >> was written before: > > > I just put abcdefghijklmnopqrstuvwxyz in the file (no trailing newline) > > and got these timings: > > > SBCL 1.0.35.8 : 81 seconds > > Allegro 8.2 : 194 seconds > > > Gforth 0.7.0 : 29 seconds > > iForth 4.0.11 : 6 seconds > > > Gforth is an interpreter and iForth is a native compiler. All running > > on a 32bit x86. > > > Helmut > > Addendum: I have used a 64bit AMD machine. > > And another remark relating to your measurements above: I once had the > pleasure to test 64-bit Allegro CL, and it turned out to be a lot faster > than the 32-bit version - IIRC, it was about the same speed as SBCL on > the for floating-point-intensive calculations. So, I would be > interested in how the 64bit-Allegro performs for this test. > > Nicolas My machine shows 64 s for the Helmut code and 21 s for your version. I've installed VfxForth and gForth but I don't know how to load forth file with them nor how time them. Anyway this problem is far too much bit twiddling for my taste so I'll have to pass. big thanks to Helmut Eller again Slobodan |