Prev: emacs lisp function for dealing with lines
Next: qooxlisp version 2.0 [was Re: qooxlisp live - 3MB bloatware]
From: Mirko on 8 Jun 2010 10:44 Hello, I am diving into parallel processing in that other ancient language (fortran), and I was running one of intel's sample OpenMP problems which calculates the number of primes, 4n+1 and 4n+3 primes below 10^7. So, I thought, why not transcribe it into lisp, and see how times compare. This post is about the following: - I was not able to eliminate one optimization warning - Can my declarations be improved - Comparison of fortran and cl run times. Because of licensing issues, I will not post intel's fortran code. But I tried to copy the algorithm and the code as verbatim as possible. I am running this on a recent sbcl on a 64-bit redhat linux. The transcribed code is here: http://paste.lisp.org/display/111258. With maximum optimization, I am getting warnings in the following code (elided here) (declare (fixnum index)) (floor (sqrt (coerce index 'float))) Error message: forced to do full call unable to do inline float truncate (cost 5) because: The result is a (VALUES UNSIGNED-BYTE &OPTIONAL), not a (VALUES (SIGNED- BYTE 32) &REST T). unable to do inline float truncate (cost 5) because: The result is a (VALUES UNSIGNED-BYTE &OPTIONAL), not a (VALUES (UNSIGNED-BYTE 32) &REST T). forced to do full call unable to do inline float coercion (cost 5) because: The first argument is a UNSIGNED-BYTE, not a (SIGNED-BYTE 32). unable to do inline float coercion (cost 6) because: The first argument is a UNSIGNED-BYTE, not a (UNSIGNED-BYTE 32). doing float to pointer coercion (cost 13) Is there some way to eliminate that? Second question, can the declarations be improved since all the integers are positive. And finally here are the results. First ifort: Range to check for Primes: 1 10000000 We are using 1 thread(s) Number of primes found: 664579 Number of 4n+1 primes found: 332181 Number of 4n-1 primes found: 332398 Run time: 6.390028 Now sbcl: Range to check for Primes: 1, 10000000 Number of primes found: 664579 Number of 4n+1 primes found:332181 Number of 4n-1 primes found:332398 Evaluation took: 7.940 seconds of real time 7.938793 seconds of total run time (7.926795 user, 0.011998 system) [ Run times consist of 0.022 seconds GC time, and 7.917 seconds non- GC time. ] 99.99% CPU 23,760,484,821 processor cycles 80,382,896 bytes consed So we have a 25% difference. Not bad. Thanks, Mirko
From: Raymond Toy on 8 Jun 2010 14:48 On 6/8/10 10:44 AM, Mirko wrote: > Hello, > > I am diving into parallel processing in that other ancient language > (fortran), and I was running one of intel's sample OpenMP problems > which calculates the number of primes, 4n+1 and 4n+3 primes below > 10^7. So, I thought, why not transcribe it into lisp, and see how > times compare. > > This post is about the following: > - I was not able to eliminate one optimization warning > - Can my declarations be improved > - Comparison of fortran and cl run times. > > Because of licensing issues, I will not post intel's fortran code. > But I tried to copy the algorithm and the code as verbatim as > possible. > > I am running this on a recent sbcl on a 64-bit redhat linux. The > transcribed code is here: http://paste.lisp.org/display/111258. > > With maximum optimization, I am getting warnings in the following code > (elided here) > (declare (fixnum index)) > (floor (sqrt (coerce index 'float))) Since this is from fortran, presumably index is actually positive and is probably a 32 bit or 64 bit integer. Declare it as such: (declare (type (unsigned-byte 32) index)) Note that converting a 32-bit integer to a float loses bits, but if the fortran code did that, then perhaps it's ok. With this declaration, the compiler should know that floor can only return a 32-bit (or 64-bit) integer, so I think the warnings should be gone. > > Second question, can the declarations be improved since all the > integers are positive. The declaration above is an example of declaring a non-negative (and bounded) integer. Ray
From: Mirko on 8 Jun 2010 16:09 On Jun 8, 2:48 pm, Raymond Toy <toy.raym...(a)gmail.com> wrote: > On 6/8/10 10:44 AM, Mirko wrote: > > > > > Hello, > > > I am diving into parallel processing in that other ancient language > > (fortran), and I was running one of intel's sample OpenMP problems > > which calculates the number of primes, 4n+1 and 4n+3 primes below > > 10^7. So, I thought, why not transcribe it into lisp, and see how > > times compare. > > > This post is about the following: > > - I was not able to eliminate one optimization warning > > - Can my declarations be improved > > - Comparison of fortran and cl run times. > > > Because of licensing issues, I will not post intel's fortran code. > > But I tried to copy the algorithm and the code as verbatim as > > possible. > > > I am running this on a recent sbcl on a 64-bit redhat linux. The > > transcribed code is here:http://paste.lisp.org/display/111258. > > > With maximum optimization, I am getting warnings in the following code > > (elided here) > > (declare (fixnum index)) > > (floor (sqrt (coerce index 'float))) > > Since this is from fortran, presumably index is actually positive and is > probably a 32 bit or 64 bit integer. Declare it as such: > > (declare (type (unsigned-byte 32) index)) > > Note that converting a 32-bit integer to a float loses bits, but if the > fortran code did that, then perhaps it's ok. With this declaration, the > compiler should know that floor can only return a 32-bit (or 64-bit) > integer, so I think the warnings should be gone. > > > > > Second question, can the declarations be improved since all the > > integers are positive. > > The declaration above is an example of declaring a non-negative (and > bounded) integer. > > Ray Thanks Ray. The change did not remove the above-mentioned compiler warnings, but it did bring down the execution time: It went from 7.94 to 6.62 seconds, which is about 3.5% slower than the ifort. (Today my brain is just not up to diving into hyperspec and divining the meaning of the warning and how to remove it). BTW, the above numbers for the fortran executable are with the -fast compilation option for ifort. Profiling of the fortran code (using vtune) shows that most of the time is spent in the `mod' function: (if (zerop (mod index factor)) ...) which is the sieve for the primes. I did not profile the lisp code. Mirko
From: Raymond Toy on 8 Jun 2010 17:48 On 6/8/10 4:09 PM, Mirko wrote: > On Jun 8, 2:48 pm, Raymond Toy <toy.raym...(a)gmail.com> wrote: >> On 6/8/10 10:44 AM, Mirko wrote: > Thanks Ray. > > The change did not remove the above-mentioned compiler warnings, but Bummer. How about just doing (declare (type (unsigned-byte 32) x)) ... (floor (sqrt x)) That produces no warnings for me with cmucl. Ray
From: Mirko on 9 Jun 2010 09:19 On Jun 8, 5:48 pm, Raymond Toy <toy.raym...(a)gmail.com> wrote: > On 6/8/10 4:09 PM, Mirko wrote: > > > On Jun 8, 2:48 pm, Raymond Toy <toy.raym...(a)gmail.com> wrote: > >> On 6/8/10 10:44 AM, Mirko wrote: > > Thanks Ray. > > > The change did not remove the above-mentioned compiler warnings, but > > Bummer. How about just doing > > (declare (type (unsigned-byte 32) x)) > ... > (floor (sqrt x)) > > That produces no warnings for me with cmucl. > > Ray I still got warning. But I also discovered that I could remove `coerce' from (floor (sqrt (coerce index 'float))). Declaring index as (unsigned-byte 32), (floor (sqrt index)) works fine. The run-time is reduced by about 0.07 seconds. But with that change I get another warning. The (floor (sqrt index)) is embedded in a do loop: (do ((index start (+ index 2)) ... The compiler warns me: doing unsigned word to integer conversion to INDEX. This is probably due to the `(+ index 2). Mirko
|
Next
|
Last
Pages: 1 2 Prev: emacs lisp function for dealing with lines Next: qooxlisp version 2.0 [was Re: qooxlisp live - 3MB bloatware] |