From: mpbro on 18 Dec 2007 12:15

I'm setting up the requisite buildsystem to benchmark a variety of FFT and compiler options. Specific to the purpose of this post, I'm testing 3D, single precision, real-to-complex-even FFTW (sfftw_plan_dft_r2c_3d). I'm comparing ifort to gfortran, and single-threaded to multi-threaded. I'm running an Intel quad-core x86_64 machine with Fedora Core 7.

To make a long story short: in gfortran, any use of FFTW's thread manipulation functions (sfftw_init_threads, sfftw_plan_with_nthreads, sfftw_cleanup_threads) causes a segmentation fault.

First let me show the main program:

!-------------------------------------------------------------------------------
program FFTW_Test
  use system_time_mod
  implicit none
  include 'fftw3.f'

  integer :: i, j, k, n1, n2, n3, nthreads, stat
  integer*8 :: planf, planb
  type(timer) :: t
  character(len=100) :: ffttype
  real, dimension(:,:,:), allocatable :: data

#ifdef FFTW
  ffttype = 'FFTW'
#elif MKL
  ffttype = 'MKL'
#else
  ffttype = 'UNKNOWN'
#endif

  n1=750
  n2=750
  n3=750

  call system_time_init()

  !-----------------------------------------------------------------------------
  ! 3D in-place, single-threaded, real-to-complex FFT
  !-----------------------------------------------------------------------------
  write(0,*) '================================================================='
  write(0,*) '3D in-place, single-threaded, real-to-complex FFT with ',ffttype
  write(0,*) 'Array size=',n1,n2,n3

  allocate( data(2*(n1/2+1),n2,n3) )

  do i=1,n1
    do j=1,n2
      do k=1,n3
        data(i,j,k) = 1.0*(i+j+k)
      end do
    end do
  end do

  write(0,*) ' data(25:27,25,25) ',data(25,25,25),data(26,25,25),&
             data(27,25,25)

#ifdef MULTI
  nthreads = 4
#else
  nthreads = 1
#endif
  write(0,*) 'nthreads=',nthreads

#ifdef FFTW
  write(0,*) "Using FFTW multi-threading"
  call sfftw_init_threads(stat)
  call sfftw_plan_with_nthreads(nthreads)
#elif MKL
  write(0,*) "Using MKL multi-threading"
  call mkl_set_num_threads(nthreads)
#endif

  call sfftw_plan_dft_r2c_3d(planf, n1, n2, n3, data, data, FFTW_ESTIMATE);
  call sfftw_plan_dft_c2r_3d(planb, n1, n2, n3, data, data, FFTW_ESTIMATE);

  call start_timer( t )
  call sfftw_execute(planf)
  call sfftw_execute(planb)
  call stop_timer( t )

  write(0,*) 'FFT^{-1}[FFT[data(25:27,25,25)]]',data(25,25,25)/ (n1*n2*n3), &
             data(26,25,25)/ (n1*n2*n3), &
             data(27,25,25)/ (n1*n2*n3)
  write(0,*) 'elapsed time:',t%telapsed

  call sfftw_destroy_plan(planf)
  call sfftw_destroy_plan(planb)

#ifdef FFTW
  call sfftw_cleanup_threads()
#endif

  deallocate( data )

  call exit(0)
end program FFTW_Test
!-------------------------------------------------------------------------------

My apologies for the mangled whitespace and proliferation of preprocessor directives. The system_time_mod module is simply a (compiler-dependent) wrapper around system_clock(). I can include the source code if you are interested.

I use the fpp preprocessor with ifort and the -x f95-cpp-input preprocessor option with gfortran.
Here is how I build the executable using ifort:

----------------------------
fpp -Difort ../../Src/system_time.f90 > fpp_system_time.f90
ifort -c -assume bscc -assume byterecl -fpp -mtune=pentium4 -O3 -static-intel -vms -w -WB fpp_system_time.f90 -o system_time.o
fpp -DFFTW -DMULTI -I/usr/local/include FFTW_Test.f90 > fpp_FFTW_Test.f90
ifort -c -assume bscc -assume byterecl -fpp -mtune=pentium4 -O3 -static-intel -vms -w -WB -I/usr/local/include fpp_FFTW_Test.f90 -o FFTW_Test.o
ifort -assume bscc -assume byterecl -fpp -mtune=pentium4 -O3 -static-intel -vms -w -WB -I/usr/local/include FFTW_Test.o system_time.o -L/usr/local/lib -lfftw3f -lfftw3 -lfftw3f_threads -lpthread -lm -o FFTW_native_multithreaded_ifort
----------------------------

And here is how I build the executable using gfortran:

----------------------------
gfortran -E -x f95-cpp-input -Dgfortran ../../Src/system_time.f90 > fpp_system_time.f90
gfortran -c -O2 -static -m64 -w fpp_system_time.f90 -o system_time.o
gfortran -E -x f95-cpp-input -DFFTW -DMULTI -I/usr/local/include FFTW_Test.f90 > fpp_FFTW_Test.f90
gfortran -c -O2 -static -m64 -w -I/usr/local/include fpp_FFTW_Test.f90 -o FFTW_Test.o
gfortran -O2 -static -m64 -w -I/usr/local/include FFTW_Test.o system_time.o -L/usr/local/lib -lfftw3f -lfftw3 -lfftw3f_threads -lpthread -lm -o FFTW_native_multithreaded_gfortran
----------------------------

The ifort version works as expected, in both single-threaded and multi-threaded mode. The gfortran version works only if I comment out the three lines of FFTW thread manipulation code.

Would appreciate any insights. Please let me know if I have provided sufficient information to recognize/diagnose the problem.

Regards,
Morgan