From: Christopher Hulbert on 15 Jan 2007 06:32

Brian R. wrote:
> Thanks for your help. I tried your test and the test also worked
> correctly on my system.
>
> Regarding versions, on the 64-bit side:
> Matlab = Version 7.1.0.183 (R14) Service Pack 3
> PGF = pgf90 6.1-6 64-bit target on x86-64 Linux
>
> I am aware of the pointer issue and I believe (although perhaps I am
> wrong) I have fixed all of these in the code (the code was failing in
> other manners on the 64-bit side before the pointers were corrected).
>
> The compiler options you used differed from what I was using
> significantly (many of my compile options were in the mexopts.sh). I
> tried using the same options as I used on my application on the test
> file you created and the error occurs with the test file. Namely I
> did the following:
>
> pgf90 -c -I../include -I/usr/global/matlab.research/extern/include
> -fPIC -Kieee -g test_mex.f90
>
> pgf90 -g -mp -shared
> -Wl,--version-script,/usr/global/matlab.research/extern/lib/glnxa64/fexport.map
> -o test_mex.mexa64 test_mex.o
> /usr/global/matlab.research/extern/lib/glnxa64/version4.o
> -Wl,-rpath-link,/usr/global/matlab.research/bin/glnxa64
> -L/usr/global/matlab.research/bin/glnxa64 -lmx -lmex -lmat -lm
> -lstdc++
>
> matlab -nosplash -nodesktop -r "test_mex;quit"
>
> and got:
> PGFIO-F-254/formatted write/internal file/illegal repeat count in
> format.
> In source file test_mex.f90, at line number 12

I could reproduce this problem. Try starting with the options I had, and
start adding one at a time to see where the problem occurs. I would not use
OpenMP. I have had problems with PGI OpenMP and Matlab. The only way I've
gotten it to work is to not set the environment variable and explicitly call
omp_set_num_threads from the compiled code. Also, try not linking with the
Matlab libraries and stdc++.

> I found that removing the -Wl,--version-script,....fexport.map option
> fixes the problem with the test file but unfortunately removing this
> option from the compilation of my application does not fix the
> problem.
>
> I altered the mexopts.sh file and made other changes in an attempt to
> minimize the arguments passed to the compiler but this does not fix
> the problem.
>
> The routine in question is now compiled as:
> pgf90 -c -I../../include -byteswapio -fpic rrtm_init.F
>
> ar ru ../../mm5libs/librad.a swrad.o lwrad.o mm5atm.o rrtm.o
> rrtm_gasabs.o rrtm_init.o rrtm_k_g.o rrtm_rtrn.o rrtm_setcoef.o
> rrtm_taumol.o inirad.o o3data.o dm_io.o
>
> And then the creation of the .mexa64 includes statements such as:
> pgf90 -c -I../include -fPIC -Kieee ../src/util/matlabdriver.F
> pgf90 -c -I../include -fPIC -Kieee ../src/util/driver.F
> pgf90 -shared -o radiationmatlab.mexa64 matlabdriver.o driver.o
> libutil.a librad.a
>
> With these options for the compilation of the various parts of the
> program the error still occurs. However, the problem occurs
> intermittently and thus far I cannot determine why. The program
> calls various .mexa64 files (that all use the same mexFunction
> subroutine) and now some of them will allow WRITE statements but
> others will not. Additionally, I can run the one .mexa64 repeatedly
> and the WRITE statement will work every time but if I call it and
> then call an identical .mexa64 of a different name with the same
> inputs the WRITE statement will again cause the "illegal repeat
> count" error.

PGI does not have Matlab, so they cannot reproduce this. Can you make this
occur in standalone code? You could compile the mex function as a shared
library, and then just write standalone programs that make the repeated
calls. Note that in this case, you can't use any of the mx or mex routines.

> Following is a simplified version of my mexFunction file. In the
> actual file every variable with a 1 (one) in it also occurs with 2,
> 3, 4, and 5 and there are identical statements for each of these.
> Also the actual file calls another Fortran routine to do the actual
> work but I've found that even with this call removed the problems
> still occur and so I've omitted the call for simplicity. Perhaps
> there is something I'm doing that is obviously incorrect.
>
>       subroutine mexFunction(nlhs, plhs, nrhs, prhs)
> #include <fintrf.h>
>       MWPOINTER plhs(*), prhs(*)
>       MWPOINTER mxGetPr, mxCreateDoubleMatrix
>       MWPOINTER x1_pr, y1_pr
>       integer nlhs, nrhs
>       integer mxGetM, mxGetN, mxIsNumeric
>       integer mmrhs1, nnrhs1, sizerhs1
>       integer mmlhs1, nnlhs1, sizelhs1
>       integer maxxy
>       parameter (maxxy=4000)
>       real*8 x1(maxxy), y1(maxxy)
>       CHARACTER (LEN=80) debugout
>
>       write(debugout,'A11'),'Hello World'
>       call mexprintf(debugout)
>
>       mmrhs1 = mxGetM(prhs(1))
>       nnrhs1 = mxGetN(prhs(1))
>       sizerhs1 = mmrhs1*nnrhs1
>
>       mmlhs1=mmrhs1
>       nnlhs1=nnrhs1
>       sizelhs1=mmlhs1*nnlhs1
>       if(mxIsNumeric(prhs(1)) .eq. 0) then
>         call mexErrMsgTxt('Input 1 must be a number.')
>       endif
>
>       plhs(1) = mxCreateDoubleMatrix(mmlhs1,nnlhs1,0)
>       y1_pr = mxGetPr(plhs(1))
>       x1_pr = mxGetPr(prhs(1))
>       call mxCopyPtrToReal8(x1_pr,x1,sizerhs1)
>       call mxCopyReal8ToPtr(y1,y1_pr,sizelhs1)
>
>       return
>       end
>
> Any ideas on what I might be doing wrong would be greatly
> appreciated.
>
> Thanks,
> Brian R.
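The suggestion above to start from a minimal set of options and add one at a time can be scripted. What follows is only a sketch and was not posted in the thread. The function names `find_first_bad_option` and `build_and_run` are hypothetical. It assumes the `PGFIO-F-254` message shows up in the output of a non-interactive Matlab run, and that a single `pgf90` command line is enough to build the test file.

```shell
#!/bin/sh
# find_first_bad_option CHECK OPT...
# Calls CHECK with a growing list of options and prints the first option
# whose addition makes CHECK return non-zero. Returns 1 if such an option
# is found, 0 if every option passes.
find_first_bad_option() {
    check=$1
    shift
    tried=""
    for opt in "$@"; do
        tried="$tried $opt"
        # $tried is deliberately unquoted so each option is a separate word
        if ! "$check" $tried; then
            echo "$opt"
            return 1
        fi
    done
    return 0
}

# Hypothetical check for this thread: build test_mex.f90 with the given
# options, call it twice in Matlab, and fail if the build fails or the
# PGFIO error appears in the output.
build_and_run() {
    pgf90 "$@" -shared -o test_mex.mexa64 test_mex.f90 || return 1
    ! matlab -nosplash -nodesktop -r "test_mex;test_mex;quit" 2>&1 |
        grep -q 'PGFIO-F-254'
}
```

A call such as `find_first_bad_option build_and_run -fPIC -Kieee -g -mp` would then print the first option that triggers the error, if any does. Options that only matter at a separate link step would need a more elaborate `build_and_run`.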
From: B. Reen on 15 Jan 2007 11:41

Christopher Hulbert wrote:
>I could reproduce this problem. Try starting with the options I had, and start
>adding one at a time to see where the problem occurs. I would not use OpenMP. I
>have had problems with PGI openmp and matlab. The only way I've gotten it to
>work is to not set the environment variable and explicitly call
>omp_set_num_threads from the compiled code. Also, try not linking with the
>Matlab libraries and stdc++.

As noted I found that my option:
-Wl,--version-script,/usr/global/matlab.research/extern/lib/glnxa64/fexport.map
was the one causing the problem to occur with your test code. However
removing this compile option did not resolve the problem with my code. I
also eliminated almost all compiler options used in my code and the problem
still occurs with my code.

As noted, I can reproduce my problem using your example code and compiler
options, which seems to be an easier way to attack the problem. I may not
have been clear on this in my previous posts so I'll restate it. I can run
your code without problem, but if I copy the .mexa64 file so that I have
test_mex.mexa64 and test_mex2.mexa64, which are identical, I have the
problem with whichever file is called second. If I issue a clear command
for the first mex file before running the second file the problem does not
occur.

So here's what I see:

    [reenb@jupiter mm5libs]$ cat test_mex.f90
    subroutine mexfunction(nlhs,plhs,nrhs,prhs)
    integer :: nlhs,nrhs,err
    integer(8) :: plhs(nlhs),prhs(nrhs)
    character(len=80) :: debugout

    interface
      integer function mexprintf(str)
        character(*) :: str
      end function
    end interface

    write(debugout,'A17'),'Reading RRTM_DATA'
    err = mexprintf(debugout)
    end subroutine
    [reenb@jupiter mm5libs]$ pgf90 -V

    pgf90 6.1-6 64-bit target on x86-64 Linux
    Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
    Copyright 2000-2006, STMicroelectronics, Inc. All Rights Reserved.
    [reenb@jupiter mm5libs]$ pgf90 -shared -otest_mex.mexa64 test_mex.f90
    [reenb@jupiter mm5libs]$ cp test_mex.mexa64 test_mex2.mexa64
    [reenb@jupiter mm5libs]$ matlab -nodesktop -nosplash

                        < M A T L A B >
            Copyright 1984-2005 The MathWorks, Inc.
        Version 7.1.0.183 (R14) Service Pack 3
                        August 02, 2005

      To get started, type one of these: helpwin, helpdesk, or demo.
      For product information, visit www.mathworks.com.

    >> test_mex
    Reading RRTM_DATA>> test_mex
    Reading RRTM_DATA>> clear test_mex
    >> test_mex2
    Reading RRTM_DATA>> test_mex2
    Reading RRTM_DATA>> test_mex
    PGFIO-F-254/formatted write/internal file/illegal repeat count in format.
     In source file test_mex.f90, at line number 12
    [reenb@jupiter mm5libs]$

This problem seems to be the basis for my code's problem. Namely, my
application has repeated calls to .mexa64 files with identical
mexfunction's (differences in what is run are based on variables passed to
mexfunction that are subsequently passed to a subroutine that calls
different subroutines based on input parameters).

Also, I find that compiling with Intel's Fortran compiler on the same
64-bit machine resolves the issue with the test file (after a slight
alteration to the format statement which makes no difference for PGF90). I
have not yet tried to use the Intel compiler for my code, but I anticipate
that it may not be as straightforward. For Intel Fortran I compile with:

    ifort -shared -fpic -otest_mex.mexa64 test_mexi.f90

The version is:

    Intel(R) Fortran Compiler for Intel(R) EM64T-based applications,
    Version 9.0 Build 20050430 Package ID: l_fc_p_9.0.021

The problem does not occur on the 32-bit side, however the version of
Matlab available to me on the 32-bit side is different:

    pgf90 6.1-6 32-bit target on x86 Linux
    Matlab version: Version 7.2.0.294 (R2006a)

>PGI does not have Matlab, so they cannot reproduce this. Can you make this occur
>in standalone code? You could compile the mex function as a shared library, and
>then just write standalone programs that make the repeated calls. Note that in
>this case, you can't use any of the mx or mex routines.

I have not been able to make this occur in standalone code. I wrote a
driver program to call the test program and executed the two copies of this
without problem.

Thanks,
B. Reen
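The copy-and-call steps from the transcript above could be wrapped in a small helper so the same experiment is easy to repeat with more than two copies. This is a sketch, not part of the original posts. The name `make_mex_copies` is hypothetical, and it assumes the mex file has already been built in the current directory.

```shell
#!/bin/sh
# make_mex_copies BASE COUNT
# Copies BASE.mexa64 to BASE2.mexa64 ... BASE<COUNT>.mexa64 and prints a
# Matlab command string that calls the original, then each copy in turn,
# then quits.
make_mex_copies() {
    base=$1
    count=$2
    cmd="$base"
    i=2
    while [ "$i" -le "$count" ]; do
        cp "$base.mexa64" "$base$i.mexa64" || return 1
        cmd="$cmd;$base$i"
        i=$((i + 1))
    done
    echo "$cmd;quit"
}
```

For the two-copy case reported here it might be used as `matlab -nodesktop -nosplash -r "$(make_mex_copies test_mex 2)"`, which calls test_mex and then test_mex2 in one session.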
From: Christopher Hulbert on 15 Jan 2007 12:13

* Snip fortran mex problems with PGI compilers *

I was able to reproduce the problem outside of Matlab. I have posted the
example code to PGI on their forum. I would continue to follow the post
using the link below.

http://www.pgroup.com/userforum/viewtopic.php?t=769