From: Christopher Hulbert
Brian R. wrote:
> Thanks for your help. I tried your test and the test also worked
> correctly on my system.
>
> Regarding versions, on the 64-bit side:
> Matlab = Version 7.1.0.183 (R14) Service Pack 3
> PGF = pgf90 6.1-6 64-bit target on x86-64 Linux
>
> I am aware of the pointer issue and I believe (although perhaps I am
> wrong) I have fixed all of these in the code (the code was failing in
> other manners on the 64-bit side before the pointers were corrected).
>
>
> The compiler options you used differed from what I was using
> significantly (many of my compile options were in the mexopts.sh). I
> tried using the same options as I used on my application on the test
> file you created and the error occurs with the test file. Namely I
> did the following:
>
> pgf90 -c -I../include -I/usr/global/matlab.research/extern/include
> -fPIC -Kieee -g test_mex.f90
>
> pgf90 -g -mp -shared
> -Wl,--version-script,/usr/global/matlab.research/extern/lib/glnxa64/fexport.map
> -o test_mex.mexa64 test_mex.o
> /usr/global/matlab.research/extern/lib/glnxa64/version4.o
> -Wl,-rpath-link,/usr/global/matlab.research/bin/glnxa64
> -L/usr/global/matlab.research/bin/glnxa64 -lmx -lmex -lmat -lm
> -lstdc++
>
> matlab -nosplash -nodesktop -r "test_mex;quit"
>
> and got:
> PGFIO-F-254/formatted write/internal file/illegal repeat count in
> format.
> In source file test_mex.f90, at line number 12

I could reproduce this problem. Start from the options I used and add yours back
one at a time to see which one triggers it. I would not use OpenMP: I have had
problems with PGI OpenMP and Matlab. The only way I've gotten it to work is to
not set the environment variable and to call omp_set_num_threads explicitly from
the compiled code. Also, try not linking with the Matlab libraries and stdc++.
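
A minimal sketch of the explicit call described above. It assumes the environment variable in question is OMP_NUM_THREADS (the post does not name it), and the thread count of 2 is a hypothetical value chosen only for illustration. It has to be built with the compiler's OpenMP option (-mp for pgf90):

```fortran
program set_threads_demo
  use omp_lib   ! standard OpenMP module (omp_set_num_threads, omp_get_max_threads)
  implicit none
  integer, parameter :: nthreads = 2   ! hypothetical value, for illustration only

  ! Request the thread count from code instead of relying on the environment.
  call omp_set_num_threads(nthreads)

  ! Report the upper bound on the team size for the next parallel region.
  print '(A,I0)', 'max threads: ', omp_get_max_threads()
end program set_threads_demo
```

In a MEX file the same call would presumably go near the top of mexFunction, before any parallel region is entered.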

>
> I found that removing the -Wl,--version-script,....fexport.map option
> fixes the problem with the test file but unfortunately removing this
> option from the compilation of my application does not fix the
> problem.
>
> I altered the mexopts.sh file and made other changes in an attempt to
> minimize the arguments passed to the compiler but this does not fix
> the problem.
>
> The routine in question is now compiled as:
> pgf90 -c -I../../include -byteswapio -fpic rrtm_init.F
>
> ar ru ../../mm5libs/librad.a swrad.o lwrad.o mm5atm.o rrtm.o
> rrtm_gasabs.o rrtm_init.o rrtm_k_g.o rrtm_rtrn.o rrtm_setcoef.o
> rrtm_taumol.o inirad.o o3data.o dm_io.o
>
> And then the creation of the .mexa64 includes statements such as:
> pgf90 -c -I../include -fPIC -Kieee ../src/util/matlabdriver.F
> pgf90 -c -I../include -fPIC -Kieee ../src/util/driver.F
> pgf90 -shared -o radiationmatlab.mexa64 matlabdriver.o driver.o
> libutil.a librad.a
>
> With these options for the compilation of the various parts of the
> program the error still occurs. However, the problem occurs
> intermittently and thus far I cannot determine why. The program
> calls various .mexa64 files (that all use the same mexFunction
> subroutine) and now some of them will allow WRITE statements but
> others will not. Additionally, I can run the one .mexa64 repeatedly
> and the WRITE statement will work every time but if I call it and
> then call an identical .mexa64 of a different name with the same
> inputs the WRITE statement will again cause the "illegal repeat
> count" error.

PGI does not have Matlab, so they cannot reproduce this. Can you make this occur
in standalone code? You could compile the mex function as a shared library, and
then just write standalone programs that make the repeated calls. Note that in
this case, you can't use any of the mx or mex routines.

>
> Following is a simplified version of my mexFunction file. In the
> actual file every variable with a 1 (one) in it also occurs with 2,
> 3, 4, and 5 and there are identical statements for each of these.
> Also the actual file calls another Fortran routine to do the actual
> work but I've found that even with this call removed the problems
> still occur and so I've omitted the call for simplicity. Perhaps
> there is something I'm doing that is obviously incorrect.
>
> subroutine mexFunction(nlhs, plhs, nrhs, prhs)
> #include <fintrf.h>
> MWPOINTER plhs(*), prhs(*)
> MWPOINTER mxGetPr, mxCreateDoubleMatrix
> MWPOINTER x1_pr, y1_pr
> integer nlhs, nrhs
> integer mxGetM, mxGetN, mxIsNumeric
> integer mmrhs1, nnrhs1, sizerhs1
> integer mmlhs1, nnlhs1, sizelhs1
> integer maxxy
> parameter (maxxy=4000)
> real*8 x1(maxxy), y1(maxxy)
> CHARACTER (LEN=80) debugout
>
> write(debugout,'A11'),'Hello World'
> call mexprintf(debugout)
>
> mmrhs1 = mxGetM(prhs(1))
> nnrhs1 = mxGetN(prhs(1))
> sizerhs1 = mmrhs1*nnrhs1
>
> mmlhs1=mmrhs1
> nnlhs1=nnrhs1
> sizelhs1=mmlhs1*nnlhs1
> if(mxIsNumeric(prhs(1)) .eq. 0) then
> call mexErrMsgTxt('Input 1 must be a number.')
> endif
>
> plhs(1) = mxCreateDoubleMatrix(mmlhs1,nnlhs1,0)
> y1_pr = mxGetPr(plhs(1))
> x1_pr = mxGetPr(prhs(1))
> call mxCopyPtrToReal8(x1_pr,x1,sizerhs1)
> call mxCopyReal8ToPtr(y1,y1_pr,sizelhs1)
>
> return
> end
>
> Any ideas on what I might be doing wrong would be greatly
> appreciated.
>
> Thanks,
> Brian R.
From: B. Reen
Christopher Hulbert wrote:
>I could reproduce this problem. Try starting with the options I had,
>and start adding one at a time to see where the problem occurs. I
>would not use OpenMP. I have had problems with PGI openmp and matlab.
>The only way I've gotten it to work is to not set the environment
>variable and explicitly call omp_set_num_threads from the compiled
>code. Also, try not linking with the Matlab libraries and stdc++.

As noted I found that my option:
-Wl,--version-script,/usr/global/matlab.research/extern/lib/glnxa64/fexport.map
was the one causing the problem to occur with your test code.
However removing this compile option did not resolve the problem with
my code. I also eliminated almost all compiler options used in my
code and the problem still occurs with my code.

As noted, I can reproduce my problem using your example code and
compiler options, which seems to be an easier way to attack the
problem. I may not have been clear on this in my previous posts so
I'll restate it. I can run your code without problem, but if I copy
the .mexa64 file so that I have test_mex.mexa64 and test_mex2.mexa64,
which are identical, I have the problem with whichever file is called
second. If I issue a clear command for the first mex file before
running the second file the problem does not occur. So here's what I
see:

[reenb@jupiter mm5libs]$ cat test_mex.f90
subroutine mexfunction(nlhs,plhs,nrhs,prhs)
  integer :: nlhs,nrhs,err
  integer(8) :: plhs(nlhs),prhs(nrhs)
  character(len=80) :: debugout

  interface
    integer function mexprintf(str)
      character(*) :: str
    end function
  end interface

  write(debugout,'A17'),'Reading RRTM_DATA'
  err = mexprintf(debugout)

end subroutine
[reenb@jupiter mm5libs]$ pgf90 -V

pgf90 6.1-6 64-bit target on x86-64 Linux
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2006, STMicroelectronics, Inc. All Rights Reserved.
[reenb@jupiter mm5libs]$ pgf90 -shared -otest_mex.mexa64 test_mex.f90
[reenb@jupiter mm5libs]$ cp test_mex.mexa64 test_mex2.mexa64
[reenb@jupiter mm5libs]$ matlab -nodesktop -nosplash

< M A T L A B >
Copyright 1984-2005 The MathWorks, Inc.
Version 7.1.0.183 (R14) Service Pack 3
August 02, 2005

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

>> test_mex
Reading RRTM_DATA>> test_mex
Reading RRTM_DATA>> clear test_mex
>> test_mex2
Reading RRTM_DATA>> test_mex2
Reading RRTM_DATA>> test_mex
PGFIO-F-254/formatted write/internal file/illegal repeat count in
format.
In source file test_mex.f90, at line number 12
[reenb@jupiter mm5libs]$


This problem seems to be the basis for my code's problem. Namely, my
application makes repeated calls to .mexa64 files with identical
mexfunctions (what is actually run depends on variables passed to
mexfunction, which are then passed to a subroutine that calls
different subroutines based on those input parameters). Also, I
find that compiling with Intel's Fortran compiler on the same 64-bit
machine resolves the issue with the test file (after a slight
alteration to the format statement which makes no difference for
PGF90). I have not yet tried to use the Intel compiler for my code,
but I anticipate that it may not be as straightforward.
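
The post does not say what the alteration to the format statement was. One standard-conforming spelling, shown here only as a sketch and not as the poster's actual change, encloses the edit descriptor in parentheses and drops the comma between the control list and the output list. The standalone program below uses print in place of mexprintf so that it runs without Matlab:

```fortran
program format_demo
  implicit none
  character(len=80) :: debugout

  ! A character format specification must be enclosed in parentheses,
  ! and no comma separates the control list from the output list.
  write(debugout, '(A17)') 'Reading RRTM_DATA'

  ! Stand-in for mexprintf: show the internal file with trailing blanks trimmed.
  print '(A)', trim(debugout)
end program format_demo
```

Whether this has any bearing on the PGFIO-F-254 error is not established by the thread; the poster reports that the alteration makes no difference under PGF90.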

For Intel Fortran I compile with:
ifort -shared -fpic -otest_mex.mexa64 test_mexi.f90
The version is:
Intel(R) Fortran Compiler for Intel(R) EM64T-based applications,
Version 9.0 Build 20050430 Package ID: l_fc_p_9.0.021

The problem does not occur on the 32-bit side; however, the version of
Matlab available to me on the 32-bit side is different:
pgf90 6.1-6 32-bit target on x86 Linux
Matlab version: Version 7.2.0.294 (R2006a)

>PGI does not have Matlab, so they cannot reproduce this. Can you
>make this occur in standalone code? You could compile the mex
>function as a shared library, and then just write standalone programs
>that make the repeated calls. Note that in this case, you can't use
>any of the mx or mex routines.

I have not been able to make this occur in standalone code. I wrote
a driver program to call the test program and executed the two copies
of this without problem.

Thanks,
B. Reen
From: Christopher Hulbert
* Snip fortran mex problems with PGI compilers *


I was able to reproduce the problem outside of Matlab. I have posted the example
code to PGI on their forum. I would continue to follow the post using the link
below.

http://www.pgroup.com/userforum/viewtopic.php?t=769