From: Brian R on
I'm utilizing mex to run existing Fortran code in Matlab using the
Portland Group compiler PGF90. The code has worked fine in a 32-bit
Linux environment but now when I try to use it in a 64-bit Linux
environment READ and WRITE statements no longer work. Compilation
completes without error and if I avoid running the subroutines with
READ and WRITE statements the program seems to function without
error. Note that the READ and WRITE statements are NOT trying to use
standard input/output.

As an example of a failing WRITE statement:
write(debugout,'A17'),'Reading RRTM_DATA'
where debugout is defined as:
CHARACTER (LEN=80) debugout

Running in 32-bit Linux this puts the text 'Reading RRTM_DATA' into
debugout so that I could later view it with a call to mexprintf but
in 64-bit Linux the write statement crashes Matlab and gives an error
that seems to originate from the compiler:
PGFIO-F-254/formatted write/internal file/illegal repeat count in
format.

I've also tested the line in standalone code (separate from Matlab)
on the 64-bit machine and it works correctly.
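
For comparison, here is a minimal standalone sketch of the same internal WRITE in standard-conforming form. The format is written as '(A17)', with parentheses and without the comma after the control list, because the original spelling relies on compiler extensions. The IOSTAT= variable ios and the fallback message are assumptions added for illustration. Whether IOSTAT= keeps the PGI runtime from aborting inside Matlab has not been tested here.

```fortran
program internal_write_demo
  implicit none
  character(len=80) :: debugout
  integer :: ios   ! assumed status variable, not in the original code

  ! Standard Fortran expects the format in parentheses and no comma
  ! after the control list.
  write(debugout, '(A17)', iostat=ios) 'Reading RRTM_DATA'

  if (ios /= 0) then
     ! Fallback that needs no formatted I/O at all.
     debugout = 'internal WRITE failed'
  end if

  ! In the MEX file, debugout would be passed to mexPrintf here;
  ! print stands in for it in this standalone sketch.
  print '(A)', trim(debugout)
end program internal_write_demo
```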

In attempting to read data from a file on the 64-bit side, the
Fortran OPEN statement seen below results in an error:
OPEN(INUNIT,FILE='RRTM_DATA',FORM='UNFORMATTED',STATUS='OLD',ERR=9010)

This line results in line 9010 running but I have not been able to
determine the exact nature of the error since I cannot utilize WRITE
statements to put the relevant data into a string for output. The
exact same code opens the exact same file and reads it without
problem on the 32-bit side.
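
One way to learn why the OPEN fails without a working internal WRITE is to replace ERR= with IOSTAT= and turn the status code into text by hand. This is a minimal standalone sketch; the unit number 10 and the helper int_to_str are hypothetical and not taken from the original code. The meaning of a nonzero code is compiler-specific, so it would have to be looked up in the compiler's documentation.

```fortran
program open_iostat_demo
  implicit none
  integer, parameter :: inunit = 10   ! hypothetical unit number
  integer :: ios
  character(len=80) :: msg

  ! IOSTAT= captures the status code instead of jumping to an ERR= label.
  open(inunit, file='RRTM_DATA', form='UNFORMATTED', status='OLD', iostat=ios)

  if (ios /= 0) then
     msg = 'OPEN failed, IOSTAT = ' // trim(int_to_str(ios))
  else
     msg = 'OPEN succeeded'
     close(inunit)
  end if

  ! In a MEX file, msg would be passed to mexPrintf; print stands in here.
  print '(A)', trim(msg)

contains

  ! Convert an integer to text by digit arithmetic, avoiding internal WRITE.
  function int_to_str(n) result(s)
    integer, intent(in) :: n
    character(len=12) :: s
    integer :: m, pos

    s = ' '
    m = abs(n)
    pos = len(s)
    do
       s(pos:pos) = achar(iachar('0') + mod(m, 10))
       m = m / 10
       pos = pos - 1
       if (m == 0) exit
    end do
    if (n < 0) s(pos:pos) = '-'
    s = adjustl(s)
  end function int_to_str

end program open_iostat_demo
```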

Any ideas on what might be the problem?

Thanks.
From: Christopher Hulbert on
Brian R wrote:
> I'm utilizing mex to run existing Fortran code in Matlab using the
> Portland Group compiler PGF90. The code has worked fine in a 32-bit
> Linux environment but now when I try to use it in a 64-bit Linux
> environment READ and WRITE statements no longer work. Compilation
> completes without error and if I avoid running the subroutines with
> READ and WRITE statements the program seems to be function without
> error. Note that the READ and WRITE statements are NOT trying to use
> standard input/output.
>
> As an example of a failing WRITE statement:
> write(debugout,'A17'),'Reading RRTM_DATA'
> where debugout is defined as:
> CHARACTER (LEN=80) debugout
>
> Running in 32-bit Linux this puts the text 'Reading RRTM_DATA' into
> debugout so that I could later view it with a call to mexprintf but
> in 64-bit Linux the write statement crashes Matlab and gives an error
> that seems to originate from the compiler:
> PGFIO-F-254/formatted write/internal file/illegal repeat count in
> format.
>
> I've also tested the line in standalone code (separate from Matlab)
> on the 64-bit machine and it works correctly.
>
> In attempting to read data from a file on the 64-bit side, the
> Fortran OPEN statement seen below results in an error:
> OPEN(INUNIT,FILE='RRTM_DATA',
> FORM='UNFORMATTED',STATUS='OLD',ERR=9010)
>
> This line results in line 9010 running but I have not been able to
> determine the exact nature of the error since I cannot utilize WRITE
> statements to put the relevant data into a string for output. The
> exact same code opens the exact same file and reads it without
> problem on the 32-bit side.
>
> Any ideas on what might be the problem?

Works fine on my setup. Can you post the versions you are using? In a lot of Fortran
code I've modified to work in 64-bit Matlab, I've had to pay special attention
to integers that hold pointer values: they are 8 bytes in 64-bit and 4 bytes
in 32-bit. That has been the cause of a lot of debugging. Also, note that on many
Linux setups you can write to stdout directly (e.g. write(*,'A17')) when in
terminal mode. I'm not sure about GUI mode.
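
The pointer-width point can be checked with a small standalone program. This sketch assumes a compiler that provides the Fortran 2003 iso_c_binding module (older compilers may not), and the variable names are illustrative only. In MEX code, the MWPOINTER macro from fintrf.h plays the same role as integer(c_intptr_t) does here.

```fortran
program pointer_width_demo
  use iso_c_binding, only: c_intptr_t
  implicit none
  integer :: default_int             ! default INTEGER, typically 32 bits
  integer(c_intptr_t) :: handle      ! wide enough to hold a C pointer

  ! bit_size depends only on the type, so the variables need no value.
  print '(A,I0)', 'default INTEGER bits:       ', bit_size(default_int)
  print '(A,I0)', 'pointer-sized INTEGER bits: ', bit_size(handle)
end program pointer_width_demo
```

If the two numbers differ on the target machine, any pointer stored in a default INTEGER is truncated.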

[chulbert@mellin test]$ cat test_mex.f90
subroutine mexfunction(nlhs,plhs,nrhs,prhs)
  integer :: nlhs,nrhs,err
  integer(8) :: plhs(nlhs),prhs(nrhs)
  character(len=80) :: debugout

  interface
    integer function mexprintf(str)
      character(*) :: str
    end function
  end interface

  write(debugout,'A17'),'Reading RRTM_DATA'
  err = mexprintf(debugout)

end subroutine
[chulbert@mellin test]$ pgf90 -V

pgf90 6.2-5 64-bit target on x86-64 Linux
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2006, STMicroelectronics, Inc. All Rights Reserved.
[chulbert@mellin test]$ pgf90 -shared -otest_mex.mexa64 test_mex.f90
[chulbert@mellin test]$ matlab -nosplash -nodesktop -r "test_mex;quit"
LD_LIBRARY_PATH=/opt/pgi/linux86-64/6.2/libso:/opt/intel/cce/9.0/lib:/apps/pgi/linux86-64/6.2/libso:
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Warning: No window system found. Java option 'MWT' ignored

< M A T L A B >
Copyright 1984-2006 The MathWorks, Inc.
Version 7.3.0.298 (R2006b)
August 03, 2006


To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

Reading RRTM_DATA
From: Brian R. on
Thanks for your help. I tried your test and the test also worked
correctly on my system.

Regarding versions, on the 64-bit side:
Matlab = Version 7.1.0.183 (R14) Service Pack 3
PGF = pgf90 6.1-6 64-bit target on x86-64 Linux

I am aware of the pointer issue and I believe (although perhaps I am
wrong) I have fixed all of these in the code (the code was failing in
other ways on the 64-bit side before the pointers were corrected).


The compiler options you used differed significantly from mine (many
of my compile options were in mexopts.sh). When I compiled your test
file with the same options I use for my application, the error occurs
with the test file as well. Namely, I did the following:

pgf90 -c -I../include -I/usr/global/matlab.research/extern/include \
  -fPIC -Kieee -g test_mex.f90

pgf90 -g -mp -shared \
  -Wl,--version-script,/usr/global/matlab.research/extern/lib/glnxa64/fexport.map \
  -o test_mex.mexa64 test_mex.o \
  /usr/global/matlab.research/extern/lib/glnxa64/version4.o \
  -Wl,-rpath-link,/usr/global/matlab.research/bin/glnxa64 \
  -L/usr/global/matlab.research/bin/glnxa64 -lmx -lmex -lmat -lm \
  -lstdc++

matlab -nosplash -nodesktop -r "test_mex;quit"

and got:
PGFIO-F-254/formatted write/internal file/illegal repeat count in
format.
In source file test_mex.f90, at line number 12

I found that removing the -Wl,--version-script,....fexport.map option
fixes the problem with the test file but unfortunately removing this
option from the compilation of my application does not fix the
problem.

I altered the mexopts.sh file and made other changes in an attempt to
minimize the arguments passed to the compiler but this does not fix
the problem.

The routine in question is now compiled as:
pgf90 -c -I../../include -byteswapio -fpic rrtm_init.F

ar ru ../../mm5libs/librad.a swrad.o lwrad.o mm5atm.o rrtm.o \
  rrtm_gasabs.o rrtm_init.o rrtm_k_g.o rrtm_rtrn.o rrtm_setcoef.o \
  rrtm_taumol.o inirad.o o3data.o dm_io.o

And then the creation of the .mexa64 includes statements such as:
pgf90 -c -I../include -fPIC -Kieee ../src/util/matlabdriver.F
pgf90 -c -I../include -fPIC -Kieee ../src/util/driver.F
pgf90 -shared -o radiationmatlab.mexa64 matlabdriver.o driver.o \
  libutil.a librad.a

With these options for the compilation of the various parts of the
program the error still occurs. However, the problem occurs
intermittently and thus far I cannot determine why. The program
calls various .mexa64 files (that all use the same mexFunction
subroutine) and now some of them will allow WRITE statements but
others will not. Additionally, I can run one .mexa64 repeatedly
and the WRITE statement will work every time but if I call it and
then call an identical .mexa64 of a different name with the same
inputs the WRITE statement will again cause the "illegal repeat
count" error.

Following is a simplified version of my mexFunction file. In the
actual file every variable with a 1 (one) in it also occurs with 2,
3, 4, and 5 and there are identical statements for each of these.
Also the actual file calls another Fortran routine to do the actual
work but I've found that even with this call removed the problems
still occur and so I've omitted the call for simplicity. Perhaps
there is something I'm doing that is obviously incorrect.

      subroutine mexFunction(nlhs, plhs, nrhs, prhs)
#include <fintrf.h>
      MWPOINTER plhs(*), prhs(*)
      MWPOINTER mxGetPr, mxCreateDoubleMatrix
      MWPOINTER x1_pr, y1_pr
      integer nlhs, nrhs
      integer mxGetM, mxGetN, mxIsNumeric
      integer mmrhs1, nnrhs1, sizerhs1
      integer mmlhs1, nnlhs1, sizelhs1
      integer maxxy
      parameter (maxxy=4000)
      real*8 x1(maxxy), y1(maxxy)
      CHARACTER (LEN=80) debugout

      write(debugout,'A11'),'Hello World'
      call mexprintf(debugout)

      mmrhs1 = mxGetM(prhs(1))
      nnrhs1 = mxGetN(prhs(1))
      sizerhs1 = mmrhs1*nnrhs1

      mmlhs1=mmrhs1
      nnlhs1=nnrhs1
      sizelhs1=mmlhs1*nnlhs1
      if(mxIsNumeric(prhs(1)) .eq. 0) then
        call mexErrMsgTxt('Input 1 must be a number.')
      endif

      plhs(1) = mxCreateDoubleMatrix(mmlhs1,nnlhs1,0)
      y1_pr = mxGetPr(plhs(1))
      x1_pr = mxGetPr(prhs(1))
      call mxCopyPtrToReal8(x1_pr,x1,sizerhs1)
      call mxCopyReal8ToPtr(y1,y1_pr,sizelhs1)

      return
      end

Any ideas on what I might be doing wrong would be greatly
appreciated.

Thanks,
Brian R.
From: Brian R. on
To clarify the simplest case where I'm having problems, I have a mex
file that consists of only the mexFunction shown in the previous post
and the WRITE statement works sometimes but at other times crashes
Matlab with the "illegal repeat count in format" compiler error.

As I noted before I can call this mex file repeatedly (filea) and it
will process the WRITE statement and subsequent call to mexprintf
without error. If I copy filea to a different name (fileb) and call
fileb that also works without errors. However, if I call filea and
then call fileb the WRITE statement will work correctly in the filea
call but cause the crash in the fileb call.

I have now found that if I issue a clear command on filea before
calling fileb the WRITE statement will function correctly in both
calls.

Hopefully this clarifies the issue.

Thanks.
From: B. Reen on
The problem can be reproduced with the simple code included in the
post by Christopher Hulbert. Compile the code as shown in that post
and then copy test_mex.mexa64 to test_mex2.mexa64. Start matlab and
then one can run test_mex as many times as desired without problems
or test_mex2 as many times as desired without problem. But if you
run test_mex and then try to run test_mex2 the "illegal repeat count
in format" error occurs and crashes Matlab.

Although running two copies of the same program may not make sense,
in my application the same mexFunction is used in different .mexa64
files to call a driver program that then runs different routines
depending on the .mexa64 file.