Using loadlibrary with NVIDIA CUDA (CUBLAS and CUFFT) libraries [Matlab]

From: Christopher on 11 Apr 2010 19:17

Hi,

I would like to use the NVIDIA CUDA (CUBLAS and CUFFT) libraries from within MATLAB using the loadlibrary command. I am running MATLAB R2010a on 64-bit Linux Ubuntu 9.10, gcc 4.4, with CUDA Toolkit 3.0. Here is what I get when I try to run the loadlibrary command:

>> loadlibrary('libcublas', '/usr/local/cuda/include/cublas.h');
??? Error using ==> loadlibrary at 368
Failed to preprocess the input file.
Output from preprocessor is:
In file included from /usr/local/cuda/include/vector_types.h:45,
                 from /usr/local/cuda/include/cuComplex.h:44,
                 from /usr/local/cuda/include/cublas.h:94:
/usr/local/cuda/include/host_defines.h:41:2: error: #error --- !!! UNSUPPORTED COMPILER !!! ---

Apparently, the following check is failing:

#if !defined(__GNUC__) && !defined(_WIN32)

But, I would think that __GNUC__ should be defined. Has anyone had success using loadlibrary with CUBLAS or CUFFT? I can write a MEX-file and link it with these libraries. However, I would like to use loadlibrary as it would allow me to experiment with the libraries from MATLAB's interpreted environment. Any information would be appreciated.

Thanks in advance,
--Chris

From: Philip Borghesani on 13 Apr 2010 18:50

Christopher
Edit loadlibrary.m and modify the code block (near line 280) that looks like this to remove -U __GNUC__:
    case 'GLNXA64'
        cc='gcc -U __GNUC__ -m64';
        thunk_build='%s %s %s "%s" -o "%s" -Wl,-E -shared -fPIC';

Post your results if this works for you. At one point this define was needed by the header parsing code, but it may no longer be necessary.

I suggest making a backup copy of the file, or using Save As to save it as myloadlibrary.m in the same directory. If you need to move the file, you will need to create a private subdirectory under the new location and copy the file prototypes.pl from toolbox/matlab/general/private to there.

Phil
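
For readers wondering why the flag matters: gcc predefines __GNUC__, and -U cancels that definition, so the guard quoted from host_defines.h sees neither __GNUC__ nor _WIN32 on Linux. Below is a minimal C sketch of that logic. The name compiler_supported is hypothetical and is not part of CUDA or MATLAB; the real header raises #error instead of returning a value.

```c
#include <assert.h>

/* Hypothetical helper (not part of CUDA or MATLAB) that mirrors the guard
 * quoted in this thread from host_defines.h. The real header stops with
 * #error; this sketch returns 0 or 1 so the outcome can be inspected. */
int compiler_supported(void)
{
#if !defined(__GNUC__) && !defined(_WIN32)
    return 0;   /* neither macro is defined: the "UNSUPPORTED COMPILER" case */
#else
    return 1;   /* __GNUC__ (gcc) or _WIN32 is defined: the guard passes */
#endif
}
```

Built with plain gcc, this should return 1. Built with gcc -U __GNUC__ on Linux, the first branch is taken and it returns 0, which corresponds to the UNSUPPORTED COMPILER error in the original post.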

From: Christopher on 15 Apr 2010 09:36

Phil,

Thanks, that fixed the loadlibrary problem. I'm getting a couple of warnings from loadlibrary from functions that try to return a complex type, but I'm not sure that I'll even need those functions. I haven't had a chance to try calling the library functions yet, but I'll add to this post if I have any issues.

Thanks again,
--Chris

From: Christopher on 24 Apr 2010 19:37

OK, now I'm running into a different problem, mostly due to my misunderstanding of how libpointers work. I'm trying to create a memory buffer on the GPU device, then write some data into that buffer and read the results back. Here is the main section of my code:

inHost = (1 : 4); % Test vector
outHost = zeros(size(inHost)); % Allocate space for final result
devPtr = libpointer('voidPtrPtr');
assert(~calllib('libcublas', 'cublasAlloc', length(inHost), 8, devPtr));
assert(~calllib('libcublas', 'cublasSetVector', length(inHost), 8, ...
    inHost, 1, devPtr, 1));
assert(~calllib('libcublas', 'cublasGetVector', length(inHost), 8, ...
    devPtr, 1, outHost, 1));
assert(~calllib('libcublas', 'cublasFree', devPtr));
disp(outHost);

Here are the relevant function signatures:
[uint32, voidPtrPtr] cublasAlloc(int32, int32, voidPtrPtr)
[uint32, voidPtr, voidPtr] cublasSetVector(int32, int32, voidPtr, int32, voidPtr, int32)
[uint32, voidPtr, voidPtr] cublasGetVector(int32, int32, voidPtr, int32, voidPtr, int32)
[uint32, voidPtr] cublasFree(voidPtr)

Hopefully, even if one doesn't have access to the CUBLAS library, it's easy enough to understand how it's supposed to work. cublasAlloc allocates a memory buffer in the device's memory space, pointed to by devPtr. cublasSetVector and cublasGetVector copy data to and from that buffer (also specifying the number of elements, the element size in bytes, and the strides for the source and destination buffers). All functions return a uint32 status that should be 0; the voidPtr return arguments were added by the MATLAB wrapper.

The code runs without crashing and all return statuses are 0, but unfortunately the result outHost is still all zeros. I've consulted the documentation (http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/f42650.html) but I'm having difficulty mapping those examples to my case. Part of my confusion seems to be that I have a pointer to a memory location in the device's memory space.

Any "pointers" would be appreciated,
--Chris
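
As a point of comparison, here is a host-only C sketch of the allocate / set / get / free pattern described above. It does not call CUBLAS: mock_alloc, mock_copy_vector and round_trip are hypothetical stand-ins, and malloc plays the role of device memory. The sketch only illustrates why the allocation function takes a void ** (it writes the new address back through that argument) while the copy functions take a plain void * to an existing buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an allocation call with a void ** out-parameter.
 * The new address is written through dev_ptr. Returns 0 on success. */
int mock_alloc(int n, int elem_size, void **dev_ptr)
{
    *dev_ptr = malloc((size_t)n * (size_t)elem_size);
    return *dev_ptr == NULL;
}

/* Hypothetical stand-in for the set/get copies: moves n elements of
 * elem_size bytes from src (stride incx) to dst (stride incy).
 * Strides are assumed to be positive. */
int mock_copy_vector(int n, int elem_size,
                     const void *src, int incx, void *dst, int incy)
{
    const char *s = src;
    char *d = dst;
    for (int i = 0; i < n; ++i)
        memcpy(d + (size_t)i * incy * elem_size,
               s + (size_t)i * incx * elem_size,
               (size_t)elem_size);
    return 0;
}

/* Round trip: allocate, copy in, copy out, free. Returns 0 on success. */
int round_trip(const float *in, float *out, int n)
{
    void *dev = NULL;                       /* filled in by mock_alloc */
    assert(n > 0);
    if (mock_alloc(n, (int)sizeof(float), &dev))
        return 1;
    mock_copy_vector(n, (int)sizeof(float), in, 1, dev, 1);   /* "set" */
    mock_copy_vector(n, (int)sizeof(float), dev, 1, out, 1);  /* "get" */
    free(dev);
    return 0;
}
```

In C the caller passes &dev to the allocator and reads the result from its own out array afterwards. The MATLAB code has to reproduce both of those pointer relationships through libpointer objects.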

From: Christopher on 25 Apr 2010 13:53

After studying the documentation on libpointers some more, I was able to fix my code so that it works:

inHost = single(1 : 10); % Test data (single precision, 4 bytes per element)
devPtr = libpointer('voidPtr'); % Receives the device address from cublasAlloc
% Allocate space for final result
outHostPtr = libpointer('voidPtr', zeros(size(inHost), class(inHost)));
assert(~calllib('libcublas', 'cublasAlloc', length(inHost), 4, devPtr));
assert(~calllib('libcublas', 'cublasSetVector', length(inHost), 4, ...
    inHost, 1, devPtr, 1)); % Copy host -> device
calllib('libcublas', 'cublasSscal', length(inHost), 2.0, devPtr, 1); % Scale by 2 on the device
assert(~calllib('libcublas', 'cublasGetError')); % Check the status of the cublasSscal call
assert(~calllib('libcublas', 'cublasGetVector', length(inHost), 4, ...
    devPtr, 1, outHostPtr, 1)); % Copy device -> host
assert(~calllib('libcublas', 'cublasFree', devPtr));
disp(outHostPtr.Value);

Note that I changed it to single precision because that's what my CUDA card supports, and I added a call to cublasSscal to multiply the vector elements by two. I also changed devPtr from libpointer('voidPtrPtr') to libpointer('voidPtr'), because that seems to be the approach for allocation functions. Finally, I made an explicit libpointer "outHostPtr" to hold the output results, and realized that the result had to be accessed using the Value property.

I'm still not 100% confident in my usage, so if anyone sees potential problems or can think of simplifications or improvements, I'd be interested in hearing about them.

--Chris
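
One way to gain confidence in the values printed by disp(outHostPtr.Value) is to compare them against a host-side reference. Below is a minimal C sketch of what the BLAS SSCAL operation computes (each selected element multiplied by alpha). ref_sscal is a hypothetical helper name, not a CUBLAS function, and it assumes a positive stride.

```c
#include <assert.h>

/* Hypothetical host-side reference for SSCAL:
 * x[i * incx] = alpha * x[i * incx] for i = 0 .. n-1.
 * Does nothing when n or incx is not positive. */
void ref_sscal(int n, float alpha, float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return;
    for (int i = 0; i < n; ++i)
        x[i * incx] *= alpha;
}
```

For the test data in the post above (1 through 10, alpha = 2.0, stride 1), this reference produces 2, 4, ..., 20, which is what the GPU result should match.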
