libgomp-plugin-nvptx.so.1 Not found

Open Panjaksli opened this issue 3 years ago • 18 comments

GCC 10.3, latest version... I need PTX offloading, and in fact it's really enabled, but I'm getting this error: libgomp: while loading libgomp-plugin-nvptx.so.1: "libgomp-plugin-nvptx.so.1": Module not found. It compiles, but doesn't work. Do I have to install CUDA or something?

Panjaksli avatar Apr 17 '21 18:04 Panjaksli

Weird, and now it has come down to "offloading not supported"... Is the flag -foffload=nvptx-none correct? In case I'm not making sense, could someone please tell me how to set up OpenMP with PTX computation? (Ideally in Code::Blocks)

Panjaksli avatar Apr 17 '21 19:04 Panjaksli

The offloading stuff is compiled in and part of the package (see folder mingw64/libexec/gcc/x86_64-w64-mingw32/10.3.0/accel/nvptx-none for 64-bit). But your error seems to indicate it doesn't have libgomp with offloading. I need to check that. Also it should be looking for .dll files on windows, not .so (or .so.1) files... Do you have a small test source code I can use to test building with offloading support?

brechtsanders avatar Apr 18 '21 15:04 brechtsanders

I think you are right, the nvptx acceleration wasn't configured properly. Can you test again with https://github.com/brechtsanders/winlibs_mingw/releases/tag/10.3.0-11.1.0-8.0.0-r2 ?

brechtsanders avatar Apr 18 '21 21:04 brechtsanders

It did compile this time, but I'm still getting the "libgomp-plugin-nvptx.so.1" not found error. And when I try to use target with OpenMP, it says: mkoffload.exe returned exit status 1. As for the test code (don't mind the printf, it's C++):

//params: -fopenmp -foffload=nvptx-none
#include <cstdio>   // printf
#include <iostream>
#include <omp.h>
using namespace std;

int main() {
    int threads = omp_get_max_threads();
    int devices = omp_get_num_devices();
    int i, j[100];
    printf("Threads: %d Devices: %d\n", threads, devices);
    //#pragma omp parallel for shared(j) //CPU
    #pragma omp target teams distribute parallel for shared(j) //GPU offload
    for (i = 0; i < 100; i++)
        //j[i] = omp_get_thread_num(); //CPU
        j[i] = omp_get_team_num(); //GPU
    for (i = 0; i < 100; i++)
        printf("%d ", j[i]);
    return 0;
}

Update: it looks like someone had the same trouble before. Maybe this helps...

"Note that OpenMP offloading in GCC uses additional code generation conventions on top of "standard" PTX conventions (see documentation for the option '-mgomp' in the manual), and for that reason support libraries are multilibbed: installed tree has an mgomp/ subdirectory with versions of libgomp.a, libgcc.a and others for OpenMP offloading. mkoffload also knows to select the multilibs by passing -mgomp when linking for OpenMP (i.e. when -fopenmp flag is present). If you configured with --disable-multilib, or accidentally have old mkoffload or incomplete install tree, that would explain the problem."

quoted from: https://gcc.gnu.org/legacy-ml/gcc-help/2020-01/msg00104.html

Panjaksli avatar Apr 20 '21 12:04 Panjaksli

libgomp-plugin-nvptx.so.1 isn't right at all. The correct file for Windows is mingw64/bin/libgomp-plugin-nvptx-1.dll. Somehow it's looking for the wrong file, whose name looks like a Linux/Unix shared object, not a Windows DLL.

I tried your example. Compiling worked:

g++ -c -o test.o  -fopenmp -foffload=nvptx-none test.cpp

To get it to link I had to use the following (somehow it didn't work with LTO):

g++ -o test.exe test.o -fno-lto -lgomp

The result looks like this:

./test.exe
libgomp: while loading libgomp-plugin-nvptx.so.1: "libgomp-plugin-nvptx.so.1": The specified module could not be found.
Threads: 32 Devices: 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

So I can reproduce your issue.

I copied libgomp-plugin-nvptx-1.dll to libgomp-plugin-nvptx.so.1 in the winlibs mingw64/bin folder and ran the test again and then there was no error. That's not a clean solution though as this really is a .dll file.

I have submitted a bug report to GCC, see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100160

brechtsanders avatar Apr 20 '21 18:04 brechtsanders

Nice, it links, but offloading doesn't work. omp_get_num_devices(); should return at least 1 (GPU). And testing it on the mandelbrot set, the "omp target" still uses CPU.

Panjaksli avatar Apr 20 '21 18:04 Panjaksli

I'm not an expert in this area. Maybe CUDA is needed after all? Do you have it installed?

brechtsanders avatar Apr 20 '21 19:04 brechtsanders

Just one last question: did you turn off --disable-multilib when setting up nvptx in GCC? From the little I found, that appeared to be the issue when someone else was compiling GCC with nvptx. Anyway, I went with Clang. It wanted CUDA, so I installed it, but I'm having trouble specifying its path (or Clang has trouble detecting it). I'm going to try an older CUDA next.

Panjaksli avatar Apr 20 '21 20:04 Panjaksli

I don't have anything installed on my system for CUDA, unless it comes with the NVIDIA drivers.

I will add --disable-multilib to the build process for the gcc nvptx offload engine.

I also found that there seems to be an nvptx64-nvidia-cuda target, but GCC doesn't know it. I guess that's the LLVM/Clang target you mentioned that requires CUDA.

brechtsanders avatar Apr 20 '21 21:04 brechtsanders

I meant that --disable-multilib must NOT be in there, if you had it there.

Also, I tried -fopenacc without specifying any offload target, and it gives this awesome error:

lto1.exe: error: '-foffload-abi' option can be specified only for offload compiler

And with only -fopenmp and no -foffload, I get this extra error:

lto1.exe: error: unrecognized command-line option '-mgomp'

Just throwing ideas at you... :P (I really know nothing about compilers)

Panjaksli avatar Apr 21 '21 08:04 Panjaksli

No, --disable-multilib wasn't in there. But to be safe I will explicitly add --enable-multilib. I also noticed the issue in lto1.exe, which you can avoid with -fno-lto (see my gcc bug report). Let's just wait and see what comes out of that bug report.

brechtsanders avatar Apr 21 '21 13:04 brechtsanders

Alrighty then. Thanks for everything!

Panjaksli avatar Apr 21 '21 13:04 Panjaksli

No progress yet, as the recently released GCC 11 doesn't build properly for nvptx yet (see my GCC bug report).

brechtsanders avatar May 02 '21 21:05 brechtsanders

In libgomp's target.c:

const char *prefix = "libgomp-plugin-";
const char *suffix = SONAME_SUFFIX (1);

In plugin-suffix.h (there are several platform-specific versions), none defines .dll. Furthermore, in plugin-nvptx.c the CUDA library name is hardcoded:

const char *cuda_runtime_lib = "libcuda.so.1";

tumagonx avatar Dec 08 '21 19:12 tumagonx

@tumagonx Looks like you found a bug in GCC. Has it been reported to the GCC developers yet?

brechtsanders avatar Dec 08 '21 19:12 brechtsanders

Nah, I'm tired of reporting bugs about Windows compatibility (been there). And since I'm still on Windows XP, it will only get worse.

tumagonx avatar Dec 08 '21 19:12 tumagonx

By the way, if we enable this plugin, does that mean we no longer need the CUDA toolkit (nvcc, ptxas and friends), just the CUDA runtime/driver? I need to unofficially support CUDA 8 on XP (there is a 32-bit runtime/driver, but no toolkit).

Edit: never mind. With cudart32_80.dll, cupti32_80.dll and cuinj32_80.dll, not much can be done on XP except pure CUDA development (no primitive library support), and it seems that neither GCC nor LLVM has its own PTX assembler, so they are only nvcc replacements. Not sure about nvptx-tools (it needs newlib, so it may not work on Windows either).

tumagonx avatar Dec 08 '21 20:12 tumagonx

FYI: I am able to build nvptx-tools from https://github.com/MentorEmbedded/nvptx-tools without CUDA.

brechtsanders avatar Dec 12 '21 16:12 brechtsanders