llvm-ptx-samples

Ran into trouble when running the samples

Open cos opened this issue 13 years ago • 3 comments

Hello,

I'm running into trouble trying to create a kernel object from ptx:

Assertion failed: (result == CL_SUCCESS && "Failed to extract kernel"), function initialize, file /Users/cos/Projects/cs526/test2/llvm-ptx-samples/opencl/matmul/matmul.cpp, line 75.
Abort trap: 6
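
The assertion above only says that the result was not CL_SUCCESS; it does not show which error clCreateKernel returned. As a rough sketch for narrowing this down, the helper below maps the error codes that the OpenCL specification lists for clCreateKernel to their names. The function name kernelErrorName is hypothetical and not part of llvm-ptx-samples; the numeric values are the ones defined in the standard cl.h header.

```cpp
#include <string>

// Hypothetical helper (not part of llvm-ptx-samples): translates the error
// codes that clCreateKernel may return into their symbolic names.
// The numeric values match the definitions in the standard cl.h header.
std::string kernelErrorName(int code) {
  switch (code) {
    case 0:   return "CL_SUCCESS";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -30: return "CL_INVALID_VALUE";
    case -44: return "CL_INVALID_PROGRAM";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -46: return "CL_INVALID_KERNEL_NAME";
    case -47: return "CL_INVALID_KERNEL_DEFINITION";
    default:  return "unknown error";
  }
}
```

Printing kernelErrorName(result) just before the assertion would show, for example, whether the program has no usable executable for the device or whether the kernel name simply was not found in it.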

So it works for the .cl source but not for the PTX...

I'm using LLVM+Clang 3.2 on OS X.

Any ideas?

Thanks!

cos, Apr 22 '12 00:04

Interesting. I just tried with the latest LLVM/Clang sources, and it works for me on RHEL 6. I wonder if the NVIDIA OpenCL stack on Mac uses a different OpenCL binary format. Unfortunately, I do not have an NVIDIA Mac to test.

Could you dump the OpenCL binary from a working program and send it to me? My feeling is that the Mac implementation is wrapping the PTX inside of some kind of container.

To get the binary, you can use clGetProgramInfo. Something like:

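// Requires <fstream>, <iostream>, <vector>, and the OpenCL header.
cl_uint NumDevices = 0;
clGetProgramInfo(program, CL_PROGRAM_NUM_DEVICES, sizeof(cl_uint), &NumDevices, NULL);
if (NumDevices == 0) {
  std::cerr << "No binary found!\n";
  return 1;
}

// One binary size per device associated with the program.
std::vector<size_t> BinarySizes(NumDevices);
clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES, NumDevices*sizeof(size_t), &BinarySizes[0], NULL);

// CL_PROGRAM_BINARIES fills caller-allocated buffers, one per device.
std::vector<std::vector<unsigned char> > Storage(NumDevices);
std::vector<unsigned char*> Binaries(NumDevices);
for (size_t i = 0; i < NumDevices; ++i) {
  Storage[i].resize(BinarySizes[i] + 1);
  Binaries[i] = &Storage[i][0];
}

// The size argument is the size of the pointer array, not of the binaries.
clGetProgramInfo(program, CL_PROGRAM_BINARIES, NumDevices*sizeof(unsigned char*), &Binaries[0], NULL);

// Write the binary for the first device to disk.
std::ofstream myfile("kernel.bin", std::ios::binary);
myfile.write(reinterpret_cast<const char*>(Binaries[0]), BinarySizes[0]);
myfile.close();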

jholewinski, Apr 22 '12 01:04

Thanks for the quick response! I am still learning how to use OpenCL so I'll probably give you more (hopefully not different) info than needed to solve the problem.

I've added your code twice to the initialize() method of MatMulSample. I've also commented out the failing assertion. https://gist.github.com/2464419

The result is this:

------------------------------
* Source Kernel
------------------------------
Assertion failed: (result == CL_SUCCESS && "Unable to get profiling information"), function timeKernel, file /Users/cos/Projects/cs526/test2/llvm-ptx-samples/common/OCLSample.cpp, line 97.
Abort trap: 6

The two binary files generated are:
http://dl.dropbox.com/u/40561621/kernel_cl.bin
http://dl.dropbox.com/u/40561621/kernel_bin.bin

cos, Apr 22 '12 15:04

Yup, that's as I feared. On Mac, it looks like the kernel is compiled down to SASS (device assembly), and embedded in an Apple PList. If you're feeling adventurous, you can use ptxas (part of the CUDA Toolkit) to compile the generated PTX into SASS, which should give you an ELF object (by default, elf.o). If you take your kernel_bin.bin file, open it in a PList editor (like PListEdit Pro), and replace the clBinaryData entry with the contents of the elf.o file, you may be able to load that as an OpenCL binary.

In other words, it looks like doing manual OpenCL compilation on Mac is going to be a pain! :)
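
For anyone who wants to check what their own OpenCL implementation hands back before trying the PList surgery, here is a minimal sketch that guesses the container format of a dumped binary from its leading bytes. The function names classifyBinary and readFile are hypothetical, and the check is only a heuristic: it relies on the well-known "bplist00" header of binary property lists, the "<?xml" prologue of XML property lists, the ELF magic number, and the .version directive that PTX text carries near the top.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: guesses the format of a dumped OpenCL program binary
// by looking at its first bytes. This is a heuristic, not a full parser.
std::string classifyBinary(const std::string &data) {
  // Binary property lists start with the ASCII header "bplist00".
  if (data.compare(0, 8, "bplist00") == 0) return "binary plist";
  // XML property lists start with an XML prologue.
  if (data.compare(0, 5, "<?xml") == 0) return "XML plist";
  // ELF objects start with 0x7f followed by "ELF".
  if (data.compare(0, 4, "\x7f" "ELF") == 0) return "ELF object";
  // PTX is plain text with a .version directive near the top of the file.
  if (data.substr(0, 4096).find(".version") != std::string::npos)
    return "PTX text";
  return "unknown";
}

// Hypothetical helper: reads a whole file into a string in binary mode.
std::string readFile(const char *path) {
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

Calling classifyBinary(readFile("kernel_bin.bin")) on a dumped file gives a quick hint about whether the run-time expects raw PTX or a wrapped device binary.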

Unfortunately, as I do not have any NVIDIA Macs I cannot test this. If I get some time, I may try to create the PList file as part of the build process on Mac, but I'll need testers!

Regarding the new assertion failure, the OpenCL run-time is not able to load the binary, so I wouldn't expect anything to work after that.

jholewinski, Apr 22 '12 16:04