hashkill

Strange compile error

Open nikolavp opened this issue 12 years ago • 10 comments

After running

./configure
make

I got the following error.

Compiling nvidia_bfunix without flags...
clBuildProgram(): CL_INVALID_BINARY
Log:
==============
ptxas error   : Entry function 'bfunix' uses too much shared data (0x8070 bytes + 0x10 bytes system, 0x4000 max)

I am not sure whether I am doing something wrong here (should I call make without some of the features?) or whether this is a bug. I am currently using Ubuntu 12.10.

Strangely enough, I use Arch Linux at home and was able to compile it there. Please let me know what you think.

nikolavp, Oct 23 '12 08:10

Hello,

Could you please provide the GPUs installed at both places and the driver versions if possible?

gat3way, Oct 23 '12 16:10

OK, sorry for the huge delay; I completely forgot about this issue :(

Here is a gist with the lspci -vv output from the working and the non-working machine. Tell me if you need anything else.

nikolavp, Nov 26 '12 13:11

I now see the problem. The nvidia bfunix kernel indeed uses about 32KB of local memory (0x8070 bytes in the log above), while the oldest supported Nvidia devices have just 16KB (0x4000 bytes) per compute unit. Interestingly though, the cross-compiler happily compiles the sm10 kernel binaries on my system.

The solution is either to have a separate sm10 kernel or to disable bfunix for old Nvidia GPUs. Disabling is probably the better option, as the kernel would be very slow on them anyway, quite likely much slower than on a modern CPU.
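The numbers in the ptxas message can be checked directly. Below is a minimal sketch in C that uses only the values from the log above. The function names fits_in_shared and shortfall are hypothetical, chosen for illustration, and are not part of hashkill.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the kernel's shared data plus the system reservation
 * fits within the per-compute-unit limit, 0 otherwise. */
int fits_in_shared(size_t kernel_bytes, size_t system_bytes, size_t limit_bytes)
{
    assert(limit_bytes > 0);
    return kernel_bytes + system_bytes <= limit_bytes;
}

/* Returns how many bytes the request exceeds the limit by (0 if it fits). */
size_t shortfall(size_t kernel_bytes, size_t system_bytes, size_t limit_bytes)
{
    size_t need = kernel_bytes + system_bytes;
    return need > limit_bytes ? need - limit_bytes : 0;
}
```

With the logged values, fits_in_shared(0x8070, 0x10, 0x4000) is 0: the kernel asks for 32896 bytes against a 16384-byte limit, so it is over by 16512 bytes.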

gat3way, Nov 26 '12 14:11

Well, as far as I understand, the GPU in this desktop machine is pretty weak and the code assumes the GPU has more memory? If that is the case, can you provide a compile switch to disable GPU support altogether, or give me hints about where I can disable it?

I will be glad to provide a patch if I can ;)

Best, Nikola

nikolavp, Nov 26 '12 17:11

The simplest workaround would be to open src/kernels/nvidia_bfunix.cl and put at the top:

#ifndef SM10

and at the bottom:

#endif

Not the best solution (there is no warning that bcrypt is not supported on that GPU), but at least it would compile.

gat3way, Nov 26 '12 17:11

Fixed in git now.

gat3way, Dec 08 '12 22:12

I had pretty much the same error compiling version 0.3.1:

Compiling nvidia_bfunix without flags...
Log:
==============
ptxas error   : Entry function 'bfunix' uses too much shared data (0x8070 bytes + 0x10 bytes system, 0x4000 max)

Here is my VGA's "lspci -vv": https://gist.github.com/gabrielmagno/7257498

gabrielmagno, Oct 31 '13 21:10

That's not good. It looks like some non-sm_10 devices also lack enough shared memory. Please use the #ifndef workaround until I implement a proper fix for that (which, minor as it sounds, won't happen in the next few days; I am preoccupied with the a51 stuff right now :( )

gat3way, Oct 31 '13 21:10

I created this patch, which fixed the issue (at least for my device): https://gist.github.com/gabrielmagno/7265841

I used clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE to get the amount of local memory. Then I introduced the flag LOCMEM16K, which is added to the compiler flags if the device has exactly 16K of local memory. Finally, I inserted an #ifndef LOCMEM16K in nvidia_bfunix.cl.
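The flag selection described here can be sketched in plain C. This is an illustration under stated assumptions, not the code from the patch: the helper name add_locmem_flag and its return codes are hypothetical, and the local memory size is passed in as a plain integer so the sketch builds without OpenCL headers. In real host code that value would come from clGetDeviceInfo queried with CL_DEVICE_LOCAL_MEM_SIZE, which writes a cl_ulong.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCMEM_16K (16ULL * 1024ULL)

/* Appends " -DLOCMEM16K" to the build-flag string when the device
 * reports exactly 16 KB of local memory, as the patch description says.
 * Returns 1 if the flag was appended, 0 if it was not needed,
 * and -1 if the buffer is too small (flags is left unchanged).
 * Hypothetical helper: name and return codes are assumptions. */
int add_locmem_flag(char *flags, size_t cap, unsigned long long local_mem_bytes)
{
    static const char define[] = " -DLOCMEM16K";
    size_t used;

    assert(flags != NULL);
    if (local_mem_bytes != LOCMEM_16K)
        return 0;

    used = strlen(flags);
    /* sizeof(define) counts the terminating NUL byte as well. */
    if (used + sizeof(define) > cap)
        return -1;

    memcpy(flags + used, define, sizeof(define));
    return 1;
}
```

The resulting string would then be passed as the options argument of clBuildProgram, so that the #ifndef LOCMEM16K guard in nvidia_bfunix.cl leaves the kernel out on such devices. Comparing against a threshold instead of an exact 16K match would be an alternative design, but that goes beyond what the patch description states.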

gabrielmagno, Nov 01 '13 14:11

Great job!

I will integrate it as soon as I have some time (perhaps tonight or tomorrow).

Thanks!

gat3way, Nov 04 '13 10:11