pytorch_block_sparse
Using a smaller block size
Hi,
First of all, thanks for setting up this package :) It's super helpful.
I'm wondering, is there a way to use a smaller block size? I tried modifying the Python code so that no errors are thrown, but I'm hitting a
RuntimeError: CUDA error: an illegal memory access was encountered
error when calling the CUDA kernel. I tried to look a bit into the kernel code, and it seems that the block_size argument is not used, so I'm curious how the kernel knows to expect a minimum block size of 32.
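For reference, here is roughly what I tried (a sketch from memory of the source, so the exact names like `BlockSparseMatrix.randn`, its `block_shape` argument, and `reverse_matmul` may not match the code precisely):

```python
import torch
from pytorch_block_sparse import BlockSparseMatrix

# With the default block_shape=(32, 32) everything works; changing it to
# (16, 16) is the only modification, and that's when the multiply fails.
weight = BlockSparseMatrix.randn(
    (256, 256),
    n_blocks=64,
    block_shape=(16, 16),  # was (32, 32)
    device="cuda",
)

x = torch.randn(8, 256, device="cuda")
# RuntimeError: CUDA error: an illegal memory access was encountered
y = weight.reverse_matmul(x, transpose=True)
```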
Any clarifications would be super helpful!
Thanks