Tim Dettmers
The transformation from col_ampere/col_turing to row-major is not supported by NVIDIA and is also not supported by my library. I will not implement it since it is a very complicated...
The transformer engine would be a perfect fit for this LLM.int8() algorithm. However, at this point, not enough details are known to say how big the advantage would be. Once...
Thanks for the reply @tomaarsen. Yes, the documentation on this is a bit lagging and it might be better to add this to the readme. I will add it with...
This is super helpful — thank you, everyone! I will add CUDA 11.8 as soon as possible!
CUDA 11.8 was added in the latest release. I also added code that gives some compilation and debugging instructions if the CUDA setup fails.
I will work on this ASAP. Thank you for your patience.
Not much I can do here; your GPU does seem to run out of memory. It seems your attention is too large to fit into memory. I recommend using something...
I believe this issue is caused by a missing reference to the multi-arch path to the CUDA driver. You can likely solve the problem by calling `sudo ldconfig`. See [here](https://askubuntu.com/questions/350068/where-does-ubuntu-look-for-shared-libraries)...
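As a rough way to check whether the dynamic linker can see the CUDA libraries before and after running `sudo ldconfig`, something like the sketch below may help. It uses only Python's standard `ctypes.util.find_library`. The helper name `find_cuda_libraries` and the choice of library names (`cuda`, `cudart`) are assumptions for illustration and are not part of the library discussed here.

```python
import ctypes.util


def find_cuda_libraries(names=("cuda", "cudart")):
    """Map each library name to what the system linker resolves it to.

    A value of None means the library was not found on the default
    search path. The names used here are illustrative assumptions.
    """
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in find_cuda_libraries().items():
        status = resolved if resolved is not None else "not found"
        print(f"lib{name}: {status}")
```

If a library shows as not found before `sudo ldconfig` and is resolved afterwards, the linker cache was probably the problem. If it is still missing, the library path itself may not be registered.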
I believe this is fixed in the latest version. It prints instructions on how to debug the situation and alternatively prints out compilation instructions which should fix the issue.
Thank you for the detailed issue. This was easy to replicate on my machine. This bug does not occur with `int8_threshold=0.0` which indicates the bug is likely some issue with...