jetson-containers

GPU support for text generation web UI - Jetson Nano

Open thebigboss84 opened this issue 1 year ago • 8 comments

Hi, I'm seeing this message in the terminal when running the text-generation-webui container:

/usr/local/lib/python3.8/dist-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "

I want to try TinyLlama or Phi, so a relatively small model. I'm just trying to get familiar.

thebigboss84 • Oct 10 '23 04:10

Hmm, thanks. I test bitsandbytes GPU support in the container to make sure another version doesn't get installed, but let me double-check. Is it just a warning you can ignore, or does it end up in an error?

llama.cpp is the fastest API/loader in text-generation-webui, followed by exllama.

dusty-nv • Oct 10 '23 04:10

Wow, thanks for the fast response! Well, I'm unable to load the model; my Jetson freezes. I have 4GB of swap, so maybe I need to increase that. But I also saw this error:

RuntimeError: CUDA driver initialization failed, you might not have a CUDA gpu.

To be sure, this is the container I'm using:

./run.sh $(./autotag text-generation-webui)

thebigboss84 • Oct 10 '23 05:10
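
To tell the bitsandbytes warning apart from the CUDA initialization error, it can help to check whether PyTorch itself sees the GPU from inside the container. The following is only a minimal sketch: the helper name `cuda_status` is hypothetical, and it assumes PyTorch may or may not be installed in the environment where it runs.

```python
def cuda_status():
    """Return a short description of whether PyTorch can see a CUDA GPU."""
    try:
        import torch  # only importable if PyTorch is installed here
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "torch installed, CUDA not available"
    # Device 0 is the integrated GPU on a single-GPU board.
    return "CUDA available: " + torch.cuda.get_device_name(0)


if __name__ == "__main__":
    print(cuda_status())
```

If this reports that CUDA is not available, the problem lies below bitsandbytes, in the container, driver, or hardware support, rather than in the quantization library.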

@thebigboss84 are you on Orin Nano or the original Jetson Nano, and what version of JetPack-L4T are you running?

dusty-nv • Oct 10 '23 05:10

@dusty-nv I'm on the Jetson Nano dev kit 4GB, freshly installed from the SD card image, nvidia-l4t-core 32.7.1-20220219090432

thebigboss84 • Oct 10 '23 06:10

The text-generation-webui container is built for JetPack 5; the GPU in the original Nano is too old to be supported by many of the packages. That isn't to say you couldn't get a bare-bones oobabooga container running with just PyTorch, HF Transformers, and text-generation-webui. I have a Transformers build for JetPack 4, but it doesn't have bitsandbytes quantization because that won't build for JetPack 4.

dusty-nv • Oct 10 '23 06:10
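
The nvidia-l4t-core version string reported above is enough to tell which JetPack generation a board is on. As a rough sketch, the snippet below parses such a string and maps the L4T major release to a JetPack major version. The function names are hypothetical, and the mapping only covers the two releases named in this thread (R32 for JetPack 4, R35 for JetPack 5); anything else is returned as unknown rather than guessed.

```python
import re

# Only the releases mentioned in this thread; others are left out on purpose.
L4T_TO_JETPACK_MAJOR = {32: 4, 35: 5}


def parse_l4t(version):
    """Split a string like '32.7.1-20220219090432' into (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized L4T version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def jetpack_major(version):
    """Return the JetPack major version for an L4T version, or None if unknown."""
    return L4T_TO_JETPACK_MAJOR.get(parse_l4t(version)[0])


if __name__ == "__main__":
    reported = "32.7.1-20220219090432"  # value posted earlier in the thread
    print(parse_l4t(reported), jetpack_major(reported))
```

For the version posted in the thread this yields L4T 32.7.1, that is, JetPack 4, which is why a container built for JetPack 5 does not fit this board.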

The documentation says it's compatible; that's why I didn't suspect any issues.

Container images are compatible with other minor versions of JetPack/L4T:
• L4T R32.7 containers can run on other versions of L4T R32.7 (JetPack 4.6+)
• L4T R35.x containers can run on other versions of L4T R35.x (JetPack 5.1+)

thebigboss84 • Oct 10 '23 06:10

@thebigboss84 yea, that says JetPack 4.6 containers can run on other versions of JetPack 4.6 and JetPack 5 containers can run on other versions of JetPack 5. Original Nano only supports JetPack 4. I guess the + at the end is confusing - sorry about that (made a note to clarify that)

dusty-nv • Oct 10 '23 12:10
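
Read as a rule, the compatibility note says a container runs only on a host from the same L4T series: R32.7 images on R32.7 hosts, R35.x images on R35.x hosts. A small sketch of that rule follows. The function names are hypothetical, and treating any series outside R32 and R35 as "not covered" is an assumption of this sketch, not something the documentation states.

```python
def l4t_parts(version):
    """Return the numeric (major, minor) of a version like '35.4.1' or 'r32.7.1'."""
    numbers = version.lower().lstrip("r").split("-")[0].split(".")
    return int(numbers[0]), int(numbers[1])


def container_runs_on(container_l4t, host_l4t):
    """Apply the quoted compatibility note to a container/host pair."""
    c_major, c_minor = l4t_parts(container_l4t)
    h_major, h_minor = l4t_parts(host_l4t)
    if c_major != h_major:
        return False  # e.g. a JetPack 5 (R35) image on a JetPack 4 (R32) host
    if c_major == 32:
        return c_minor == h_minor  # R32.7 containers only on R32.7 hosts
    if c_major == 35:
        return True  # R35.x containers on any R35.x host
    return False  # series not covered by the note; treated as unknown here


if __name__ == "__main__":
    # The situation in this thread: R35 image, original Nano on R32.7.1.
    print(container_runs_on("35.4.1", "32.7.1-20220219090432"))
```

The example call mirrors the case in this thread: an image built for R35 checked against an original Nano host on R32.7.1.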

@dusty-nv Thank you for the blazing-fast response and the clarification.

thebigboss84 • Oct 10 '23 15:10