OpenChatKit
Does it run on a single NVIDIA RTX A4000?
Does it run on a single NVIDIA RTX A4000, or do I need two or more?
It looks like, as of right now, you need at least 48 GB of VRAM, e.g. an A100 80GB. There are people on the Discord server who have managed to run it on smaller GPUs, either by splitting the model across multiple GPUs or by using quantization.
I would check out the Discord for more info/help
@exander77 you can find the Multi GPU discord thread here: https://discord.com/channels/1082503318624022589/1082510608123056158/1084210191635058759
I've had marginal luck using a 4090 with 24 GB of VRAM. The trick was to not give it ALL of your memory, because it needs some for the data load and some for the processing. Quantization helped some too.
Some output from the 4090
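For context, a minimal sketch of that approach using the Hugging Face transformers/accelerate stack (not OpenChatKit's own scripts); the `max_memory` split is an illustrative assumption for a 24 GB card, not a tuned value from this thread:

```python
# Illustrative sketch: load the 20B chat model in 8-bit and cap GPU memory
# so a few GB of VRAM stay free for activations during generation.
# Requires transformers, accelerate, and bitsandbytes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",    # let accelerate place layers across GPU/CPU
    load_in_8bit=True,    # bitsandbytes 8-bit quantization
    # assumed split: leave ~4 GB of VRAM headroom on a 24 GB card
    max_memory={0: "20GiB", "cpu": "48GiB"},
)

prompt = "<human>: Hello!\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```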
Yup. This issue was from before the Pythia base model was out. This should run on GPUs with >12 GB of VRAM, or on cards with <12 GB of VRAM with offloading to CPU/disk.
The GPT-NeoX-20B model still requires 40 GB of memory to be loaded.
This issue can be closed. Yes, the Pythia model can run inference on a single Nvidia RTX A4000.
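For anyone landing here later, here's a sketch of what that CPU/disk offloading can look like with transformers/accelerate; the memory limits and offload path are assumptions for a <12 GB card, not tested values:

```python
# Illustrative sketch: device_map="auto" fills the GPU up to max_memory,
# spills remaining layers to CPU RAM, and pages anything beyond that to
# the offload_folder on disk. Slower, but fits on small cards.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/Pythia-Chat-Base-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    max_memory={0: "10GiB", "cpu": "24GiB"},  # assumed limits for a <12 GB GPU
    offload_folder="./offload",               # disk spillover for leftover weights
)
```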
I can confirm the Pythia model works on a single Nvidia RTX A4000. I am figuring out what I need for GPT-NeoX-20B.
GPU VRAM usage with Pythia model: 14595MiB / 16376MiB
That's great! You'll either need >40 GB of VRAM (~20 GB in 8-bit) for the GPT-NeoX-20B model, or use CPU offloading to run it on your A4000 by adding the flag `-g 0:14` (there's a PR up that should let you increase the 14 to 16).
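For reference, the full invocation might look something like this; `inference/bot.py` and the `--model` flag are my assumptions about the repo's CLI, and only the `-g 0:14` part is confirmed above:

```bash
# assumed entry point and flag names; -g caps GPU 0 at 14 GiB
# and offloads the rest of the model to CPU
python inference/bot.py \
    --model togethercomputer/GPT-NeoXT-Chat-Base-20B \
    -g 0:14
```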