
Does it run on a single NVIDIA RTX A4000?

exander77 opened this issue 1 year ago • 7 comments

Does it run on a single NVIDIA RTX A4000, or do I need two or more?

exander77 avatar Mar 21 '23 11:03 exander77

It looks like, as of right now, you need at least 48 GB of VRAM, e.g. an A100 80GB. There are people on the Discord server who have managed to run it on smaller GPUs, either by splitting the model across multiple GPUs or by using quantization.

I would check out the Discord for more info/help

orangetin avatar Mar 21 '23 23:03 orangetin
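
For anyone trying the quantization route mentioned above, here is a minimal sketch, assuming the Hugging Face transformers, accelerate, and bitsandbytes packages and the togethercomputer/GPT-NeoXT-Chat-Base-20B checkpoint; it is an illustration, not the repo's official inference path:

```python
# Sketch: loading the 20B chat model in 8-bit so it fits in less GPU memory.
# Requires: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # let accelerate place layers on available GPUs/CPU
    load_in_8bit=True,   # int8 weights: roughly half the fp16 footprint
)

prompt = "<human>: What is a language model?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```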

@exander77 you can find the Multi GPU discord thread here: https://discord.com/channels/1082503318624022589/1082510608123056158/1084210191635058759

amaliako avatar Mar 28 '23 14:03 amaliako

I've had marginal luck using a 4090 with 24 GB of VRAM. The trick was to not give it ALL of your memory, because it needs some for data loading and some for processing. Quantization helped some too.

joecodecreations avatar Mar 31 '23 06:03 joecodecreations
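
The trick of not handing the model all of your VRAM can be made explicit when loading through Hugging Face transformers/accelerate. A minimal sketch with illustrative memory limits and an assumed checkpoint name:

```python
# Sketch: capping how much VRAM the model may use so headroom remains for
# activations and data loading; anything that doesn't fit goes to CPU RAM
# (and spills to disk via offload_folder if needed).
import torch
from transformers import AutoModelForCausalLM

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"  # assumed checkpoint

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "20GiB", "cpu": "64GiB"},  # leave a few GB of a 24 GB card free
    offload_folder="offload",
)
```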

(Screenshot) Some output from the 4090

joecodecreations avatar Mar 31 '23 06:03 joecodecreations

> I've had marginal luck using a 4090 with 24 GB of VRAM. The trick was to not give it ALL of your memory, because it needs some for data loading and some for processing. Quantization helped some too.

Yup. This issue was opened before the Pythia base model was out. This should run on GPUs with more than 12 GB of VRAM, or on GPUs with less than that by offloading to CPU/disk.

The GPT-NeoX-20B model still requires about 40 GB of memory just to be loaded.

This issue can be closed. Yes, the Pythia model can run inference on a single Nvidia RTX A4000.

orangetin avatar Mar 31 '23 13:03 orangetin
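
A minimal sketch of single-GPU inference with the Pythia-based model, assuming the togethercomputer/Pythia-Chat-Base-7B checkpoint and plain Hugging Face transformers (not the repo's own inference script):

```python
# Sketch: running the Pythia-based chat model on a single ~16 GB card in fp16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/Pythia-Chat-Base-7B"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # ~14 GB of weights for a 7B model
).to("cuda:0")

prompt = "<human>: Does OpenChatKit run on one GPU?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```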


I can confirm the Pythia model works on a single Nvidia RTX A4000. I am figuring out what I need for GPT-NeoX-20B.

GPU VRAM usage with the Pythia model: 14595 MiB / 16376 MiB

exander77 avatar Apr 04 '23 10:04 exander77

> I can confirm the Pythia model works on a single Nvidia RTX A4000. I am figuring out what I need for GPT-NeoX-20B.
>
> GPU VRAM usage with the Pythia model: 14595 MiB / 16376 MiB

That's great! For the GPT-NeoX-20B model you'll either need more than 40 GB of VRAM (roughly 20 GB in 8-bit), or you can use CPU offloading to run it on your A4000 by adding the flag -g 0:14 (there's a PR up that should let you increase the 14 to 16).

orangetin avatar Apr 04 '23 14:04 orangetin
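
A quick back-of-the-envelope check of those numbers (weights only, ignoring activations and the KV cache):

```python
# Rough memory arithmetic for GPT-NeoX-20B weights.
params = 20e9                                # parameter count
print(f"fp16: {params * 2 / 1e9:.0f} GB")    # ~40 GB -> needs >40 GB VRAM
print(f"int8: {params * 1 / 1e9:.0f} GB")    # ~20 GB in 8-bit
```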