OpenChatKit
Can I fine-tune GPT-NeoXT-Chat-Base-20B with 8 A100s?
Could you describe the computing resources needed for the experiment?
Similar question: how can I run inference with the model on 4x V100?
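One approach that may work for multi-GPU inference is sharding the fp16 weights across the visible GPUs. A minimal sketch, assuming the Hugging Face checkpoint togethercomputer/GPT-NeoXT-Chat-Base-20B and the accelerate package are installed; not verified on 4x V100:

```python
# Sketch: shard the ~40 GB of fp16 weights across all visible GPUs.
# V100 has no bfloat16 support, hence float16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # accelerate places layers across GPUs automatically
)

prompt = "<human>: What is a good name for a pet llama?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With device_map="auto", 4x 32 GB should hold the fp16 weights plus activations, at the cost of some cross-GPU traffic during generation.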
Similar question: what is the minimum VRAM requirement to fine-tune the model? Would 4x 4090 work?
I would really like to know this too; it should probably be in the README. I have one 3090 to stand this up with before I can ask for more resources. If it's really big, I might try to scale the model down and submit a request for a mini model to do sanity checks on local systems and such.
Similar question: what is the minimum requirement to fine-tune the model if I want to add my own docs?
We train this model on 8x A100 80GB GPUs. I'll update the README.
I... submit a request for a mini model to do sanity checks on local systems and such
This is a great idea! Will keep this issue open to track adding such a model.
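In the meantime, here is a rough sketch of a tiny GPT-NeoX-style model that can stand in for local pipeline sanity checks, assuming the transformers GPTNeoX classes; the sizes below are arbitrary and are not the configuration of any planned mini model:

```python
# Tiny GPT-NeoX-architecture model for local sanity checks (sizes are arbitrary).
from transformers import AutoTokenizer, GPTNeoXConfig, GPTNeoXForCausalLM

config = GPTNeoXConfig(
    vocab_size=50432,          # embedding size used by GPT-NeoX-20B
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=1024,
    max_position_embeddings=2048,
)
model = GPTNeoXForCausalLM(config)
print(f"parameters: {sum(p.numel() for p in model.parameters()) / 1e6:.1f}M")

# Reusing the 20B tokenizer keeps the data pipeline identical to a real run.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```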
Can I train it on a single A100 80GB GPU, or on fewer than eight? Does it just take more time, or will it not run at all?
Can I fine-tune the model on 8x V100 32GB GPUs with a smaller batch size?
up
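For the questions about fewer or smaller GPUs: the reference setup above is 8x A100 80GB, i.e. 640 GB of aggregate VRAM. A quick way to see what a given machine offers (plain PyTorch, no OpenChatKit assumptions):

```python
# Report per-GPU and aggregate VRAM to compare against the 8x A100 80GB reference.
import torch

total_gib = 0.0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    gib = props.total_memory / 1024**3
    total_gib += gib
    print(f"GPU {i}: {props.name}, {gib:.0f} GiB")
print(f"Aggregate VRAM: {total_gib:.0f} GiB")
```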
How long does it take to train on 8x A100?
About an hour per 100 steps. Usually, we fine-tune for a couple of days.
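Spelling that estimate out (the per-step time and the roughly two-day duration come from the comment above; the step count is just the implied arithmetic):

```python
# Implied step count: ~1 hour per 100 steps, fine-tuning for roughly 2 days.
hours_per_100_steps = 1.0
training_hours = 2 * 24
steps = training_hours / hours_per_100_steps * 100
print(f"~{steps:.0f} steps over {training_hours} hours")  # ~4800 steps
```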