nebuly
[Chatllama] training 20b model hardware requirements
I saw the hardware requirements for training ChatLLaMA:
13B to 20B → 8x Nvidia A100 (80GB)
But check out this article from HF, where they show how to fine-tune a 20B model on a single RTX 4090:
https://huggingface.co/blog/trl-peft
Can this method be used with ChatLLaMA (i.e. loading the model in 8-bit and training only small adapter layers)?
Hi @ehartford, thank you for reaching out. Support for PEFT is clearly one of our short-term goals. In addition, we are working on improving the DeepSpeed integration and on supporting weight offloading for all supported models. In this way we intend to let users train their ChatLLaMA model on, in theory, any GPU.
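The gap between the two hardware requirements comes down to what has to live in GPU memory. A rough sketch of the arithmetic (the function name, the 3x fp32 overhead model for gradients plus Adam moments, and the ~0.1% trainable fraction for LoRA-style adapters are my own illustrative assumptions, not numbers from the repo or the HF post):

```python
# Back-of-the-envelope GPU memory estimate for fine-tuning a 20B-parameter model.
# Illustrative only: ignores activations, CUDA context, and fragmentation.

def finetune_memory_gb(n_params: float, bytes_per_weight: float,
                       trainable_fraction: float) -> float:
    """Weights, plus fp32 gradients and two Adam moments for the
    trainable parameters only (3 extra fp32 copies, 4 bytes each)."""
    weights = n_params * bytes_per_weight
    trainable_overhead = n_params * trainable_fraction * 3 * 4
    return (weights + trainable_overhead) / 1e9

# Full fine-tuning in fp16: every weight is trainable.
full = finetune_memory_gb(20e9, bytes_per_weight=2, trainable_fraction=1.0)
# 8-bit base model with small trainable adapters (~0.1% of the weights).
lora = finetune_memory_gb(20e9, bytes_per_weight=1, trainable_fraction=0.001)
print(f"full fp16: {full:.0f} GB, 8-bit + adapters: {lora:.1f} GB")
```

Under these assumptions, full fp16 fine-tuning needs on the order of 280 GB (hence multiple A100s), while the 8-bit-plus-adapters setup needs roughly 20 GB, which is why it can fit on a single 24 GB RTX 4090 in the HF post.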