
8bit model

Open zshobbs opened this issue 2 years ago • 0 comments

Run the LLMs across multiple GPUs, using 8-bit models to reduce the VRAM footprint. "facebook/opt-30b" runs on two NVIDIA RTX 3090s. "facebook/opt-66b" might squeeze onto bigger GPUs, or you can use float16 with CPU or NVMe/SSD offload.
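As a rough sanity check on those sizes, here is a back-of-the-envelope sketch of the weight footprint. It assumes the nominal parameter counts implied by the model names (30B and 66B), 1 byte per parameter for 8-bit and 2 bytes for float16, and 24 GB per RTX 3090. It ignores activations, the KV cache, and any layers kept in higher precision, so real usage will be higher. The function names are made up for illustration.

```python
GB = 1e9  # decimal gigabytes, good enough for a rough estimate


def weight_footprint_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GB."""
    return n_params * bytes_per_param / GB


def fits_on_gpus(n_params: float, bytes_per_param: float,
                 gpu_gb: float, n_gpus: int) -> bool:
    """True if the weights alone fit in the combined VRAM of the GPUs."""
    return weight_footprint_gb(n_params, bytes_per_param) <= gpu_gb * n_gpus


# Nominal sizes (assumption: parameter count taken from the model name).
OPT_30B = 30e9
OPT_66B = 66e9

# Two RTX 3090s give 2 x 24 GB = 48 GB in total.
opt30_int8_ok = fits_on_gpus(OPT_30B, 1, gpu_gb=24, n_gpus=2)  # ~30 GB of weights
opt30_fp16_ok = fits_on_gpus(OPT_30B, 2, gpu_gb=24, n_gpus=2)  # ~60 GB of weights
opt66_int8_ok = fits_on_gpus(OPT_66B, 1, gpu_gb=24, n_gpus=2)  # ~66 GB of weights
```

By this estimate only the 8-bit 30B weights fit in 48 GB, which is consistent with needing bigger GPUs or offload for the 66B model.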

This uses Hugging Face Accelerate and bitsandbytes.
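For anyone who wants to try this, a minimal sketch of the two loading modes might look like the following. This is not necessarily how the repo does it. It assumes a recent `transformers` release with `accelerate` and `bitsandbytes` installed; the helper names and the `offload` directory are placeholders. The imports sit inside the functions so the file can be read or imported without the GPU stack present.

```python
MODEL_8BIT = "facebook/opt-30b"   # model named in the issue for 2x RTX 3090
MODEL_FP16 = "facebook/opt-66b"   # candidate for float16 plus offload


def load_8bit(model_id: str = MODEL_8BIT):
    """Load a model in 8-bit, letting Accelerate spread layers over the GPUs."""
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # Accelerate decides which layer goes where
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    )
    return tokenizer, model


def load_fp16_with_offload(model_id: str = MODEL_FP16,
                           offload_dir: str = "offload"):
    """Load in float16; layers that do not fit go to CPU RAM, then to disk."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype=torch.float16,
        offload_folder=offload_dir,  # hypothetical path, ideally on an NVMe/SSD
    )
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 20) -> str:
    """Run a short generation to confirm the model loaded correctly."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Typical use would be `tokenizer, model = load_8bit()` followed by `generate(tokenizer, model, "Hello, my name is")`. Expect the first call to download the full checkpoint, and expect offloaded layers to be much slower than layers resident on the GPU.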

zshobbs, Jan 25 '23 23:01