
Cannot use int8


I tried to use 8xA100 to run BLOOM, but I cannot load it with `load_in_8bit`. Following the instructions, I load the model with `model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', load_in_8bit=True, max_memory=max_memory)`. If I omit `max_memory=max_memory`, most of the memory goes to `gpu:0` and I get a CUDA out-of-memory error. If I pass `max_memory=max_memory`, it throws "8-bit operation are not supported under CPU".

(screenshot of the error attached)
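For reference, a minimal sketch of the failing setup, assuming eight GPUs capped at 3 GiB each (the checkpoint name and the `max_memory` values are illustrative, inferred from the reply below):

```python
from transformers import AutoModelForCausalLM

model_name = "bigscience/bloom"  # assumed checkpoint; the report only says "BLOOM"

# Illustrative cap of 3 GiB per GPU across 8 devices; with limits this low,
# accelerate offloads most layers to CPU, which int8 does not support.
max_memory = {i: "3GiB" for i in range(8)}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",      # shard layers across the available devices
    load_in_8bit=True,      # quantize weights to int8 via bitsandbytes
    max_memory=max_memory,
)
```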

RiverDong avatar Aug 14 '22 05:08 RiverDong

Looking again at this error, I realize the problem is likely that you set the memory threshold too low in `max_memory`. You are currently allowing 3 GB per GPU, for a total of 24 GB across 8 GPUs, but BLOOM needs ~180 GB of GPU memory. You can set it to ~36 GB per GPU if you have the 40 GB A100s (or higher if you have the 80 GB ones).
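In code, that would look something like the following sketch (the 36 GiB cap is an assumption that leaves headroom for activations and CUDA overhead; adjust to your hardware):

```python
# Sketch of a corrected mapping for 8x A100 40 GB: ~36 GiB usable per GPU,
# so 8 * 36 GiB = 288 GiB total, comfortably above BLOOM's ~180 GB footprint.
max_memory = {i: "36GiB" for i in range(8)}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
    max_memory=max_memory,
)
```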

We will fix the error message to note that it appears when not enough GPU memory is allocated, causing layers to be offloaded to the CPU.

TimDettmers avatar Aug 14 '22 19:08 TimDettmers