starcoder
OOM on T4 inference
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)
Could anyone help me resolve this problem? The T4 has almost 15 GB of GPU memory.
And if I use:
import torch

with torch.device(device):
    model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16)
there is another issue.
StarCoder will probably not fit on a T4. We added a hardware requirements section here; even in 8-bit mode the model takes almost 16 GB of memory.
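The ~16 GB figure follows from a back-of-the-envelope calculation on the model's published size (roughly 15.5B parameters). The sketch below estimates the memory needed just to hold the weights at each precision; it deliberately ignores activations, the KV cache, and CUDA context overhead, which add several more GB in practice:

```python
# Rough GPU memory needed just for StarCoder's weights, by dtype.
# Assumption: ~15.5B parameters (the published StarCoder size); activations,
# KV cache, and CUDA context overhead are not included.
GIB = 1024 ** 3

def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory in GiB required to store n_params weights at the given precision."""
    return n_params * bytes_per_param / GIB

n_params = 15.5e9  # StarCoder parameter count

fp32 = weight_memory_gib(n_params, 4)  # 4 bytes/param
fp16 = weight_memory_gib(n_params, 2)  # 2 bytes/param
int8 = weight_memory_gib(n_params, 1)  # 1 byte/param

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
# → fp32: 57.7 GiB, fp16: 28.9 GiB, int8: 14.4 GiB
```

So fp16 alone already needs roughly twice the T4's ~15 GB, and even int8 weights nearly fill the card before any activations are allocated, which is why inference still runs out of memory.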