starcoder
OOM on T4 inference
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)
Could anyone help me resolve this problem? The T4 has almost 15 GB of GPU memory.
And if I use:
import torch

with torch.device(device):
    model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16)
there is another issue.
StarCoder will probably not fit on a T4. We added a hardware requirements section here; even in 8-bit mode the model takes almost 16 GB of memory.
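The ~16 GB figure follows from a back-of-the-envelope calculation on the model's published size (roughly 15.5B parameters). The sketch below estimates the memory needed just to hold the weights at each precision; it deliberately ignores activations, the KV cache, and CUDA context overhead, which add several more GB in practice:

```python
# Rough GPU memory needed just for StarCoder's weights, by dtype.
# Assumption: ~15.5B parameters (the published StarCoder size); activations,
# KV cache, and CUDA context overhead are not included.
GIB = 1024 ** 3

def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory in GiB required to store n_params weights at the given precision."""
    return n_params * bytes_per_param / GIB

n_params = 15.5e9  # StarCoder parameter count

fp32 = weight_memory_gib(n_params, 4)  # 4 bytes/param
fp16 = weight_memory_gib(n_params, 2)  # 2 bytes/param
int8 = weight_memory_gib(n_params, 1)  # 1 byte/param

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
# → fp32: 57.7 GiB, fp16: 28.9 GiB, int8: 14.4 GiB
```

So fp16 alone already needs roughly twice the T4's ~15 GB, and even int8 weights nearly fill the card before any activations are allocated, which is why inference still runs out of memory.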