starcoder Is finetune.py incompatible with older GPUs?


Open umm-maybe opened this issue 1 year ago • 0 comments


Hi, while running on a Colab A100 instance I noticed that finetune.py consumed only about 5 GB of VRAM for starcoderbase-1b, so I tried it on my local machine, which has a GTX 1070 card (8 GB VRAM, Pascal architecture). This didn't work, and I got a similar error when trying again with either starcoderbase-1b or starcoderbase-3b on a larger but still older GPU (NVIDIA Quadro P6000, 24 GB VRAM). Here is the error:

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float
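For readers unfamiliar with this class of error, here is a minimal sketch of how PyTorch raises it in general: a matrix multiply between a float16 (`c10::Half`) tensor and a float32 tensor. It runs on CPU and is not taken from finetune.py; the function names are hypothetical, and the exact wording of the message may differ between PyTorch versions. It only illustrates the mismatch itself and does not show where the mismatch originates in the script.

```python
import torch


def dtype_mismatch_error():
    """Multiply a float16 matrix by a float32 one and return the error text.

    torch.mm does not promote dtypes, so mixing Half and float raises
    a RuntimeError instead of silently casting.
    """
    mat1 = torch.ones(2, 3, dtype=torch.float16)
    mat2 = torch.ones(3, 2, dtype=torch.float32)
    try:
        torch.mm(mat1, mat2)
    except RuntimeError as err:
        return str(err)
    return None


def matmul_same_dtype():
    """Same multiply, with the first operand cast to the second's dtype."""
    mat1 = torch.ones(2, 3, dtype=torch.float16)
    mat2 = torch.ones(3, 2, dtype=torch.float32)
    return torch.mm(mat1.to(mat2.dtype), mat2)


if __name__ == "__main__":
    print(dtype_mismatch_error())     # message text varies by version
    print(matmul_same_dtype().dtype)
```

In a training script, a mismatch like this usually means some weights or activations are in half precision while others are still in float32, so comparing the dtypes of the model parameters and the inputs at the failing layer is one way to narrow it down.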

At first I thought this might be due to a difference in architecture (Pascal vs. Ampere), but that is contradicted by the fact that I have a Kaggle Code notebook that can fine-tune StarCoder on two P100 GPUs, which are also Pascal.

Is there some other explanation for this?

A longer stack trace is attached: dump.txt

Mar 19 '24 13:03 umm-maybe
