
Models loading on CPU instead of GPU after updating version

Open KuraiAI opened this issue 10 months ago • 4 comments

I updated my version because some DeepSeek models were failing to load; after updating they loaded, but only on CPU. Other, older models on my system that used to load on GPU now load only onto CPU as well. I noticed this line in particular, which others have mentioned for the same issue: tensor 'token_embd.weight' (q4_K) (and 322 others) cannot be used with preferred buffer type CPU_AARCH64, using CPU instead
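A quick way to confirm this is happening on your system is to scan the loader's verbose output for that warning. Below is a hypothetical helper (not part of llama-cpp-python; the function name is my own) that checks captured log text for the fallback message quoted above:

```python
def fell_back_to_cpu(log_text: str) -> bool:
    """Return True if the llama.cpp load log shows tensors falling back to CPU.

    Matches the warning quoted in this issue, e.g.:
    "... cannot be used with preferred buffer type CPU_AARCH64, using CPU instead"
    """
    return ("cannot be used with preferred buffer type" in log_text
            and "using CPU instead" in log_text)


# Example: the exact line reported in this issue triggers a match.
log = ("tensor 'token_embd.weight' (q4_K) (and 322 others) cannot be used with "
       "preferred buffer type CPU_AARCH64, using CPU instead")
print(fell_back_to_cpu(log))  # → True
```

Run your model load with verbose=True and pipe stderr through a check like this to tell the CPU-fallback case apart from an ordinary CPU-only build.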

I downgraded to version 0.3.6 and it loads onto my GPU now.
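For anyone wanting to apply the same workaround, a pin to 0.3.6 might look like this. The CMAKE_ARGS line is an assumption on my part: it is the usual way to request a CUDA build when compiling from source, and is only needed if you are not installing a prebuilt GPU wheel.

```shell
# Workaround from this thread: roll back to the last version that still
# offloaded to GPU on this setup.
# CMAKE_ARGS is only needed for a source build; skip it if you use a prebuilt wheel.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.3.6 \
  --force-reinstall --no-cache-dir
```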

I can just use the older version, but it would be nice if this gets fixed so that those of us with this issue aren't locked out of newer versions.

KuraiAI avatar Mar 05 '25 20:03 KuraiAI

I'm having the exact same issue, but only if I go above 0.3.4.

Which CUDA version are you using, and are you using a pre-made wheel (e.g. from https://abetlen.github.io/llama-cpp-python/whl/)?
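To rule out having accidentally installed a CPU-only wheel, you can point pip at the prebuilt-wheel index linked above. The cu124 suffix here is an assumption for illustration; match it to the CUDA version actually installed on your machine.

```shell
# Install a prebuilt CUDA wheel instead of the default (CPU-only) PyPI wheel.
# Replace cu124 with your CUDA version (e.g. cu121, cu122, cu123).
pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
```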

mcglynnfinn avatar Mar 09 '25 09:03 mcglynnfinn

Same issue in Docker version... womp.

PeterTucker avatar Mar 12 '25 18:03 PeterTucker

I have a similar issue, and the model I use is not supported in v0.3.6.

Willian7004 avatar Mar 14 '25 09:03 Willian7004

I'm running into the same issue.

keepkeen avatar Apr 09 '25 06:04 keepkeen