Add CPU support
Your project is really good! But I'd like to run inference with the Mamba model on CPU. Can you help me write it? Or I can write it myself.
Hello, I agree it would be great to be able to run inference on CPU.
And maybe Intel Arc support, too? I don't know much about A.I., but this software seems to use PyTorch (the note for AMD GPUs mentions "patching" PyTorch), and as far as I know there is an Intel Extension for PyTorch that allows PyTorch to utilise Intel Arc GPUs.
My A770 has 16GB of VRAM, so it should be well suited to running A.I. workloads, but in reality I am sick and tired of software not supporting Intel Arc.
CPU inference would also allow closing external issues 😉 https://github.com/vllm-project/vllm/issues/16920
pip install fails with errors like `NameError: name 'bare_metal_version' is not defined`.
It would also be amazing to add "mps" support. Amazing project!!!
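For what it's worth, the device selection requested in this thread (CPU fallback, Apple "mps", Intel "xpu") could be sketched roughly like this. This is just an illustration, not code from this repo: the helper name `pick_device` is mine, and the `xpu` check assumes a PyTorch build with Intel GPU support (e.g. via the Intel Extension for PyTorch).

```python
def pick_device() -> str:
    """Return the best available PyTorch device string, falling back to CPU.

    A minimal sketch: `pick_device` is a hypothetical helper, not part of
    this project. It degrades gracefully when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all; CPU is the only sensible answer

    if torch.cuda.is_available():
        return "cuda"  # NVIDIA (or ROCm-patched) GPU
    # Apple Metal backend, present in recent PyTorch builds
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    # Intel GPUs, assuming a build that exposes torch.xpu
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

The model and inputs would then be moved with something like `model.to(pick_device())` before running inference.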