nano-vllm
Can I use the CPU for inference?
I want to run inference on the CPU. Can that work? And is it possible to skip installing flash-attn?
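For context, the kind of fallback this question implies could be sketched like this: since flash-attn is CUDA-only, a CPU path would need to replace it with PyTorch's built-in `scaled_dot_product_attention`. This is a hypothetical sketch, not nano-vllm's actual code, and the `attention` helper below is an assumption for illustration:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    """Use flash-attn when CUDA is available, else fall back to
    PyTorch's CPU-capable scaled_dot_product_attention.

    Note: flash-attn takes (batch, seq_len, n_heads, head_dim),
    while SDPA takes (batch, n_heads, seq_len, head_dim).
    """
    if torch.cuda.is_available():
        try:
            from flash_attn import flash_attn_func  # CUDA-only package
            return flash_attn_func(q.transpose(1, 2),
                                   k.transpose(1, 2),
                                   v.transpose(1, 2)).transpose(1, 2)
        except ImportError:
            pass  # flash-attn not installed; use the fallback below
    # Runs on CPU; layout is (batch, n_heads, seq_len, head_dim)
    return F.scaled_dot_product_attention(q, k, v)

# CPU tensors, shape (batch, n_heads, seq_len, head_dim)
q = k = v = torch.randn(1, 2, 4, 8)
out = attention(q, k, v)
print(out.shape)  # torch.Size([1, 2, 4, 8])
```

Whether nano-vllm itself can take this path depends on whether its attention calls and CUDA-specific kernels can be swapped out this way.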