youwot comments




youwot

Exllama does not work with cpu only

I am pretty sure ExLlama only runs on GPUs. Its README describes it as "A standalone Python/C++/CUDA implementation of Llama for use with 4-bit GPTQ weights, designed to be fast and memory-efficient on modern...
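
If that is right, one way to avoid the failure is to check for a CUDA device before selecting ExLlama at all. The snippet below is only a rough sketch of that check, not anything taken from ExLlama itself: `cuda_available` and `pick_backend` are hypothetical helper names, and the strings "exllama" and "cpu-fallback" are placeholder labels for whatever loaders your setup actually offers. The only real library call used is PyTorch's `torch.cuda.is_available()`, and the sketch assumes PyTorch may or may not be installed.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a CUDA device."""
    # No PyTorch installed means no CUDA device we can use from Python.
    if importlib.util.find_spec("torch") is None:
        return False
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_backend(has_cuda: bool) -> str:
    """Pick a loader label. Both labels are hypothetical placeholders."""
    # ExLlama is CUDA-based, so only choose it when a GPU is present.
    return "exllama" if has_cuda else "cpu-fallback"


if __name__ == "__main__":
    print(pick_backend(cuda_available()))
```

On a CPU-only machine this should select the fallback label instead of trying to load ExLlama.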