llama2-webui
Very slow generation
I am running this on a Mac M1 with 16 GB RAM, using app.py for simple text generation. Running llama.cpp directly from the terminal is much faster, but when I go through the backend via app.py, generation is very slow. Any ideas?
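
For comparison, this is roughly how the backend can be called directly, a minimal sketch assuming the webui wraps llama-cpp-python under the hood; the model path and parameter values here are placeholders, not the repo's actual defaults:

```python
# Minimal direct llama-cpp-python call for comparison with the app.py path.
# Assumes llama-cpp-python is installed; model_path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_0.gguf",  # placeholder path
    n_gpu_layers=1,   # offload layers to Metal on Apple Silicon; 0 keeps everything on CPU
    n_ctx=2048,       # context window size
    verbose=True,     # print load/eval timings so the two paths can be compared
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```

If a direct call like this is fast but app.py is slow, the difference is presumably in how app.py configures the backend (e.g. GPU offload disabled) rather than in llama.cpp itself.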