Results: 2 comments of frederickrohn
Hi, apologies for the basic question; I'm still a beginner with llama-cpp-python. I downloaded a quantized 7B model directly from the website, put it in the working directory, and loaded...
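A minimal sketch of the loading step being described, assuming llama-cpp-python is installed and a GGUF file sits in the working directory. The filename `model-q4.gguf` and the `n_ctx` value are hypothetical placeholders, not from the comment:

```python
# Sketch: load a quantized GGUF model with llama-cpp-python.
# Assumptions (hypothetical): the package is installed and a file
# named "model-q4.gguf" exists in the current working directory.
try:
    from llama_cpp import Llama
except ImportError:
    Llama = None  # library not installed; this stays a sketch

MODEL_PATH = "model-q4.gguf"  # hypothetical filename

def load_model(path=MODEL_PATH):
    """Load a quantized GGUF model from the working directory."""
    if Llama is None:
        raise RuntimeError("llama-cpp-python is not installed")
    # n_ctx sets the context window; 2048 is just an illustrative default.
    return Llama(model_path=path, n_ctx=2048, verbose=False)
```

Generation then works by calling the returned object with a prompt, e.g. `load_model()("Q: What is 2+2? A:", max_tokens=16)`.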
5 tokens per second is pretty fast; that's much better performance than what I was getting on the 8 GB M1 (about 20 words, so probably around 40 tokens, in...