
Extremely slow response

Open auxilio-ab opened this issue 1 year ago • 4 comments

After getting rid of all the other issues (see other issue tickets for "models could not be loaded due to localhost issue" and "only a specific model can be used") I finally managed to get alpaca-turbo running.

But if I type a question, it takes over 130 seconds to reply with only a fraction of a word. After around 210 seconds, the first sentence was finally completed.

The Docker image is running on a server with 32 GB of RAM and 16 CPU cores. Neither is anywhere near fully loaded (RAM usage 2.9 GB, CPU 25%).

What could be the issue?

auxilio-ab avatar Apr 06 '23 17:04 auxilio-ab

I've been able to get better (although still not great) response times by going into the settings (the gear thing) and increasing self.threads from the default 4 (I think that's what it was) to 10-13. It then uses a LOT more CPU power and the responses are (slightly) faster.

Still like 30-40 sec.
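For anyone picking a value for the thread setting, here is a minimal sketch of how one might cap a requested thread count at the number of CPUs the process can actually use. It is not Alpaca-Turbo code: `available_cpus` and `suggest_threads` are hypothetical helper names, and the assumption is that asking for more threads than usable CPUs is unlikely to help. Note that `os.sched_getaffinity` only exists on Linux and reflects CPU affinity (for example a cpuset restriction on a container), not CPU time quotas.

```python
import os
from typing import Optional


def available_cpus() -> int:
    """Return how many CPUs this process is allowed to run on."""
    try:
        # Linux only: honours the affinity mask, which may be smaller
        # than the host's core count inside a restricted container.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the logical CPU count.
        return os.cpu_count() or 1


def suggest_threads(requested: int, cpus: Optional[int] = None) -> int:
    """Clamp a requested thread count to the range [1, usable CPUs]."""
    if cpus is None:
        cpus = available_cpus()
    return max(1, min(requested, cpus))


if __name__ == "__main__":
    print("usable CPUs:", available_cpus())
    print("threads to try:", suggest_threads(12))
```

Run inside the container, this shows whether the container sees fewer CPUs than the host, which is worth ruling out before raising the thread count further.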

aalbrightpdx avatar Apr 07 '23 19:04 aalbrightpdx

I also tried making the thread count higher, but it did not change much. The program generates like 2 words a minute. It uses ~60% CPU for like 20 seconds and then drops to between 1% and 5%. There are no issues with cooling, I checked that. It also uses just 2 GB of RAM. I use an AMD Ryzen 7 5800X with 16 GB of RAM. I run it with Docker on Windows 11 with Debian WSL.

bendeguzszkalka avatar Apr 08 '23 22:04 bendeguzszkalka

Amazed it's running at all, so thank you! But yes, it's unusably slow even when I give it more threads.

  • Model Name: MacBook Pro
  • Model Identifier: MacBookPro17,1
  • Model Number: MJ123LL/A
  • Chip: Apple M1
  • Total Number of Cores: 8 (4 performance and 4 efficiency)
  • Memory: 16 GB

wolfmcnally avatar Apr 09 '23 00:04 wolfmcnally

Same here... very slow response, 120-300 sec. Specs:

  • Intel Xeon, 12 cores
  • 32GB RAM
  • SSD

I've changed the thread count too, but it only saves a few seconds. It's still generating word by word, with seconds between each ;-) Still great work!!

BangerTech avatar Apr 18 '23 05:04 BangerTech