1aienthusiast

Results 4 issues of 1aienthusiast

Generation takes more time with each message, as if there's an overhead. For example, the second response is 11x faster than the last response. They have the same number...

### Describe the bug

Almost every message is marked as 200 tokens, regardless of whether it's 1 word or multiple words/sentences.

### Is there an existing issue for this?

- [X]...

bug

First things first, I want to thank you for all the contributions you've made to the open source community! I've implemented this repo in my [MusicGen WebUI](https://github.com/1aienthusiast/audiocraft-infinity-webui/), but I need...

cuda: 35 tokens/s, triton: 5 tokens/s. I used ooba's webui only for cuda, because I've been unable to get triton to work with ooba's webui. I made sure I used the same...