David
There was another issue mentioned on Reddit, where they reported Whisper 'hallucinations'. This makes me think that the choice of microphone is important. I would really hesitate to try to...
We now use the espeak-ng binary, as it resolves the segfault issues, so I am closing this.
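Calling the standalone binary in a subprocess means a crash inside espeak-ng can no longer segfault the main process. A minimal sketch of that approach (the function name and the use of `-w` to write a WAV file are my assumptions, not the project's actual code):

```python
import subprocess

def espeak_command(text: str, wav_path: str = "out.wav") -> list[str]:
    """Build an espeak-ng command line; running the binary in a
    subprocess isolates any crash from the main process."""
    # -w: write synthesized speech to a WAV file instead of playing it
    return ["espeak-ng", "-w", wav_path, text]

# Actually synthesizing requires espeak-ng to be installed:
# subprocess.run(espeak_command("Hello there."), check=True)
```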
I'm very happy to incorporate the smaller changes. Thanks for the help! Just a few small points: On the hallucinations/VAD sensitivity - I've never had these issues. I saw someone...
Yes, I know, it's a small change. What I'm thinking about more is removing the Llama.cpp server code completely (llama.py), and letting users use a third-party API (Groq or OpenAI)...
ooba is also a potential approach, but that would mean deciding on which endpoint to use in the configuration. With llama.cpp[server], I noticed the Llama-3 chat template didn't work correctly,...
> If I remember correctly, ollama has one of the easiest learning curves, and uses very simple POST requests for chatting. It can be set up on practically all OSes,...
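To illustrate how simple those POST requests are, here is a sketch of a single chat turn against ollama's `/api/chat` endpoint using only the standard library (the default port 11434 and the model name are assumptions):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # ollama's default chat endpoint

def build_chat_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Build the POST request for one chat turn against ollama."""
    body = json.dumps({
        "model": model,
        "messages": messages,  # [{"role": "system"/"user"/"assistant", "content": ...}]
        "stream": False,       # one JSON response instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL, data=body,
        headers={"Content-Type": "application/json"},
    )

# Sending requires a running ollama instance:
# req = build_chat_request("llama3", [{"role": "user", "content": "Hello, GLaDOS."}])
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())["message"]["content"]
```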
The only issue that needs to be addressed is in how chat history is processed. Until recently, Llama-3 chat prompting was broken, and llama.cpp[server] does not handle most chat formats....
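When a server's built-in template handling is broken, one workaround is to flatten the chat history into the Llama-3 prompt format by hand. A sketch using the special tokens from Meta's published Llama-3 prompt format (the function itself is illustrative, not the project's code):

```python
def format_llama3_prompt(messages: list[dict]) -> str:
    """Flatten a chat history into Llama-3's prompt format manually."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```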
> To me, the key is that there are lots of ways to use a third-party OpenAI-compatible API, or to run an OpenAI-compatible API locally, including llama.cpp, kobold.cpp, ollama, KoboldAI,...
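The appeal of standardizing on an OpenAI-compatible API is that the client code stays identical across all of those backends; only the base URL (and possibly an API key) changes. A stdlib-only sketch (the example base URLs and ports are assumptions about typical defaults):

```python
import json
import urllib.request

def chat_completion_request(
    base_url: str, model: str, messages: list[dict], api_key: str = "none"
) -> urllib.request.Request:
    """Build a /v1/chat/completions request; only base_url varies per backend."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # local servers usually ignore this
        },
    )

# The same call targets any OpenAI-compatible backend, e.g.:
# chat_completion_request("http://localhost:8080", ...)                    # llama.cpp server
# chat_completion_request("https://api.openai.com", ..., api_key="sk-...") # OpenAI
```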
I just pushed some big changes that overlap with some of the changes here, such as:
- YAML config file
- separating GLaDOS from the LLM
- Use any LLM...
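To make the shape of those changes concrete, a YAML config along these lines would let the LLM backend be swapped without touching the GLaDOS code. All key names and values here are illustrative guesses, not the actual schema:

```yaml
# glados_config.yaml -- illustrative sketch only; key names are assumptions
llm:
  base_url: http://localhost:11434/v1   # any OpenAI-compatible server
  model: llama3
  api_key: "none"                       # local backends typically ignore this
tts:
  engine: espeak-ng
asr:
  engine: whisper
  vad_threshold: 0.5                    # tune if you see hallucinations
```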
I will close this PR, as I understand that most of the features are now in main. I would welcome small PRs that cover the other features though!