Niek van der Maas
`llama-cpp-python` has a Docker image now: https://github.com/abetlen/llama-cpp-python/pkgs/container/llama-cpp-python
Thanks! `llama-cli` with the API addition sounds like a great match for ChatGPT-web! The models don't work because we hard-code an explicit list of supported models: https://github.com/Niek/chatgpt-web/blob/1926f7df15b5bf099d1f0ad29740d35c98cfbbdf/src/lib/Types.svelte#L2-L9 This can be quite easily fixed...
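For illustration, a minimal TypeScript sketch of the kind of hard-coded allow-list involved (the `supportedModels` name, model IDs, and helper are placeholders, not the actual contents of `Types.svelte`):

```typescript
// Hypothetical sketch: a hard-coded allow-list of model IDs.
// Any model not listed here would be rejected by the UI.
const supportedModels = ['gpt-3.5-turbo', 'gpt-4'] as const

type Model = (typeof supportedModels)[number]

// Making this configurable could mean merging in user-supplied IDs
// (e.g. from an env var or a settings field) instead of the fixed list.
function isSupported (model: string): model is Model {
  return (supportedModels as readonly string[]).includes(model)
}
```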
We have this ability with an env var, but I assume you want to have it configurable in the web interface?
Seems like the API is pretty similar: https://readme.fireworks.ai/docs/openai-compatibility So you could try setting the `VITE_API_BASE` env var.
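For reference, Vite exposes `VITE_`-prefixed env vars to client code via `import.meta.env`; a rough sketch of how an alternative base URL might be consumed (the fallback URL and endpoint path here are assumptions, not the app's actual code):

```typescript
// Sketch: read the API base URL from a Vite env var, falling back to OpenAI.
const apiBase: string = import.meta.env.VITE_API_BASE || 'https://api.openai.com'

// Any OpenAI-compatible server can then be targeted by changing VITE_API_BASE.
const completionsUrl = `${apiBase}/v1/chat/completions`
```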
There are some approaches to work around the token limit:

* Truncate the conversation by removing old messages, as you proposed (see the sketch below)
* Summarize the conversation in a new API call, ...
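As a rough sketch of the first option (the token estimate here is a crude placeholder, not the tokenizer the app actually uses):

```typescript
interface Message { role: 'system' | 'user' | 'assistant'; content: string }

// Crude placeholder: ~4 characters per token is a common rule of thumb,
// not an exact count from a real tokenizer.
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4)

// Drop the oldest non-system messages until the conversation fits the budget.
function truncate (messages: Message[], maxTokens: number): Message[] {
  const kept = [...messages]
  let total = kept.reduce((sum, m) => sum + estimateTokens(m), 0)
  while (total > maxTokens) {
    const i = kept.findIndex(m => m.role !== 'system')
    if (i === -1) break
    total -= estimateTokens(kept[i])
    kept.splice(i, 1)
  }
  return kept
}
```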
There is an interesting compression approach described here: https://twitter.com/VictorTaelin/status/1642664054912155648
One thing to note is that the "compression" and "decompression" are a lot more consistent if you set the temperature to 0, meaning more deterministic and less random output. In...
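For context, a minimal sketch of setting the temperature on a chat completion request (the model name, prompt, and endpoint base are placeholders):

```typescript
// Sketch: request deterministic output by setting temperature to 0.
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`
  },
  body: JSON.stringify({
    model: 'gpt-3.5-turbo',
    temperature: 0, // deterministic: always pick the most likely token
    messages: [{ role: 'user', content: 'Decompress: <compressed text>' }]
  })
})
const data = await response.json()
```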
I'm thinking about this too... it should be possible to make a "browsing" plugin using function calling and Puppeteer exposed with a simple JSON API. But it's pretty risky overall. Another (more...
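A very rough sketch of that idea (Express + Puppeteer behind a single JSON endpoint; the route name and response shape are made up, and this deliberately ignores the safety/sandboxing concerns mentioned above):

```typescript
import express from 'express'
import puppeteer from 'puppeteer'

// Sketch only: exposes page text over a JSON API so the model can "browse"
// via function calling. Route name and response shape are hypothetical.
const app = express()

app.get('/browse', async (req, res) => {
  const url = String(req.query.url ?? '')
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    await page.goto(url, { waitUntil: 'networkidle2' })
    const text = await page.evaluate(() => document.body.innerText)
    res.json({ url, text: text.slice(0, 4000) }) // keep it small for the prompt
  } catch (err) {
    res.status(500).json({ error: String(err) })
  } finally {
    await browser.close()
  }
})

app.listen(3000)
```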
Ah cool, I didn't see the other repo! The extension model is great for casual tasks, but when operating at scale you can't really work without CDP/remote browsers. Speaking as...
This would also be a great feature to support changelog creation.