LangChain integration
This is pretty low-hanging fruit given the wrapper we already have: it would be great to create a custom LangChain LLM wrapper for llama.cpp.
Then we could use it in the API and do all sorts of cool things with Serge.
Hi, I have been planning to work on this and was wondering whether there is a way to run just the API server.
Additionally, stop sequences seem to be an issue on the API side.
Hey! Happy to hear you want to tackle this task.
Currently you can't run just the API server. This used to be possible, but now that the API and the web server are behind nginx, starting nginx without the web server fails the health check and nginx refuses to start. That shouldn't be too hard to fix; I'll have a look at it. In the meantime you can still access the API at http://localhost:8008/api/docs.
Regarding the LangChain integration, I was thinking it would also be interesting to make a custom LLM that wraps this generate method:
https://github.com/nsarrazin/serge/blob/b5ff9d154142ca918347604e0fd89dd3b003fab0/api/utils/generate.py#L16
The custom LLM would only work inside the api container (since it depends on the llama binary), but this would still let us do cool stuff on the front-end of this project.
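To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. Everything here is an assumption for illustration: the `LlamaLLM` class name, the binary and model paths, and the llama.cpp flag names are not Serge's actual implementation (which uses the async generate method linked above), and the stand-in base class is only there so the sketch runs without langchain installed.

```python
import subprocess
from typing import List, Optional

try:
    from langchain.llms.base import LLM  # real base class when langchain is available
except ImportError:
    class LLM:  # minimal stand-in so the sketch runs without langchain installed
        pass


def build_llama_args(binary: str, model: str, prompt: str,
                     n_predict: int = 256) -> List[str]:
    """Build the llama.cpp command line (flag names here are assumptions)."""
    return [binary, "-m", model, "-p", prompt, "-n", str(n_predict)]


class LlamaLLM(LLM):
    """Hypothetical LangChain LLM that shells out to the llama binary."""

    binary: str = "/usr/bin/llama"  # assumed path inside the api container
    model_path: str = "/usr/src/app/weights/ggml-model.bin"  # assumed path

    @property
    def _llm_type(self) -> str:
        return "llama.cpp"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs) -> str:
        args = build_llama_args(self.binary, self.model_path, prompt)
        out = subprocess.run(args, capture_output=True, text=True, check=True).stdout
        # Truncate at the first stop sequence client-side, since stop
        # sequences were reported as an issue on the API side.
        if stop:
            for s in stop:
                idx = out.find(s)
                if idx != -1:
                    out = out[:idx]
        return out
```

With something like this, the wrapper could be dropped into any LangChain chain running inside the api container, e.g. `LlamaLLM()("What is Serge?")`.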
For interfacing with other projects, you will indeed need to run the API server and make a custom LLM that talks to it.
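For that second case, a custom LLM would call Serge's HTTP API instead of the binary. Below is a hedged sketch of that shape: the base URL comes from the message above, but the `/generate` endpoint path, the JSON payload, and the `"text"` response key are all assumptions, not Serge's documented API.

```python
import json
from typing import List, Optional
from urllib import request as urlrequest

try:
    from langchain.llms.base import LLM  # real base class when langchain is available
except ImportError:
    class LLM:  # minimal stand-in so the sketch runs without langchain installed
        pass

SERGE_API = "http://localhost:8008/api"  # base URL mentioned above


def build_generate_request(prompt: str, endpoint: str = "/generate") -> urlrequest.Request:
    """Build a POST request; the endpoint path and JSON shape are assumptions."""
    data = json.dumps({"prompt": prompt}).encode()
    return urlrequest.Request(
        SERGE_API + endpoint,
        data=data,
        headers={"Content-Type": "application/json"},
    )


class SergeLLM(LLM):
    """Hypothetical LangChain LLM that calls a running Serge API server."""

    @property
    def _llm_type(self) -> str:
        return "serge-api"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs) -> str:
        with urlrequest.urlopen(build_generate_request(prompt)) as resp:
            return json.loads(resp.read())["text"]  # response key is an assumption
```

The real endpoint and schema would come from http://localhost:8008/api/docs once the server is up.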
Have you looked at this repository, @nsarrazin? It seems good.
https://github.com/linonetwo/langchain-alpaca