alpaca.cpp
Can this be interacted with via API-like requests?
I want to be able to send individual requests from a separate program to a running instance of Alpaca, just as can be done with the OpenAI API. Is this currently possible?
+1. Also, with this we could keep track of old conversations and feed them back to Alpaca, so the context gets better.
This is my current way of calling it from a simulated terminal (node-pty):
https://github.com/linonetwo/langchain-alpaca/blob/1c519c206ea5152269c3d82d160a2567bbc9e69f/src/index.ts#L222-L282
I also hope there will be a way to integrate it more deeply, so it won't need to restart on every request!
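For anyone wanting the same pseudo-terminal trick from Python, it looks roughly like the sketch below. This is a minimal sketch, not a tested integration: pexpect plays the role of node-pty, and the ./chat path and the "> " prompt marker are assumptions about your build.

```python
# Sketch: drive the interactive binary through a pseudo-terminal, analogous
# to the node-pty approach above. Assumes a local ./chat binary whose
# interactive mode prints "> " when it is ready for input.
import pexpect

child = pexpect.spawn("./chat", encoding="utf-8", timeout=120)
child.expect("> ")                     # wait for the model to finish loading

def ask(prompt: str) -> str:
    child.sendline(prompt)
    child.expect("> ")                 # everything up to the next prompt is the reply
    # child.before holds what was printed since the last match, including the
    # echoed input line, so drop that first line.
    return child.before.split("\n", 1)[-1].strip()

print(ask("What is the capital of France?"))
```

The point of the pseudo-terminal is that the binary believes it is talking to a real console, so its output is not held back by pipe buffering.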
+1, a REST API would be helpful.
+1. I'm a beginner and I've been trying to use Python to call "chat.exe" with the subprocess module, but it doesn't seem to be working. I want to make it remember conversations and then make it voice-activated using Python.
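For what it's worth, a one-shot call with subprocess can look like the sketch below. It assumes the binary reads a prompt from stdin and exits once stdin is closed; if your build only runs interactively and buffers its output through a pipe, the pseudo-terminal sketch above is more reliable.

```python
# Sketch: one-shot call via the subprocess module. The binary name, and the
# assumption that it answers a prompt from stdin and then exits, are both
# assumptions about your build.
import subprocess

result = subprocess.run(
    ["./chat"],                        # "chat.exe" on Windows
    input="What is the capital of France?\n",
    capture_output=True,
    text=True,
    timeout=300,
)
print(result.stdout)
```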
Agreed, even a simple non-interactive way to communicate, for example through files or pipes, would be nice.
Yes, or maybe a simple TCP socket.
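As a sketch of that idea: a tiny server could keep one resident model and answer each connection from it, so the weights are loaded only once. Here generate() is a hypothetical placeholder for whatever actually runs the model.

```python
# Sketch: a minimal line-oriented TCP front end. One prompt per line in,
# one reply per line out. generate() is a hypothetical placeholder.
import socketserver

def generate(prompt: str) -> str:
    # Hypothetical: call into the already-loaded model here.
    return "reply to: " + prompt

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        prompt = self.rfile.readline().decode().strip()
        self.wfile.write((generate(prompt) + "\n").encode())

if __name__ == "__main__":
    with socketserver.TCPServer(("127.0.0.1", 5000), Handler) as server:
        server.serve_forever()
```

A client is then as simple as `nc 127.0.0.1 5000`.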
I feel I might be able to help. I'm playing with alpaca.cpp here to give it an API, more or less OpenAI-style: https://github.com/go-skynet/llama-cli and it seems to work; happy to contribute it somewhere else. I just really wanted to have it programmatically accessible via pipes/API to use it for automation.
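For context, an OpenAI-style request against such a local server could look like the sketch below. The host, port, and /v1/completions path are assumptions for illustration, not llama-cli's documented API; check its README for the actual routes.

```python
# Sketch: an OpenAI-style completion request to a local server. The URL and
# the request/response shape are assumptions, not llama-cli's documented API.
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/completions",
    data=json.dumps({"prompt": "What is the capital of France?",
                     "max_tokens": 64}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```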
@mudler Can you explain how you kept it running, so we can port this technique to Python and TypeScript in LangChain?
I guess you keep interactive mode running in memory and reuse the instance? That's what I want to try next.
Sure thing. I slightly modified the original llama.cpp to keep the result in memory and return it; the code, with Golang bindings, is here: https://github.com/go-skynet/llama
And yup, correct about reusing the instance, though I actually pruned the interactive code completely to keep it simpler.
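The parent-side shape of that "keep it running" idea, independent of the Go bindings, is roughly the sketch below: start the binary once so the weights stay loaded, then reuse the same process for every request. The one-prompt-per-line input and the blank-line answer delimiter are assumptions about the build.

```python
# Sketch: a long-lived worker process reused across requests. Assumes the
# binary reads one prompt per line on stdin and ends each answer with a
# blank line; both are assumptions about your build.
import subprocess

proc = subprocess.Popen(
    ["./chat"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
    bufsize=1,                         # line-buffered on our side
)

def ask(prompt: str) -> str:
    proc.stdin.write(prompt + "\n")
    proc.stdin.flush()
    lines = []
    for line in proc.stdout:           # read until the assumed blank-line delimiter
        if not line.strip():
            break
        lines.append(line)
    return "".join(lines).strip()
```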
Does your code work on llama.cpp? Isn't that different from the fine-tuned Alpaca model?
While doing this I tracked the changes in alpaca.cpp too, so it should be compatible. I've actually tried it only with Alpaca models, as I don't have bigger hardware to test with. It's also on par with the original llama.cpp, which supports Alpaca.
Update: I've managed to run the 13B and 30B Alpaca models with the new tokenizer as well; I'll post an update later today.
I implemented this relatively simple Python-based solution, in case it's helpful to anyone: https://github.com/muelletm/alpaca.py.
@muelletm I tried it. At first it gave me a JSON decode error; then I wrapped the call in a try/except, and after that the code just stops completely.
@robin-coac Can you open an issue in https://github.com/muelletm/alpaca.py and share how you start the API and how you call it?
Hi @muelletm, I created an issue.