llama.cpp
Feature request: RESTful API / endpoint exposure
Hi team,
I've been playing with interactive mode for a couple of hours, and it's pretty impressive.
Besides what's mentioned in #145, it might not be too far off to expose this as an endpoint / function call (e.g. via SWIG, a socket, or an OpenAPI server) replacing the current stdin interaction. Self-hosting would then gain a very powerful new resident; for instance, I have a powerful PC at home that could act as a personal assistant. A rough sketch of the idea follows below.
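Here is a minimal sketch of what I mean, wrapping the CLI behind a local HTTP endpoint with only the Python standard library. The `./main` binary path, the model path, and the port are placeholders I'm assuming, and the flag names (`-m`, `-p`, `-n`) should be double-checked against `./main --help`; this is an illustration of the idea, not a proposed implementation:

```python
# Hedged sketch: expose the llama.cpp CLI through a local HTTP endpoint.
# LLAMA_BIN and MODEL_PATH below are assumed placeholders, not project APIs.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

LLAMA_BIN = "./main"                            # assumed path to the binary
MODEL_PATH = "./models/7B/ggml-model-q4_0.bin"  # hypothetical model file

class CompletionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect a JSON body like {"prompt": "...", "n_predict": 128}.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        prompt = body.get("prompt", "")
        n_predict = int(body.get("n_predict", 128))

        # One process per request: simple, but reloads the model every call.
        result = subprocess.run(
            [LLAMA_BIN, "-m", MODEL_PATH, "-p", prompt, "-n", str(n_predict)],
            capture_output=True,
            text=True,
        )

        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"completion": result.stdout}).encode())

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), CompletionHandler).serve_forever()
```

Spawning a process per request pays the model-load cost every time, which is exactly why a real binding that keeps the model resident in memory (as requested here) would be so much better.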
I also found that -n sets the token limit (the number of tokens to generate). It would be great if the engine could start with zero presumed context, which would decouple it a bit from stdin (see the client sketch after this paragraph).
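To illustrate the stdin decoupling, a hedged client-side sketch that drives the endpoint programmatically instead of typing into a terminal; the address and JSON shape simply match the server sketch above and are not an existing API:

```python
# Call the sketched endpoint instead of typing into stdin.
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8080",  # matches the server sketch above
    data=json.dumps({"prompt": "Hello, assistant.", "n_predict": 32}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["completion"])
```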
Kindly let me know if there are existing directions or other people interested in this (I'm a developer too, but not very C / tensor flavored, and without advice, forcibly hijacking stdin / stdout seems like a poor approach).
It seems this is in progress? Like what's mentioned in https://github.com/ggerganov/llama.cpp/issues/122#issuecomment-1467525769, so both an API and an IPython-like conversational interface could happen. Is there a feature branch for the binding work, @gjmulder? If so, I may close this as a duplicate (or join the collaboration in the same thread).
It also seems to be mentioned in #82, though I'm not sure whether that's a WIP or an existing binding handler.
Marked as duplicate