
Break down main function in llama-server

Open · ericcurtin opened this issue 8 months ago · 2 comments

The llama-server main function is getting meaty; this issue is about breaking it down into smaller functions.

ericcurtin · May 10 '25 12:05
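
For illustration only, here is a minimal sketch of what a decomposed main() could look like. The helper and struct names (parse_cli_args, init_backend_and_model, server_params, server_state, etc.) are hypothetical and do not correspond to the actual llama.cpp sources; the stubs simply stand in for logic that currently lives inline in main().

```cpp
// Hypothetical sketch: main() reduced to a sequence of named steps.
// None of these helpers exist under these names in llama.cpp today.
#include <cstdio>

struct server_params { int port = 8080; /* plus other parsed CLI options */ };
struct server_state  { /* model, slots, HTTP listener, ... */ };

// Trivial stubs so the sketch compiles; the real bodies would hold the
// code that is currently inlined in main().
static bool parse_cli_args(int /*argc*/, char ** /*argv*/, server_params & /*params*/) { return true; }
static bool init_backend_and_model(const server_params & /*params*/, server_state & /*state*/) { return true; }
static void register_http_routes(server_state & /*state*/) {}
static int  run_event_loop(server_state & /*state*/) { return 0; }

int main(int argc, char ** argv) {
    server_params params;
    if (!parse_cli_args(argc, argv, params)) {
        return 1;
    }

    server_state state;
    if (!init_backend_and_model(params, state)) {
        fprintf(stderr, "failed to initialize backend or load model\n");
        return 1;
    }

    register_http_routes(state);
    return run_event_loop(state);
}
```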

Incomplete

ericcurtin · May 10 '25 12:05

Before going further, I think it's better to discuss a plan rather than diving into the code.

While working on https://github.com/ggml-org/llama.cpp/pull/13400#issuecomment-2866290941, I also thought about refactoring server.cpp into small components. This should be done in a way that makes it easy to route requests to multiple models on the same server instance.

For now, the simplest task is of course to abstract out the creation of the HTTP server. A second task could be to move all the HTTP handlers into a completely separate file. The main component, server_context, may also need to be moved to a dedicated file.

ngxson · May 10 '25 12:05
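
As a rough sketch of the split described above (all file, class, and function names here are hypothetical, not the actual llama.cpp layout, and the bodies are placeholders so the example compiles), the pieces could look like this:

```cpp
// Hypothetical sketch of the proposed component split; names and file
// boundaries do not reflect the actual llama.cpp sources.
#include <functional>
#include <map>
#include <string>

// --- would live in server-http.h: HTTP layer, knows nothing about models ---
struct http_request  { std::string path; std::string body; };
struct http_response { int status = 200; std::string body; };

using http_handler = std::function<http_response(const http_request &)>;

class server_http {
public:
    void route(const std::string & path, http_handler handler) {
        handlers[path] = std::move(handler);
    }
    // The real implementation would wrap the cpp-httplib listen loop;
    // here it is only a placeholder so the sketch compiles.
    bool listen(const std::string & /*host*/, int /*port*/) { return true; }
private:
    std::map<std::string, http_handler> handlers;
};

// --- would live in server-context.h: model/slot state, no HTTP types ---
struct server_context {
    bool load_model(const std::string & /*model_path*/) { return true; }
    std::string completion(const std::string & prompt) { return "echo: " + prompt; }
};

// --- would live in server-handlers.cpp: glue between routes and context ---
// Keeping the handlers separate is what would later make it easy to pick
// one of several server_context instances per request (multi-model routing).
static void register_handlers(server_http & http, server_context & ctx) {
    http.route("/completion", [&ctx](const http_request & req) {
        http_response res;
        res.body = ctx.completion(req.body);
        return res;
    });
    // ... /health, /v1/chat/completions, etc.
}

// --- server.cpp: main() shrinks to wiring the pieces together ---
int main() {
    server_context ctx;
    if (!ctx.load_model("model.gguf")) {
        return 1;
    }
    server_http http;
    register_handlers(http, ctx);
    return http.listen("127.0.0.1", 8080) ? 0 : 1;
}
```

With this kind of layering, multi-model serving could be added by keeping a map from model name to server_context and having the handler layer pick the right context from the request, without touching the HTTP abstraction itself.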