122 comments of Daiki Ueno

A slightly better approach might be to communicate through a Unix domain socket, since the latest llama.cpp [gained](https://github.com/ggml-org/llama.cpp/pull/12613) support for it in llama-server. rag_framework could then be modified as below,...
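To illustrate the idea, here is a minimal sketch of talking HTTP over an AF_UNIX socket from Python's standard library, which is roughly what a client such as rag_framework would need. The socket path and the `/health` request are assumptions for the demo; a stand-in server thread is included so the sketch runs without an actual llama-server.

```python
import http.client
import os
import socket
import tempfile
import threading

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that routes over an AF_UNIX socket."""

    def __init__(self, socket_path):
        # "localhost" only populates the Host header; routing happens
        # via the Unix socket path, not TCP.
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

# --- stand-in server so this sketch is self-contained (not llama-server) ---
path = os.path.join(tempfile.mkdtemp(), "llama.sock")
ready = threading.Event()

def fake_server():
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(path)
        srv.listen(1)
        ready.set()
        conn, _ = srv.accept()
        data = b""
        while b"\r\n\r\n" not in data:  # read until end of request headers
            data += conn.recv(4096)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()

threading.Thread(target=fake_server, daemon=True).start()
ready.wait()

client = UnixHTTPConnection(path)
client.request("GET", "/health")  # hypothetical endpoint for the demo
resp = client.getresponse()
body = resp.read().decode()
print(resp.status, body)  # prints "200 ok"
```

Against a real llama-server listening on a Unix socket, only the socket path and request paths would change; the `UnixHTTPConnection` class itself stays the same.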

@rhatdan yes, the latest container [image](https://quay.io/repository/ramalama/ramalama/manifest/sha256:51892a55dbbf6b9c117e26ef8e234b5db331edcc99e10095f00084112f1bdf95) seems to include the feature.