LLM integration proof-of-concept
This PR sets up the basic infrastructure to run an LLM inside Min by loading node-llama-cpp in a utility process. Any llama.cpp-compatible (GGUF) model file should work; the model is configured by updating modelPath in llmService.mjs. My testing so far has been with either this model or this one.
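For context, the flow inside the utility process looks roughly like the sketch below. This is an illustrative outline rather than the exact code in llmService.mjs: it assumes node-llama-cpp's v3-style API (getLlama / loadModel / LlamaChatSession), and the message shape is made up for the example.

```js
// llmService.mjs (illustrative sketch): runs in an Electron utility process
// and talks to the main process over process.parentPort.
import { getLlama, LlamaChatSession } from 'node-llama-cpp'

// Point this at any llama.cpp-compatible (GGUF) model file on disk.
const modelPath = '/path/to/model.gguf'

let session = null

async function initModel () {
  const llama = await getLlama()
  const model = await llama.loadModel({ modelPath })
  const context = await model.createContext()
  session = new LlamaChatSession({ contextSequence: context.getSequence() })
}

process.parentPort.on('message', async (e) => {
  // the { id, prompt } message format here is hypothetical
  const { id, prompt } = e.data
  if (!session) {
    await initModel()
  }
  const text = await session.prompt(prompt)
  process.parentPort.postMessage({ id, text })
})
```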
My original intent was to see whether it was possible to generate high-quality page summaries to display in the searchbar. Unfortunately, with llama-3.2-1b the quality of the summaries seems quite poor. llama-3.2-3b does much better, but keeping it loaded requires around 5GB of memory. I think that rules out any use case where the model needs to stay loaded in the background continuously. It could still work for features the user explicitly invokes, since we could load the model briefly for the request and then immediately unload it.

I'm planning to experiment with language translation (replacing the current cloud-based version) and with an explicit "summarize page" command, but if anyone has additional ideas for where this could be useful, I'd be happy to test them.
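If we go the on-demand route, the main-process side could look something like the sketch below: fork the utility process only when the user invokes the command, and kill it as soon as the reply arrives so the model's memory is released. The runLLMTask helper, file location, and message shape are hypothetical and not part of this PR.

```js
// Main-process sketch (not the actual PR code): spawn the LLM utility process
// per request so the ~5GB is only held while the request is in flight.
import path from 'node:path'
import { fileURLToPath } from 'node:url'
import { utilityProcess } from 'electron'

const dirname = path.dirname(fileURLToPath(import.meta.url))

function runLLMTask (prompt) {
  return new Promise((resolve, reject) => {
    const child = utilityProcess.fork(path.join(dirname, 'llmService.mjs'))

    child.once('message', (result) => {
      child.kill() // unload the model by ending the process
      resolve(result.text)
    })

    child.once('exit', (code) => {
      if (code !== 0) {
        reject(new Error('LLM process exited with code ' + code))
      }
    })

    child.postMessage({ id: 1, prompt })
  })
}

// Example: an explicit "summarize page" command
// const summary = await runLLMTask('Summarize the following page:\n\n' + pageText)
```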