[BUG]: LM Studio-provided LLM stopped working with AnythingLLM after upgrading to 1.9.0
How are you running AnythingLLM?
Desktop
What happened?
I had AnythingLLM desktop 1.8.5-r2 installed, and it worked properly with LM Studio as both the LLM and embedding model provider. While trying to add an MCP server to AnythingLLM, I upgraded to the latest version (1.9.0). Although my configuration remained intact, the system stopped working: I could still select an LLM, so some sort of communication between LM Studio and AnythingLLM was happening, but as soon as I started to chat I got the following cryptic error message: "An error occurred while streaming response. network error". Once this message appeared, AnythingLLM could no longer list LM Studio models at all until the application was restarted.

After I fixed my MCP configuration issue (the full path was required in the command field for proper operation), I had to reinstall the older 1.8.5-r2 version of AnythingLLM to get it working properly with LM Studio as both the LLM and embedding provider. The old version now works properly and can also use the MCP server in agent mode (@agent). Please investigate the cause of this regression before too many users run into the same issue. My platform is a Mac Studio M3 Ultra, if that matters.
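For reference, the MCP fix mentioned above amounts to using an absolute path in the `command` field of the MCP server definition. A minimal sketch of such a config fragment (the server name, binary path, and args below are placeholders, not from this report, and the exact config file location depends on your AnythingLLM install):

```json
{
  "mcpServers": {
    "example-server": {
      "command": "/usr/local/bin/npx",
      "args": ["-y", "@example/mcp-server"]
    }
  }
}
```

A bare `npx` in `command` can fail because the desktop app does not necessarily inherit your shell's `PATH`, which is why the absolute path works.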
Are there known steps to reproduce?
No response
I have tested this on the latest version of AnythingLLM (1.9.0) and am unable to replicate the issue. If you can replicate it consistently, please provide exact instructions for reproducing it on your setup.

Also, please tell us which models you're using for the embedder and LLM inside LM Studio, since this may help us narrow down the possibilities.
Update: Downgraded to 1.8.5-r2 and don't see the problem anymore
Also running macOS Tahoe 26.1 on a Mac Studio M3 Ultra.
I'm having a similar problem. After upgrading to 1.9.0, AnythingLLM sometimes works and sometimes doesn't. Here's an LM Studio log of it working, followed by my continuing the chat in AnythingLLM, where it fails (the repeated `Returning { ... }` model lists were identical to the first one and are elided below):

```
2025-10-23 14:05:18 [INFO] Returning {
  "data": [
    { "id": "qwen3-235b-a22b-mlx", "object": "model", "type": "llm", "publisher": "LibraxisAI", "arch": "qwen3_moe", "compatibility_type": "mlx", "quantization": "5bit", "state": "loaded", "max_context_length": 40960, "loaded_context_length": 40960, "capabilities": [ "tool_use" ] },
    { "id": "text-embedding-nomic-embed-text-v1.5", "object": "model", "type": "embeddings", "publisher": "nomic-ai", "arch": "nomic-bert", "compatibility_type": "gguf", "quantization": "Q4_K_M", "state": "not-loaded", "max_context_length": 2048 },
    { "id": "qwen/qwen3-235b-a22b-2507", "object": "model", "type": "llm", "publisher": "qwen", "arch": "qwen3_moe", "compatibility_type": "mlx", "quantization": "6bit", "state": "not-loaded", "max_context_length": 262144, "capabilities": [ "tool_use" ] },
    { "id": "qwen/qwen3-1.7b", "object": "model", "type": "llm", "publisher": "qwen", "arch": "qwen3", "compatibility_type": "mlx", "quantization": "4bit", "state": "not-loaded", "max_context_length": 40960, "capabilities": [ "tool_use" ] },
    { "id": "qwen3-235b-a22b", "object": "model", "type": "llm", "publisher": "unsloth", "arch": "qwen3moe", "compatibility_type": "gguf", "quantization": "Q5_K_S", "state": "not-loaded", "max_context_length": 40960, "capabilities": [ "tool_use" ] }
  ],
  "object": "list"
}
2025-10-23 14:05:19 [INFO] A value was passed to the 'Authorization' header, but the server is not configured to authenticate via API token. The 'Authorization' header will be ignored.
2025-10-23 14:05:19 [INFO] A value was passed to the 'Authorization' header, but the server is not configured to authenticate via API token. The 'Authorization' header will be ignored.
2025-10-23 14:05:19 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 8 messages.
2025-10-23 14:05:19 [INFO] [LM STUDIO SERVER] Streaming response...
2025-10-23 14:06:02 [INFO] Finished streaming response
2025-10-23 14:06:22 [INFO] Returning { ...same model list as above... }
2025-10-23 14:06:27 [INFO] A value was passed to the 'Authorization' header, but the server is not configured to authenticate via API token. The 'Authorization' header will be ignored.
2025-10-23 14:06:27 [INFO] A value was passed to the 'Authorization' header, but the server is not configured to authenticate via API token. The 'Authorization' header will be ignored.
2025-10-23 14:06:27 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 10 messages.
2025-10-23 14:06:27 [INFO] [LM STUDIO SERVER] Streaming response...
2025-10-23 14:07:02 [INFO] Finished streaming response
2025-10-23 14:08:19 [INFO] Returning { ...same model list as above... }
```
Thank you for this info, we are investigating this :)
What version of LM Studio is in use here?
I have tested this on the latest version of AnythingLLM (1.9.0) and am unable to replicate the issue. If you can replicate it consistently, please provide exact instructions for reproducing it on your setup.

Also, please tell us which models you're using for the embedder and LLM inside LM Studio, since this may help us narrow down the possibilities.
I use:
- LM Studio 0.3.30
- Embedding model: text-embedding-qwen3-embedding-8b, which I highly recommend, as it really finds the needle in the haystack
- LLM model: I can't remember exactly, because I switch them often, but mistralai/magistral-small-2509 was among them. Even when I switched back to the LLM shipped with AnythingLLM, it did not respond.

Platform: Mac Studio M3 Ultra, 512 GB
OS: macOS Tahoe 26.0.1
It looks like something breaks on the first chat attempt after the upgrade; afterwards even model selection becomes impossible until AnythingLLM 1.9.0 is restarted.
@Ackerka @nhaneezy When you get this generic error inside AnythingLLM, can you please provide us with the LM Studio developer logs?
I have been testing this with various MCP servers and found that some of them, like Desktop Commander, expose many tools, which makes the system prompt massive and sometimes causes context-overflow issues with specific models such as the mistralai/magistral-small-2509 that was mentioned.
I just want to make sure that this bug I am getting is the same thing that you are experiencing.
I also discovered that there may be a race condition bug introduced in #4468 when we try to get context window sizes. If LM Studio isn't running or responds slowly, the provider may fail to initialize properly, which crashes model loading. Any logs you can provide from LM Studio would be very helpful to make sure we get this fixed.
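For illustration only (this is not AnythingLLM's actual code, and `fetchContextWindow`, `DEFAULT_CONTEXT_WINDOW`, and the 2-second timeout are assumptions), the usual guard against this kind of race is to bound the model-list request with a timeout and fall back to a default context window, so that a slow or absent LM Studio cannot crash provider initialization:

```javascript
// Assumed fallback when the server cannot be queried; not a value from AnythingLLM.
const DEFAULT_CONTEXT_WINDOW = 4096;

// fetchModels is any async function returning the /v1/models "data" array
// (e.g. a GET to http://localhost:1234/v1/models on a local LM Studio).
async function fetchContextWindow(fetchModels, modelId, timeoutMs = 2000) {
  try {
    // Race the request against a timeout so a hung server cannot block init.
    const models = await Promise.race([
      fetchModels(),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error("timeout")), timeoutMs)
      ),
    ]);
    const match = models.find((m) => m.id === modelId);
    // Unknown model or missing field: use the safe default instead of throwing.
    return match?.max_context_length ?? DEFAULT_CONTEXT_WINDOW;
  } catch {
    // Server down, slow, or errored: degrade gracefully rather than crash.
    return DEFAULT_CONTEXT_WINDOW;
  }
}
```

With a guard like this, a failed lookup degrades to a conservative context window instead of leaving the provider half-initialized and breaking model listing for the rest of the session.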
In my initial response above I pasted the LM Studio developer logs. I realize they may not be too illuminating, but that's all the logs contain for the first, successful chat and the following chat, which ends in the generic error in AnythingLLM.