Client disconnected. Stopping generation
When I call the API through Dify, the generated answer gets truncated partway through. Both the model's context length and the max-tokens setting are large enough. Why is the response truncated?
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] Streaming response...
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] First token generated. Continuing to stream response..
2025-03-21 19:44:33 [INFO]
Finished streaming response
2025-03-21 19:44:33 [INFO]
[LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
Having the same issue. Not sure if it's browser-based. Running the server on an M4 Pro, through SillyTavern, in Safari on both iOS and macOS.
I'm seeing this issue with LM Studio as well.
Mac Studio Apple M3 Ultra
Dify 1.4.0 / 1.4.1
LM Studio 0.3.16
2025-06-11 00:04:25 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 1 messages.
2025-06-11 00:04:25 [INFO] [LM STUDIO SERVER] Streaming response...
2025-06-11 00:04:25 [DEBUG] [CacheWrapper][INFO] Trimmed 1992 tokens from the prompt cache
2025-06-11 00:09:25 [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
2025-06-11 00:12:48 [INFO] Finished streaming response
Referencing a possibly related issue:
https://github.com/RooCodeInc/Roo-Code/issues/6521
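One observation from the log above: the disconnect fires exactly five minutes after the request starts (00:04:25 → 00:09:25), which suggests a client-side read timeout (e.g. a 300-second HTTP timeout in the calling app) closing the connection, rather than LM Studio stopping on its own. A minimal sketch for telling the two cases apart when you capture the raw stream, assuming the OpenAI-compatible SSE chunk format that LM Studio serves (the helper name and sample data are mine, not from any of the tools above):

```python
import json


def stream_outcome(sse_lines):
    """Classify how a streamed chat completion ended.

    Returns the server-reported finish_reason ("stop", "length", ...)
    if the stream completed normally, or "disconnected" if the stream
    ended without a final chunk / [DONE] marker -- the signature of a
    client-side timeout or dropped connection.
    """
    finish_reason = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives / blank SSE lines
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            # Stream terminated cleanly by the server.
            return finish_reason or "stop"
        chunk = json.loads(payload)
        fr = chunk["choices"][0].get("finish_reason")
        if fr is not None:
            finish_reason = fr
    # Loop ended without [DONE]: the connection was cut mid-generation.
    return finish_reason or "disconnected"
```

If this returns "disconnected", the fix is on the caller's side: raise the read/idle timeout in whatever sits in front of the server (Dify, SillyTavern, a reverse proxy), not the model's context length or max-tokens settings.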