Client disconnected. Stopping generation
When I call the API through Dify, the generated answer gets truncated partway through. Both the model's context length and the max-tokens setting are large enough. Why is the response truncated?
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] Streaming response...
2025-03-21 19:44:07 [INFO]
[LM STUDIO SERVER] First token generated. Continuing to stream response..
2025-03-21 19:44:33 [INFO]
Finished streaming response
2025-03-21 19:44:33 [INFO]
[LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
Having the same issue. Not sure if it's browser-based. Running the server on an M4 Pro, through SillyTavern, in Safari on both iOS and macOS.
I'm seeing this issue with LM Studio as well.
Mac Studio Apple M3 Ultra
Dify 1.4.0 / 1.4.1
LM Studio 0.3.16
2025-06-11 00:04:25 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 1 messages.
2025-06-11 00:04:25 [INFO] [LM STUDIO SERVER] Streaming response...
2025-06-11 00:04:25 [DEBUG] [CacheWrapper][INFO] Trimmed 1992 tokens from the prompt cache
2025-06-11 00:09:25 [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
2025-06-11 00:12:48 [INFO] Finished streaming response
Referencing a possibly related issue:
https://github.com/RooCodeInc/Roo-Code/issues/6521
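One observation from the log above: the disconnect fires exactly five minutes after the request starts (00:04:25 → 00:09:25), which suggests a client-side read timeout (e.g. a 300-second HTTP timeout in the calling app) closing the connection, rather than LM Studio stopping on its own. A minimal sketch for telling the two cases apart when you capture the raw stream, assuming the OpenAI-compatible SSE chunk format that LM Studio serves (the helper name and sample data are mine, not from any of the tools above):

```python
import json


def stream_outcome(sse_lines):
    """Classify how a streamed chat completion ended.

    Returns the server-reported finish_reason ("stop", "length", ...)
    if the stream completed normally, or "disconnected" if the stream
    ended without a final chunk / [DONE] marker -- the signature of a
    client-side timeout or dropped connection.
    """
    finish_reason = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives / blank SSE lines
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            # Stream terminated cleanly by the server.
            return finish_reason or "stop"
        chunk = json.loads(payload)
        fr = chunk["choices"][0].get("finish_reason")
        if fr is not None:
            finish_reason = fr
    # Loop ended without [DONE]: the connection was cut mid-generation.
    return finish_reason or "disconnected"
```

If this returns "disconnected", the fix is on the caller's side: raise the read/idle timeout in whatever sits in front of the server (Dify, SillyTavern, a reverse proxy), not the model's context length or max-tokens settings.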