crush icon indicating copy to clipboard operation
crush copied to clipboard

Openrouter models stop at random points

Open paperbenni opened this issue 4 months ago • 5 comments

Describe the bug

Kimi free model on openrouter just stops at random points. It says it will do something, start doing it and then at random points stop giving output until reprompted I do realize the free models on openrouter likely use leftover computing capacity and are heavily rate limited. That said, Roo Code, Cline and Kilo Code do not have that issue. My guess is that there is some more sophisticated retry logic to work around that issue. I did see similar behavior with paid models like Gemini a while back, so it's not entirely limited to free models. I haven't spend a lot of time looking into it, but my best guess is that Openrouter rate limiting or provider switching is weird and clients need to do something to adapt. Zed has similar issues with openrouter

Setup

Arch Linux crush version v0.0.0-20250813204448-4fafd053208f openrouter free key

To Reproduce

  1. Auth on openrouter
  2. Choose Kimi K2: Free
  3. Instruct the agent to do something non-trivial
  4. Agent just stops after a few seconds without any explanation or error message

Expected behavior

The model continually does things until it is appropriate to stop

Screenshots

Image

paperbenni avatar Aug 14 '25 09:08 paperbenni

can you check what's going on in crush logs -f? might be useful to start crush in debug mode (crush --debug)

caarlos0 avatar Aug 14 '25 19:08 caarlos0

Had similar issues as well with different models. I can get it to resume using something like

Continue where you left off.

But if i try a shorter prompt like Continue it doesn't always work.

Maybe it would be best to send an API call where we ask the model Are you done? and only end the api call chains if it responds with Yes

It could probably be templated better; but i hope that describes it.

akumaburn avatar Aug 14 '25 22:08 akumaburn

Yes it happens with lot of models where they stop so building verification system where after each final message there will be another call to check whether the task is really completed if not it will continue automatically.

With claude code as well where there are lot of task it stop in between saying finished 5 todos out of 10. So built todo system with automatic verification and continue without user input

naresh-tech-backend avatar Aug 16 '25 11:08 naresh-tech-backend

Some of the screenshot for the reference. It has both todo and auto continue loop

Image Image

naresh-tech-backend avatar Aug 16 '25 14:08 naresh-tech-backend

I'm seeing this with Cerebras Code and Synthetic.new now too. It initially started happening with Kimi K2 Thinking on Synthetic. Now it's happening with GLM-4.6 in both Cerebras and Synthetic. Randomly stops without any indication as to why. I haven't run into it yet with Sonnet 4.5.

When this happens, the thinking bar will often freeze, so this seems to be a crush issue rather than an inference provider issue.

divmain avatar Nov 19 '25 21:11 divmain