Queries stuck after worker failures

Open igorcalabria opened this issue 3 years ago • 1 comments

Issue Summary

Sometimes queries get stuck "running" forever (until Redis expiration) after worker failures in Redash 10. Running the query again with no changes has no effect, because Redash thinks the job is already running.
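To illustrate why re-running has no effect, here is a minimal sketch of a deduplicating lock with an expiry. It is an in-memory model, not Redash's actual code: `TTLStore`, `enqueue_query`, and the `lock:` key format are hypothetical names made up for this example. The assumption is that a lock keyed by the query maps to the job id, and that a killed worker never gets the chance to clear it.

```python
import time


class TTLStore:
    """Tiny in-memory stand-in for a key-value store with expiring keys."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        # Remember the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired keys behave as if they were never set.
            del self._data[key]
            return None
        return value


def enqueue_query(store, query_hash, new_job_id, lock_ttl):
    """Return the job id the caller should poll (hypothetical helper)."""
    lock_key = f"lock:{query_hash}"  # illustrative key format only
    existing = store.get(lock_key)
    if existing is not None:
        # A lock exists, so the query is assumed to be running already.
        return existing
    store.set(lock_key, new_job_id, lock_ttl)
    return new_job_id
```

If the worker dies without releasing the lock, every later call for the same query returns the old job id until the lock expires, which matches the behavior described above.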

Steps to Reproduce

I'm not sure these steps are deterministic, but they reproduced the problem reliably in our pre-production environment.

  1. Refresh a dashboard with several queries that take more than a few seconds to run
  2. Forcibly kill the worker process

Some of the queries on the dashboard are now in this stuck state.

Technical details:

In the network tab, you can see Redash polling the job endpoint and getting the query back in the "Started" state. I double-checked that "remove_ghost_locks" was running, and according to the logs it didn't remove these queries. The main issue seems to be rq: the recent changelogs list a bunch of improvements to error handling. Upgrading to rq==1.10.1 seemed to fix this issue.
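For anyone debugging the same symptom, here is a rough sketch of the kind of check that would catch these jobs. It is not the real "remove_ghost_locks" and not rq's implementation: `JobInfo`, `find_ghost_locks`, and the heartbeat field are hypothetical. The assumption is that a job still reported as "started" whose worker has not sent a heartbeat within some timeout can be treated as dead.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobInfo:
    status: str            # e.g. "queued", "started", "finished", "failed"
    last_heartbeat: float  # seconds, on the same clock as `now`


def find_ghost_locks(
    locks: Dict[str, str],
    jobs: Dict[str, JobInfo],
    now: float,
    heartbeat_timeout: float,
) -> List[str]:
    """Return lock keys whose job is gone or looks abandoned."""
    ghosts = []
    for lock_key, job_id in locks.items():
        job = jobs.get(job_id)
        if job is None:
            # The lock points at a job that no longer exists.
            ghosts.append(lock_key)
        elif job.status == "started" and now - job.last_heartbeat > heartbeat_timeout:
            # Still marked as started, but the worker has gone silent.
            ghosts.append(lock_key)
    return sorted(ghosts)
```

A check that only looks for missing jobs would skip the second case, which is the state the stuck queries here appear to be in.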

  • Redash Version: 10.1
  • Browser/OS: Firefox/Linux
  • How did you install Redash: docker on kubernetes

igorcalabria avatar Jul 21 '22 11:07 igorcalabria

Related to #5797

We plan to update rq to gather some of these benefits. Thank you for reporting your experience that updating to 1.10.1 seemed to fix the issue for you 👌

susodapop avatar Jul 21 '22 23:07 susodapop