
hydra-queue-runner doesn't reconnect after a PostgreSQL server restart and stays stuck forever

Open delroth opened this issue 1 year ago • 0 comments

Describe the bug

After we restarted the PostgreSQL database for hydra.nixos.org, the queue runner got stuck logging that it had lost its connection and never reconnected on its own. I had to restart it manually.

Expected behavior

The queue runner recovers gracefully from a temporary PostgreSQL connection failure and reconnects on its own.

Hydra Server:

  • Version of Hydra: 00d30874da759eb0f44f446415b2469920ff41b5
  • Version of Nix Hydra is built against: from flake

Additional context

The following is logged every 10s, indefinitely:

Jan 12 22:14:13 rhea hydra-queue-runner[4016980]: main thread: Lost connection to the database server.
Jan 12 22:14:13 rhea hydra-queue-runner[4016980]: queue monitor: Lost connection to the database server.
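To illustrate the expected behavior, here is a minimal sketch of a reconnect loop with exponential backoff. It is not Hydra's code and does not use its database library: `ConnectionLost`, `run_with_reconnect`, `connect`, and `work` are hypothetical names chosen for the example, and Python is used only to keep the sketch short. The point is that the stale connection handle is discarded once the connection is reported lost, so the next attempt opens a fresh connection instead of reusing the dead one.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a driver's 'lost connection' error."""


def run_with_reconnect(connect, work, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Run work(conn), reconnecting with exponential backoff on failure.

    connect: callable returning a new connection (may raise ConnectionLost).
    work:    callable taking a connection (may raise ConnectionLost).
    sleep:   injectable so the loop can be exercised without real waiting.
    """
    conn = None
    delay = base_delay
    while True:
        try:
            if conn is None:
                # Open a fresh connection instead of reusing a dead handle.
                conn = connect()
            return work(conn)
        except ConnectionLost:
            # Drop the stale handle so the next iteration reconnects.
            conn = None
            sleep(delay)
            # Back off, but cap the wait so recovery stays reasonably prompt.
            delay = min(delay * 2, max_delay)
```

With a loop of this shape, a database restart would show up as a few "lost connection" log lines followed by normal operation once the server is reachable again, rather than the same message repeating indefinitely.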

delroth, Jan 12 '24 22:01