[bug]: ResourceWatcher race condition on `StopAsync()` and `OnError()`

Open PSanetra opened this issue 2 years ago • 0 comments

Describe the bug

It seems like there is a race condition between ResourceWatcher.OnError() and ResourceWatcher.StopAsync().

To reproduce

  1. Call ResourceWatcher.StartAsync()
  2. Wait for the ResourceWatcher to be connected
  3. Create a network issue (e.g. reset_peer with toxiproxy)
  4. Call ResourceWatcher.StopAsync()

StopAsync() puts the ResourceWatcher into a stopped state, but OnError() keeps trying to restart the ResourceWatcher until it succeeds, so the watcher never actually stops.

Expected behavior

Calling StopAsync() should guarantee that the ResourceWatcher stops and never emits a new event until StartAsync() is called again. This is a critical requirement for the LeaderAwareResourceWatcher.
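
To illustrate the guarantee I have in mind, here is a minimal sketch. It is not the SDK's actual ResourceWatcher code: `SketchWatcher`, `RunToken`, `ConnectCount` and `Demo` are hypothetical names, and the real connection logic is replaced by a counter. The idea is that every run gets its own CancellationTokenSource, and the error handler only reconnects if the run the failed connection belonged to has not been stopped in the meantime.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ResourceWatcher (assumption, not the SDK implementation).
public sealed class SketchWatcher
{
    private readonly object _gate = new();
    private CancellationTokenSource? _cts; // non-null only while the watcher is started

    // Number of (re)connect attempts; stands in for opening the real watch connection.
    public int ConnectCount { get; private set; }

    // Token of the current run, or CancellationToken.None while stopped.
    public CancellationToken RunToken
    {
        get { lock (_gate) { return _cts?.Token ?? CancellationToken.None; } }
    }

    public Task StartAsync()
    {
        lock (_gate)
        {
            if (_cts is not null) return Task.CompletedTask; // already started
            _cts = new CancellationTokenSource();
            Connect();
        }
        return Task.CompletedTask;
    }

    public Task StopAsync()
    {
        lock (_gate)
        {
            // Cancel the current run so that late error callbacks can detect the stop.
            _cts?.Cancel();
            _cts?.Dispose();
            _cts = null;
        }
        return Task.CompletedTask;
    }

    // Called when the connection fails. runToken is the token of the run
    // that the failed connection belonged to.
    public void OnError(Exception error, CancellationToken runToken)
    {
        lock (_gate)
        {
            // A stop that happened before or during error delivery wins:
            // never reconnect for a stopped or superseded run.
            if (_cts is null || runToken.IsCancellationRequested) return;
            Connect();
        }
    }

    // Only ever called while holding _gate.
    private void Connect() => ConnectCount++;
}

public static class Demo
{
    public static async Task<int> RunAsync()
    {
        var watcher = new SketchWatcher();
        await watcher.StartAsync();                 // connect attempt #1
        var staleToken = watcher.RunToken;          // token held by the in-flight connection
        await watcher.StopAsync();
        watcher.OnError(new Exception("connection reset by peer"), staleToken); // ignored
        return watcher.ConnectCount;                // still 1, no restart after stop
    }
}
```

Because StopAsync() and OnError() take the same lock, an error that arrives after the stop can no longer bring the watcher back, and an error that belongs to a previous run is ignored even if StartAsync() has been called again since.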

Additional Context

Version: 8.0.0-pre.29

PSanetra, Dec 04 '23 16:12