sdk-core
[Feature Request] Improve DX for when Core encounters server / network errors.
In #202 we tied the logging of retries to the retry config's max_retries.
I feel like a better experience would be to start logging as soon as Core fails to connect to the server:
[WARN] Failed to connect to Temporal server at {address}, trying again in 5 seconds...
[WARN] Failed to connect to Temporal server at {address}, trying again in 10 seconds...
[INFO] Connected to Temporal server at {address}
When the connection is lost, we should also log:
[WARN] Lost connection to Temporal server at {address}
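A minimal sketch of the proposed connect-time logging, assuming a doubling backoff that starts at 5 seconds (inferred from the sample messages above). Everything here is hypothetical and not part of sdk-core: the `Logger` trait, `VecLogger`, and `connect_with_logging` are illustrative names, and the `connect` and `sleep` closures stand in for whatever the real client uses.

```rust
use std::time::Duration;

/// Hypothetical log sink so the sketch stays dependency-free; a real
/// implementation would more likely use a logging or tracing macro.
pub trait Logger {
    fn warn(&mut self, msg: String);
    fn info(&mut self, msg: String);
}

/// Collects log lines in memory, which is convenient for tests.
#[derive(Default)]
pub struct VecLogger {
    pub lines: Vec<String>,
}

impl Logger for VecLogger {
    fn warn(&mut self, msg: String) {
        self.lines.push(format!("[WARN] {}", msg));
    }
    fn info(&mut self, msg: String) {
        self.lines.push(format!("[INFO] {}", msg));
    }
}

/// Calls `connect` at least once and at most `max_attempts` times.
/// Every failure that will be retried is logged immediately, rather than
/// only once the retry budget is exhausted. The delay starts at
/// `initial_delay` and doubles after each failure (5s, 10s, ...).
/// `sleep` is injected so the caller decides how to wait.
pub fn connect_with_logging<T, E>(
    address: &str,
    max_attempts: u32,
    initial_delay: Duration,
    mut connect: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
    log: &mut impl Logger,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match connect() {
            Ok(conn) => {
                log.info(format!("Connected to Temporal server at {}", address));
                return Ok(conn);
            }
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                log.warn(format!(
                    "Failed to connect to Temporal server at {}, trying again in {} seconds...",
                    address,
                    delay.as_secs()
                ));
                sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```

Injecting `sleep` keeps the backoff schedule testable without real waiting; a caller could pass `std::thread::sleep` directly, or an async equivalent in an async variant of this function.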
Initially, the PR retried namespace-not-found errors by default. That decision was reverted before the merge because it could cause problems when users deploy to production: their Workers would seem healthy while in practice standing idle, unable to make progress. I see the value of retrying in test scenarios, but I'm hesitant to make this the default SDK behavior.
I asked @Sushisource what the Go SDK does in this case, and he said that Go retries (almost?) any non-retriable error. We should list which non-retriable errors we actually want to retry and make this behavior consistent across our SDKs.
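One possible shape for such a list is an explicit opt-in allowlist, sketched below. The `ErrorKind` variants, the `RetryPolicy` type, and the choice of which kinds are retriable by default are all assumptions made for illustration; they do not reflect what any Temporal SDK currently does.

```rust
use std::collections::HashSet;

/// Hypothetical error categories; a real implementation would more likely
/// classify gRPC status codes or server error types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ErrorKind {
    Unavailable,
    NamespaceNotFound,
    PermissionDenied,
    InvalidArgument,
}

impl ErrorKind {
    /// Assumption for this sketch: only `Unavailable` is retriable by default.
    fn retriable_by_default(self) -> bool {
        matches!(self, ErrorKind::Unavailable)
    }
}

/// Holds the non-retriable kinds that the user explicitly opted into retrying.
pub struct RetryPolicy {
    retry_anyway: HashSet<ErrorKind>,
}

impl RetryPolicy {
    /// Production-style default: non-retriable errors fail fast.
    pub fn strict() -> Self {
        RetryPolicy { retry_anyway: HashSet::new() }
    }

    /// Opt in to retrying specific non-retriable kinds, e.g. in tests.
    pub fn also_retrying(kinds: &[ErrorKind]) -> Self {
        RetryPolicy { retry_anyway: kinds.iter().copied().collect() }
    }

    /// An error is retried if it is retriable by default or was opted in.
    pub fn should_retry(&self, kind: ErrorKind) -> bool {
        kind.retriable_by_default() || self.retry_anyway.contains(&kind)
    }
}
```

With this shape, a test setup could opt in via `RetryPolicy::also_retrying(&[ErrorKind::NamespaceNotFound])`, while production Workers keep the strict default and surface a missing namespace immediately instead of idling.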