nephos
Execution of commands should be robust to timeouts on the Helm side
Occasionally, commands fail with a Kubernetes/Helm timeout. We should have a robust mechanism that keeps execution from breaking on the first failure (e.g. some kind of exponential backoff, or using execute_until_success with a limited number of tries).
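To make the backoff suggestion above a bit more concrete, here is a rough sketch of what a retry wrapper could look like. Everything in it is an assumption for illustration: `execute_with_backoff`, its parameters, and the `(result, error)` return shape of the runner are hypothetical and not taken from the nephos code base.

```python
import subprocess
import time


def execute_with_backoff(command, max_tries=5, base_delay=1.0, factor=2.0,
                         run=None, sleep=time.sleep):
    """Run a shell command, retrying with exponential backoff on failure.

    Hypothetical helper, not part of nephos. Returns the command output on
    success and raises RuntimeError once max_tries attempts have failed.
    """
    if run is None:
        # Assumed default runner: returns (stdout, error), where error is
        # None when the command exits with status 0.
        def run(cmd):
            proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
            return proc.stdout, (proc.stderr if proc.returncode != 0 else None)

    delay = base_delay
    error = None
    for attempt in range(1, max_tries + 1):
        result, error = run(command)
        if error is None:
            return result
        if attempt < max_tries:
            # Wait before the next try, then grow the delay geometrically.
            sleep(delay)
            delay *= factor
    raise RuntimeError(f"{command!r} failed after {max_tries} tries: {error}")
```

A call such as `execute_with_backoff("helm status my-release", max_tries=4)` would then wait 1, 2 and 4 seconds between attempts before giving up (the release name is a placeholder). The `run` and `sleep` arguments are only there so the wrapper can be tested without a cluster.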
Can you please provide a method to reproduce this bug?
Typically, this shows up in steps that depend on networking inside the Kubernetes cluster. If I remember correctly, it happens in cases where we need to access the CA to create crypto-material, etc.