Wait for cloud-init on dstack-gateway before attempting any operations
Current
After connecting to dstack-gateway via SSH, dstack-server attempts to update the gateway with update.sh or to configure it by calling the /api/config endpoint. However, dstack-gateway's installation and setup via cloud-init may not have finished by that point. This leads to unclear dstack-server errors like
Failed to configure gateway 35.202.8.178: ReadError('')
or
Failed to update gateway 35.202.8.178: /bin/sh: 0: cannot open dstack/update.sh: No such file
Proposed
- After establishing each SSH connection to dstack-gateway, ensure that cloud-init has finished by running cloud-init status --wait
- Check the output of cloud-init status and report an error to the user if cloud-init was not successful
- Add a timeout for waiting for cloud-init status and report an error to the user if the timeout is reached
- Remove the retry logic when configuring dstack-gateway or reduce the number of attempts
This should improve the user experience, facilitate troubleshooting, and prevent bugs.
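The first three proposed steps could look roughly like the sketch below. It is not dstack's actual code: `RunCommand`, `CloudInitError`, `parse_cloud_init_status`, and `wait_for_cloud_init` are hypothetical names, and `run` stands in for whatever dstack-server uses to execute a command over the existing SSH connection. It also assumes that coreutils `timeout` is available on the gateway and that `cloud-init status --wait` ends its output with a `status: ...` line.

```python
from typing import Callable, Tuple

# Hypothetical interface: executes a shell command on the gateway over the
# already established SSH connection and returns (exit_code, stdout).
RunCommand = Callable[[str], Tuple[int, str]]

# coreutils `timeout` exits with 124 when the time limit is reached.
TIMEOUT_EXIT_CODE = 124


class CloudInitError(Exception):
    """Raised when cloud-init failed or did not finish in time."""


def parse_cloud_init_status(output: str) -> str:
    """Return the value of the `status:` line, skipping progress dots."""
    for line in output.splitlines():
        line = line.strip().lstrip(".")
        if line.startswith("status:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


def wait_for_cloud_init(
    run: RunCommand, hostname: str, timeout_seconds: int = 300
) -> None:
    """Block until cloud-init is done; raise a readable error otherwise."""
    exit_code, output = run(
        f"timeout {timeout_seconds} cloud-init status --wait"
    )
    if exit_code == TIMEOUT_EXIT_CODE:
        raise CloudInitError(
            f"Gateway {hostname}: cloud-init did not finish "
            f"within {timeout_seconds}s"
        )
    # Assumption: the status line alone decides success; the exit code is
    # only included in the message to help troubleshooting.
    status = parse_cloud_init_status(output)
    if status != "done":
        raise CloudInitError(
            f"Gateway {hostname}: cloud-init finished with status "
            f"{status!r} (exit code {exit_code})"
        )
```

Calling `wait_for_cloud_init` right after each SSH connection is established, before update.sh or /api/config is touched, would turn the errors quoted above into a message that names cloud-init as the cause.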
After #1236 we give the gateway more than enough time to install and set up. If it takes more time for some reason, then we should fix the underlying problem. This issue only addresses the error messages, so I'd classify it as minor.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.
@jvstme is this issue still valid?
@peterschmidt85, yes