Tentacle upgrade should block machines instead of tasks
Currently, a Tentacle upgrade blocks the tasks queued behind it so that they do not attempt to run on a restarting Tentacle. The blocking is somewhat selective but clumsy. If there is a problem with the Tentacle upgrade (for example, updating Calamari gets stuck), the Tentacle upgrade task can block other tasks indefinitely.
We have a script isolation mutex, but it does not help Tentacle upgrade because each step of the upgrade takes out the mutex separately. Another task can acquire the mutex while Tentacle upgrade is between steps and run on a restarting Tentacle.
I think this issue can be resolved by wrapping the entire Tentacle upgrade process in the script isolation mutex. Each script step will need to be run without acquiring the mutex, which I believe is different to anything we do currently. With this approach, if a machine is blocking the Tentacle upgrade from progressing, only tasks that involve that particular machine will be blocked rather than the entire task queue.
I ran into this issue today. We had a Tentacle upgrade hang and block deployments to our auto scaling infrastructure. Hopefully it was a one-time occurrence, but ideally it would never be an issue.
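To make the proposal concrete, here is a minimal sketch of the idea in Python. It is not Octopus Server code: `machine_lock`, `run_script`, `upgrade_tentacle`, and the `acquire_mutex` flag are all hypothetical names, and an in-process `threading.Lock` per machine stands in for the script isolation mutex. The upgrade holds one machine's lock across every step, and each step runs without re-acquiring it.

```python
import threading
from collections import defaultdict

# Hypothetical stand-in for the script isolation mutex: one lock per machine.
_machine_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def machine_lock(machine_id):
    """Return the lock for one machine, creating it on first use."""
    with _registry_guard:
        return _machine_locks[machine_id]


def run_script(machine_id, script, log, acquire_mutex=True):
    """Run one script step. Ordinary tasks take the machine's mutex per step."""
    if acquire_mutex:
        with machine_lock(machine_id):
            log.append((machine_id, script))
    else:
        # The caller already holds this machine's mutex for the whole operation.
        log.append((machine_id, script))


def upgrade_tentacle(machine_id, steps, log):
    """Hold the machine's mutex across every upgrade step, not per step."""
    with machine_lock(machine_id):
        for step in steps:
            run_script(machine_id, step, log, acquire_mutex=False)
```

Under these assumptions, a hung upgrade on one machine would only hold that machine's lock, so a task targeting a different machine could still proceed.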
Another report: https://help.octopus.com/t/upgrade-all-tentacles-prevents-any-deployment-task-to-be-executed/19766
Another report (private link): https://secure.helpscout.net/conversation/605611169/28330?folderId=557077
Another report (private link): https://secure.helpscout.net/conversation/760047726/38240?folderId=2271904
Hi, we raised this on Thursday, June 21, 2018 9:44 AM after upgrading to 2018.6.5
We are now on LTS (2018.10.0) and this is still occurring.
Whenever a version upgrade requires a Tentacle upgrade, our deployment lead time is pushed out significantly because we need to plan the upgrade out of hours. We have a large infrastructure, so upgrading 10K Tentacles in one server task slows down our pipeline if all other tasks are queued behind the upgrade.
Another report of this issue: https://octopus.zendesk.com/agent/tickets/67509
Note that Tentacle upgrades are not required to deploy. We support deploying to Tentacle 3.0.
I suggest working around this problem by setting an "outage" time to upgrade the Tentacles and kicking it off then. Perhaps do it in batches. I realize this may be annoying with large installs, so we will keep this issue open.
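As a rough illustration of the batching suggestion, the helper below splits a list of machine identifiers into fixed-size groups. The function name `batches` and the machine names are hypothetical, and how each batch is actually submitted as an upgrade task depends on your own tooling, so that part is deliberately left out.

```python
def batches(machine_ids, size):
    """Yield consecutive slices of at most `size` machine ids."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(machine_ids), size):
        yield machine_ids[start:start + size]


# Example: five hypothetical machines upgraded two at a time.
for batch in batches(["web-01", "web-02", "web-03", "web-04", "web-05"], 2):
    print(batch)
```

Smaller batches mean that a stuck upgrade holds up fewer machines at once, at the cost of more upgrade tasks to start and watch.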
Another report (private link): https://octopus.zendesk.com/agent/tickets/84752
Another report (Internal ticket) - https://octopus.zendesk.com/agent/tickets/113079