for-azure
for-azure copied to clipboard
Update doesn't work on edge channel
I want to upgrade to Docker 17.05.0-ce but the upgrade.sh script fails.
Expected behavior
upgrade.sh https://download.docker.com/azure/edge/Docker.tmpl
Actual behavior
upgrade.sh https://download.docker.com/azure/edge/Docker.tmpl
executing upgrade on d12eb6d1e505
File "/usr/bin/azupgrade.py", line 402
subprocess.check_output(["docker", "node", "demote", node_id])
^
IndentationError: unindent does not match any outer indentation level
Information
Client: Version: 17.04.0-ce API version: 1.28 Go version: go1.7.5 Git commit: 4845c56 Built: Tue Apr 4 00:37:25 2017 OS/Arch: linux/amd64
Server: Version: 17.04.0-ce API version: 1.28 (minimum version 1.12) Go version: go1.7.5 Git commit: 4845c56 Built: Tue Apr 4 00:37:25 2017 OS/Arch: linux/amd64 Experimental: false
We have a swarm cluster with 3 masters and 2 workers. I called the script on one manager node (not the master).
Not working at all. Where's the azupgrade.py in this project?
Thanks for reporting the issue. We will publish a fix/workaround shortly. @vovimayhem the script is in the guide container running in the manager nodes and needs to be patched up.
Are there any updates on this? :)
Yes! @manixx You can now kick off an upgrade to docker to 17.05 (from 17.04) using the following command/container (rather than upgrade.sh earlier)
docker run -v /var/run/docker.sock:/var/run/docker.sock -v /usr/bin/docker:/usr/bin/docker -ti docker4x/upgrade-azure:17.05.0-ce-azure2
We will be updating the docs with the new mechanism once 17.06 is out.
Sorry to reopen this, the update didn't worked on our cluster. :/ We have a 5 node cluster (3 masters, 2 worker).
The update procedure seemed fine (updated each node after another) but after the update, the first manager (the one where I started the update) didn't reconnect to the cluster by itself. The other two masters connected but the I was unable to ask the status (docker node ls, it caused an context deadline exceeded) and the Docker version was still 17.04 on all machines.
Now i reimaged all machine to its inital state, started the the Swarm mode on the first master node manually and restarted all other machines.
@manixx If you navigate to the Deployment under the resource group in the Azure portal, do you see any errors? From your description of symptoms it seems very likely the initial deployment update (before the node updates) could not be correctly applied.
Also, what was the environment you were upgrading from? Did you deploy your initial swarm using the link from docs.docker.com (i.e. not through cloud)
In the Deployments-Tab there were no errors, but i found an "Write Deployments" error inside the Activity-Log of the manager scale set.
Deployment template validation failed: 'The template parameters 'registryLocation, registryName, adminUserEnabled, registrySku, storageAccountSku, storageAccountName, registryApiVersion' in the parameters file are not valid; they are not present in the original template and can therefore not be provided at deployment time. The only supported parameters for this template are 'adServicePrincipalAppID, adServicePrincipalAppSecret, enableSystemPrune, managerCount, managerVMSize, sshPublicKey, swarmName, workerCount, workerVMSize'. Please see https://aka.ms/arm-deploy/#parameter-file for usage details.'.
This is the only error in the log.
Initially I installed the Edge Channel from the official Docker page. If I can help you just let me know! :)
Well that's a very interesting error. It seems to indicate that somehow you tried to deploy the default template associated with the independent Docker4Azure VMs rather than the Docker4Azure swarm template. How does the timestamp of the above correlate with when you tried the upgrade?
For the Manager VMSS resources, do you see anything in the Activity Log where the Operation Name is "Manual Upgrade"?
Also just to confirm, your Deployment status for the resource group in the Azure portal says "Succeeded" even after the failed upgrade attempt, correct?
Initially (the first deployment of the resource group) we used this template to setup the cluster.
I do not see any Manual Upgrade in the Activity log sadly :/
Yes exactly. There is only the inital deployment there.
If you do not see any ManualUpgrade events, I would guess the upgrade script issued the overall deployment upgrade API call but Azure simply returned success but failed a bit later - a condition I have noticed very occasionally. If you get a chance, can you retry the upgrade again and keep an eye on the initial logs from the upgrade as well as the Deployment tabs? You can get the update logs anytime by issuing docker logs editions_guide after running the upgrade container.
I will re-run the update again in two weeks (around the 4th july) and I'll give a the exact logs of the container! :) I must be careful, because the last time i had to re-install all running containers. I'll give you feedback then!
Thank you @manixx and appreciate your help and feedback with investigating the issue.
I did find a bug where VM enumeration from the Azure side seems to have changed slightly leading to none of the upgrades actually taking place - the upgrade script/container would go through very quickly and exit. This has been fixed in docker4x/upgrade-azure:17.06.0-ce-aws1
In my case I had additional Deployments attached to the resource group and it used the wrong one:
Unable to find image 'docker4x/upgrade-azure:17.06.0-ce-azure1' locally
17.06.0-ce-azure1: Pulling from docker4x/upgrade-azure
019300c8a437: Pull complete
4d77251a915d: Pull complete
99cc8ec5e0d8: Pull complete
30591f8a8c96: Pull complete
0238be837d6e: Pull complete
aa3b9c543797: Pull complete
5120ca55ece4: Pull complete
22ca435c3db0: Pull complete
78a6e8adcb99: Pull complete
dad4f7185291: Pull complete
Digest: sha256:c9dd5a6416388e1cdf840cf5af010523a76df4d22ececf0c328e6fc8ad3ca108
Status: Downloaded newer image for docker4x/upgrade-azure:17.06.0-ce-azure1
Copying upgrade script ...
Kicking off upgrade to https://download.docker.com/azure/stable/17.06.0/Docker.tmpl ...
INFO: Validate Template URL to upgrade to
INFO: Initiating upgrade. Create queue to prevent another simultaneous upgrade.
INFO: Updating Resource Group template. This will take several minutes. You can follow the status of the upgrade below or from the Azure console using the URL below:
INFO: https://portal.azure.com/#resource/subscriptions/77d0f8b4-ec1d-4515-8f90-ff225cd87243/resourceGroups/docker_swarm/overview
INFO: Updating Resource Group: docker_swarm
INFO: Inspecting deployment: Microsoft.VirtualNetworkGateway-20170630102333 at state Succeeded
INFO: Inspecting deployment: Microsoft.VirtualNetworkGateway-20170630092102 at state Succeeded
INFO: Inspecting deployment: Microsoft.Template at state Succeeded
INFO: Found deployment: Microsoft.VirtualNetworkGateway-20170630102333 deployed at 2017-06-30 08:56:16.330641+00:00
The upgrade went through, but I was on the same docker version as before.
I deleted those deployments and after that the upgrade worked as expected. So maybe a check should be added to get the deployment with the correct name/type or something.
@tisoft thanks for reporting the multi deployment scenario. I will look at providing a way to the user to select the deployment they want upgraded.
@ddebroy Sadly I wasn't able to re-run the update script again. We already updated to the Stable channel. Sorry.