Upgrade issue when bringing in kube-registry-proxy
In order to switch over from our in-house registry-proxy to the official/upstream kube-registry-proxy (as the original PR, https://github.com/deis/workflow/pull/734, proposed), we will need to sort out the following upgrade issue.
Testing of the v2.12.0 release candidate showed that, starting from a Workflow install that uses the in-house variant of deis-registry-proxy (say, v2.11.0), an upgrade (helm upgrade luminous-hummingbird workflow-staging/workflow --version v2.12.0) removes the deis-registry-proxy pod as expected, but the new luminous-hummingbird-kube-registry-proxy DaemonSet sometimes fails to schedule any pods due to a host port conflict:
$ helm ls
NAME                   REVISION   UPDATED                    STATUS     CHART              NAMESPACE
luminous-hummingbird   4          Wed Mar 8 14:01:02 2017    DEPLOYED   workflow-v2.12.0   deis
$ kd get po,ds
NAME                                        READY     STATUS    RESTARTS   AGE
po/deis-builder-574483744-qnf44             1/1       Running   0          24m
po/deis-controller-3953262871-jqkmd         1/1       Running   2          24m
po/deis-database-83844344-m5x4x             1/1       Running   0          24m
po/deis-logger-176328999-d7fxc              1/1       Running   9          1h
po/deis-logger-fluentd-0hqfs                1/1       Running   0          1h
po/deis-logger-fluentd-drfh6                1/1       Running   0          1h
po/deis-logger-redis-304849759-nbrdp        1/1       Running   0          1h
po/deis-minio-676004970-g2bj9               1/1       Running   0          1h
po/deis-monitor-grafana-432627134-87b1z     1/1       Running   0          24m
po/deis-monitor-influxdb-2729788615-q67f9   1/1       Running   0          25m
po/deis-monitor-telegraf-6q562              1/1       Running   0          1h
po/deis-monitor-telegraf-rzwnv              1/1       Running   6          1h
po/deis-nsqd-3597503299-94nhx               1/1       Running   0          1h
po/deis-registry-756475849-v0rmw            1/1       Running   0          24m
po/deis-router-1001573613-mk07g             1/1       Running   0          13m
po/deis-workflow-manager-1013677227-kh5vt   1/1       Running   0          25m

NAME                                          DESIRED   CURRENT   READY     NODE-SELECTOR   AGE
ds/deis-logger-fluentd                        2         2         2         <none>          1h
ds/deis-monitor-telegraf                      2         2         2         <none>          1h
ds/luminous-hummingbird-kube-registry-proxy   0         0         0         <none>          24m
$ kd describe ds luminous-hummingbird-kube-registry-proxy
Name:           luminous-hummingbird-kube-registry-proxy
Image(s):       gcr.io/google_containers/kube-registry-proxy:0.4
Selector:       app=luminous-hummingbird-kube-registry-proxy
Node-Selector:  <none>
Labels:         chart=kube-registry-proxy-0.1.0
                heritage=Tiller
                release=luminous-hummingbird
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Misscheduled: 0
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
  FirstSeen   LastSeen   Count   From                      SubObjectPath   Type     Reason            Message
  ---------   --------   -----   ----                      -------------   ----     ------            -------
  25m         25m        2       {daemonset-controller }                   Normal   FailedPlacement   failed to place pod on "k8s-agent-fbf26383-0": host port conflict
  25m         25m        2       {daemonset-controller }                   Normal   FailedPlacement   failed to place pod on "k8s-master-fbf26383-0": host port conflict
Let's see if we can distill this into a base case which we can hopefully ship upstream to Helm as a PR plus a functional test; a rough sketch of one possible base case follows.
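As a sketch only (the chart names, DaemonSet names, and port 5555 below are placeholders I'm making up, not values taken from the real charts), the base case could be two versions of a tiny chart whose only difference is that the DaemonSet binding the host port gets renamed, mimicking the deis-registry-proxy -> kube-registry-proxy swap:

$ # v0.1.0 of the test chart ships a DaemonSet named "old-proxy" with hostPort 5555
$ helm install ./hostport-test-0.1.0.tgz --name hostport-test
$ kubectl get ds,po -l release=hostport-test    # old-proxy pods should land on every node
$
$ # v0.2.0 drops "old-proxy" and adds a DaemonSet "new-proxy" on the same hostPort 5555
$ helm upgrade hostport-test ./hostport-test-0.2.0.tgz
$ kubectl describe ds new-proxy                 # does this show the same "host port conflict"?

If that reproduces, an upstream functional test could assert that the replacement DaemonSet gets placed once the old DaemonSet's pods have actually released the port.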
It is possible that this is due to a k8s regression (I have been running v1.5.x in my testing); perhaps related: https://github.com/kubernetes/kubernetes/issues/23013
Adding this to the v2.15 milestone. We'll want to re-try this on a v1.6.x cluster. As it stands, we've added deis/registry-proxy back into CI as features have come in with the Workflow v2.14 milestone.
This issue was moved to teamhephy/workflow#27