Knative should not modify the istio-proxy container spec
Context
Istio provides two mechanisms for customising the spec of the istio-proxy sidecar container.
Limited customisation is possible via annotations (e.g. sidecar.istio.io/proxyCPU),
but you can also include a container named 'istio-proxy' with image: 'auto' in your pod spec.
The istio injection webhook then merges this with the rest of the config required for the sidecar to work correctly.
So you end up with a pod spec that looks something like:
apiVersion: v1
kind: Pod
metadata:
  name: foobar
spec:
  containers:
  - name: my-app
    image: "foobar:latest"
  - name: istio-proxy
    image: 'auto'
    resources: {...} # etc
The problem is that in knative-serving this container gets passed through makeContainer and the containerMask.
Most of the time these modifications are pretty benign and don't cause any problems: a few superfluous env vars etc., no big deal.
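To give a sense of what that masking pass does, here is a minimal, purely illustrative sketch of a field-mask style copy; the real ContainerMask lives under pkg/apis/serving and allows a different set of fields:

// Illustrative only: copy just the fields treated as user-settable and drop
// everything else from the supplied container spec.
func maskUserContainer(in corev1.Container) corev1.Container {
    return corev1.Container{
        Name:            in.Name,
        Image:           in.Image,
        Command:         in.Command,
        Args:            in.Args,
        Env:             in.Env,
        EnvFrom:         in.EnvFrom,
        Resources:       in.Resources,
        Ports:           in.Ports,
        VolumeMounts:    in.VolumeMounts,
        ReadinessProbe:  in.ReadinessProbe,
        LivenessProbe:   in.LivenessProbe,
        SecurityContext: in.SecurityContext,
    }
}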
However...
I am trying to enable the istio/kubernetes native sidecars functionality.
When this is active, the istio injector mutating webhook adds a default lifecycle to the istio-proxy container.
This is incompatible with the lifecycle added by knative: you end up with both httpGet and exec defined in the preStop handler, and the cluster rejects the pod creation.
lifecycle:
  preStop:
    httpGet:
      path: /wait-for-drain
      port: 8022
      scheme: HTTP
    exec:
      command:
      - pilot-agent
      - request
      - --debug-port=15020
      - POST
      - drain
spec.initContainers[1].lifecycle.preStop.httpGet: Forbidden: may not specify more than 1 handler type
You could argue that the Istio mutating webhook should remove the httpGet field to ensure the preStop is valid.
But ideally knative shouldn't be interfering with this container spec either.
I've tested a quick fix in one of our dev clusters by adding this to the BuildUserContainers function:
// Leave the mesh-injected placeholder untouched so the Istio injection
// webhook stays the sole owner of its spec.
if rev.Spec.PodSpec.Containers[i].Name == "istio-proxy" {
    containers = append(containers, rev.Spec.PodSpec.Containers[i])
    continue
}
which works fine but there might be a better solution you would like to implement.
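For context, here is a simplified, illustrative sketch of how that early continue slots into the container loop; this is not the actual BuildUserContainers body, just the shape of the change:

// Illustrative only: simplified loop over the Revision's containers.
containers := make([]corev1.Container, 0, len(rev.Spec.PodSpec.Containers))
for i := range rev.Spec.PodSpec.Containers {
    // Pass the mesh-injected placeholder through unmodified so the Istio
    // injection webhook remains the sole owner of its spec.
    if rev.Spec.PodSpec.Containers[i].Name == "istio-proxy" {
        containers = append(containers, rev.Spec.PodSpec.Containers[i])
        continue
    }
    containers = append(containers, makeContainer(rev.Spec.PodSpec.Containers[i], rev))
}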
What version of Knative?
1.13.0
Expected Behavior
istio-proxy container should be untouched
Actual Behavior
Knative modifies the istio-proxy container spec, resulting in an invalid pod spec that prevents pods from being launched.
Steps to Reproduce the Problem
Install istio and knative.
Enable native sidecars by setting values.pilot.env.ENABLE_NATIVE_SIDECARS=true in Helm.
Any knative services will then fail to create pods.
Hi @hamishforbes, makeContainer does the following:
func makeContainer(container corev1.Container, rev *v1.Revision) corev1.Container {
    // Adding or removing an overwritten corev1.Container field here? Don't forget to
    // update the fieldmasks / validations in pkg/apis/serving
    container.Lifecycle = userLifecycle
    container.Env = append(container.Env, getKnativeEnvVar(rev)...)
    // Explicitly disable stdin and tty allocation
    container.Stdin = false
    container.TTY = false
    if container.TerminationMessagePolicy == "" {
        container.TerminationMessagePolicy = corev1.TerminationMessageFallbackToLogsOnError
    }
    if container.ReadinessProbe != nil {
        if container.ReadinessProbe.HTTPGet != nil || container.ReadinessProbe.TCPSocket != nil || container.ReadinessProbe.GRPC != nil {
            // HTTP, TCP and gRPC ReadinessProbes are executed by the queue-proxy directly against the
            // container instead of via kubelet.
            container.ReadinessProbe = nil
        }
    }
    return container
}
Once you put a container in the Knative spec it is kind of expected to be managed by Knative, and not all projects conform to that, so I view this issue as an enhancement request for the Istio integration. I suspect we could keep some of the logic and skip the lifecycle and readinessProbe overrides for this special case (since we are integrating with Istio anyway), e.g. have a special annotation that selects a specific container as a proxy, or use the convention of the name above. I am wondering though if you could fix this via another webhook that modifies the pod (webhooks are applied in alphabetical order afaik, so it would be a bit of a hack). @dprotaso @ReToCode anything to add here?
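For illustration, a rough sketch of what that special case could look like inside makeContainer. The container-name constant and the annotation key below are hypothetical (neither exists in Knative today); the rest mirrors the function quoted above:

// Hypothetical names for illustration only.
const (
    sidecarProxyContainerName = "istio-proxy"
    sidecarProxyAnnotationKey = "features.knative.dev/mesh-proxy-container"
)

// isMeshProxy reports whether a user-supplied container should be left for the
// mesh injection webhook to manage, either by name convention or by an opt-in
// annotation on the Revision.
func isMeshProxy(container corev1.Container, rev *v1.Revision) bool {
    return container.Name == sidecarProxyContainerName ||
        rev.Annotations[sidecarProxyAnnotationKey] == container.Name
}

func makeContainer(container corev1.Container, rev *v1.Revision) corev1.Container {
    meshProxy := isMeshProxy(container, rev)
    if !meshProxy {
        container.Lifecycle = userLifecycle
    }
    container.Env = append(container.Env, getKnativeEnvVar(rev)...)
    // Explicitly disable stdin and tty allocation
    container.Stdin = false
    container.TTY = false
    if container.TerminationMessagePolicy == "" {
        container.TerminationMessagePolicy = corev1.TerminationMessageFallbackToLogsOnError
    }
    if !meshProxy && container.ReadinessProbe != nil {
        if container.ReadinessProbe.HTTPGet != nil || container.ReadinessProbe.TCPSocket != nil || container.ReadinessProbe.GRPC != nil {
            // HTTP, TCP and gRPC ReadinessProbes are executed by the queue-proxy
            // directly against the container instead of via kubelet.
            container.ReadinessProbe = nil
        }
    }
    return container
}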
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
/remove-lifecycle stale still an issue