Restrict webhook access to kube-system API server

Open ViciousEagle03 opened this issue 9 months ago • 8 comments

Pull Request Template for Kubeflow Manifests

✏️ Summary of Changes

Modified the following files in the common/networkpolicies/base directory:

  • kserve.yaml
  • pvcviewer-webhook.yaml
  • spark-operator-webhook.yaml
  • training-operator-webhook.yaml

Added a restriction so that the webhooks accept traffic only from the API server in the kube-system namespace.

Fixed YAML indentation issues that had gone undetected because the workflow was not triggered on commits modifying the kserve.yaml and pvcviewer-webhook.yaml files.

📦 Dependencies

List any dependencies or related PRs (e.g., "Depends on #123").

🐛 Related Issues

  • Fixes #2928

✅ Contributor Checklist


You can join the CNCF Slack and access our meetings at the Kubeflow Community website. Our channel on the CNCF Slack is #kubeflow-platform.

ViciousEagle03 avatar Mar 04 '25 11:03 ViciousEagle03

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign juliusvonkohout for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

google-oss-prow[bot] avatar Mar 04 '25 11:03 google-oss-prow[bot]

Thank you for the PR. Do you know whether

  - from:
    - podSelector:
        matchLabels:
          component: kube-apiserver
      namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system

works on all popular clusters, such as AKS, EKS, GKE, KIND, Minikube, and Rancher?

juliusvonkohout avatar Mar 04 '25 13:03 juliusvonkohout

@juliusvonkohout I did some research on locally hosted and managed clusters (GKE, AKS, EKS...) like the ones you listed above, but I couldn't find any clear issues with this approach. Are you suggesting using matchExpressions instead of matchLabels for more flexibility, or are you recommending exposing the entire namespace rather than targeting the specific API server? I'm a bit lost; could you point me in the right direction?

Edit: Also, I will add the restriction to the other network policies once we finalize the implementation to enforce the restrictions.
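
For reference, the two alternatives asked about above could be sketched roughly as follows. This is only an illustration, not something from the PR: the policy names and the `app: example-webhook` label are hypothetical placeholders, and neither variant has been tested on the clusters listed. Variant A expresses the same match with matchExpressions; Variant B drops the podSelector and admits traffic from any pod in kube-system.

```yaml
# Variant A: matchExpressions in place of matchLabels.
# With a single "In" value this selects the same pods as the matchLabels form.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-webhook-match-expressions  # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: example-webhook  # hypothetical label
  policyTypes:
    - Ingress
  ingress:
    - from:
        # podSelector and namespaceSelector in one entry are ANDed together.
        - podSelector:
            matchExpressions:
              - key: component
                operator: In
                values:
                  - kube-apiserver
          namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
---
# Variant B: allow the whole kube-system namespace (no podSelector).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-webhook-namespace-only  # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: example-webhook  # hypothetical label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
```

Both variants still rely on the API server traffic originating from a pod in kube-system, so they share that assumption with the original selector.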

ViciousEagle03 avatar Mar 04 '25 19:03 ViciousEagle03

I would just like to get this tested on GKE, AKS, EKS and rancher before we merge it.

@tarekabouzeid @varodrig can you help with that?

juliusvonkohout avatar Mar 05 '25 10:03 juliusvonkohout

"... to get this tested on GKE, AKS, EKS and rancher before we merge it."

Can we include OpenShift as well?

varodrig avatar Mar 05 '25 14:03 varodrig

Yes, please test on what you have available and report back.

juliusvonkohout avatar Mar 05 '25 15:03 juliusvonkohout

@ViciousEagle03 Sorry for the delay. We might only merge this after the 1.10.0 release.

juliusvonkohout avatar Mar 19 '25 12:03 juliusvonkohout

I am just waiting for test feedback to merge this.

juliusvonkohout avatar Apr 07 '25 15:04 juliusvonkohout

This selector works well on self-hosted clusters such as KIND, Minikube and Rancher. In these environments, the control plane components typically run as static pods directly on the control-plane nodes. These pods are visible within the cluster's kube-system namespace and carry the standard component: kube-apiserver label.

However, I think the selector will not work on all popular clusters, especially the major managed cloud services. The point of failure is the podSelector targeting component: kube-apiserver, whose success depends entirely on how a given Kubernetes distribution runs its control plane. On managed services such as AKS, EKS and GKE, the control plane, including the kube-apiserver, is operated by the cloud provider and runs outside of your cluster. Since no pod with the label component: kube-apiserver exists inside the cluster, the podSelector matches nothing, and the network policy rule does not admit API server traffic as intended.

To create a network policy that reliably allows traffic from the control plane across all cluster types, you can try an ipBlock, which specifies the IP address range of the control plane instead of selecting a pod. For managed clusters, the IP range of the control plane may be available from the cloud provider's documentation. For self-hosted clusters, it would be the IP addresses of your control-plane nodes. Nevertheless, getting the correct CIDR is often not feasible, and sometimes the CIDR is not static.

Example:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-control-plane
spec:
  podSelector:
    matchLabels:
      app: my-webhook-application
  ingress:
  - from:
    - ipBlock:
        cidr: 172.16.0.0/28 # Getting the correct CIDR is often not feasible and sometimes the CIDR is not static.
  policyTypes:
  - Ingress

juliusvonkohout avatar Sep 23 '25 18:09 juliusvonkohout

closed due to inactivity

juliusvonkohout avatar Oct 28 '25 13:10 juliusvonkohout