Run Windows node CSI Drivers as HostProcess containers

Open mauriciopoppe opened this issue 3 years ago • 11 comments

The HostProcess container feature became beta in Kubernetes 1.23. We'd like to leverage it in CSI Drivers, which will run as privileged jobs on the Windows host. The design doc will have more details about the transition steps.

By making the CSI Driver a HostProcess pod we no longer need the binary in the client/server model (although we will still support it). There are some items for the maintenance of the current client/server model of CSI Proxy.

Tasks

  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/222
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/223
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/224
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/225
  • [x] Design doc: Using Windows HostProcess containers in CSI Drivers running in Windows nodes @mauriciopoppe created https://github.com/kubernetes/enhancements/issues/3636
  • [x] Blogpost draft: https://docs.google.com/document/d/18WwuKSONG_qkD3vaoppL_ikeDL0Ms0Wwv7HsyN8Oel0/edit

Items for the refactor of CSI Proxy to become a go library:

Tasks

  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/226
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/227
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/232
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/234
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/235
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/236
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/237
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/238
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/239
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/241
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/242
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/243
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/246
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/248
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/249
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/250
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/253
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/265
  • [x] https://github.com/kubernetes-csi/csi-proxy/issues/264

/assign @mauriciopoppe

mauriciopoppe avatar Sep 15 '22 01:09 mauriciopoppe

@mauriciopoppe if the work to migrate to hostprocess has not merged yet, can we do a release now with the existing workflow and then create a branch?

msau42 avatar Oct 06 '22 01:10 msau42

@msau42 yeah, we can do that. I wanted to try releasing from the v1.x branch so that everything would already be set up if we ever need another release there (although that might not happen). I'll proceed with the release from the master branch.

mauriciopoppe avatar Oct 06 '22 02:10 mauriciopoppe

Updates about the ongoing work:

  • The migration effort is complete; it happened in the library-development branch. Unit tests and integration tests run directly on the host.
  • We created an alpha tag, https://github.com/kubernetes-csi/csi-proxy/releases/tag/v2.0.0-alpha.0, that can already be used for testing. It's alpha and must not be used in production yet.
  • There is an ongoing effort in the PD CSI Driver to use the alpha tag: https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/pull/1071. We haven't run presubmit tests with it properly yet (waiting on test-infra fixes).

mauriciopoppe avatar Dec 14 '22 22:12 mauriciopoppe

Following up on @mauriciopoppe 's points:

  • We are drafting a release blog post for CSI Proxy v2. Any feedback would be greatly appreciated.

alexander-ding avatar Dec 15 '22 20:12 alexander-ding

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 15 '23 21:03 k8s-triage-robot

/remove-lifecycle stale
/lifecycle frozen

mauriciopoppe avatar Apr 07 '23 19:04 mauriciopoppe

For anyone interested, there is no reason to wait for v2 to use host process...

This is all the yaml you need to get it working.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: csi-proxy
  name: csi-proxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: csi-proxy
  template:
    metadata:
      labels:
        k8s-app: csi-proxy
    spec:
      nodeSelector:
        "kubernetes.io/os": windows
      securityContext:
        windowsOptions:
          hostProcess: true
          runAsUserName: "NT AUTHORITY\\SYSTEM"
      hostNetwork: true
      containers:
        - name: csi-proxy
          image: ghcr.io/kubernetes-sigs/sig-windows/csi-proxy:v1.1.2

davhdavh avatar Jul 06 '23 07:07 davhdavh

Hi there! Thanks for your interest in the project :)

It's been a bit since I've worked on this, but I'm fairly certain this shouldn't work, or at least this doesn't do what we are intending to do.

The CSI Proxy v1 container is a proxy server that relays client API calls to a privileged binary running directly on the Windows machine (outside of Kubernetes). The point of v2 is to completely eliminate the need for the separate privileged binary. It is true that HostProcess containers (HPCs) are available already, but just running the v1 proxy server container as an HPC doesn't actually eliminate the need for the separate binary. Also, because HPCs don't support named pipes or unix sockets, the proxy server would likely fail to connect to the binary at all.
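
For context, under the v1 client/server model described above, a Windows CSI node plugin typically reaches the host-side binary through named pipes mounted into its container. The fragment below is a minimal sketch of that wiring, not a manifest from this thread: the container name and image are placeholders, and the pipe names are assumed to follow the `csi-proxy-<api-group>-<version>` pattern.

```yaml
# Sketch of a Windows CSI node plugin pod spec fragment (v1 client/server model).
# The container name and image are hypothetical placeholders.
# The pipe names are assumptions that follow the csi-proxy-<api-group>-<version> pattern.
containers:
  - name: example-csi-node                            # hypothetical
    image: example.registry/example-csi-driver:dev    # placeholder image
    volumeMounts:
      # Each csi-proxy API group is exposed to the plugin as a separate named pipe.
      - name: csi-proxy-fs-pipe-v1
        mountPath: \\.\pipe\csi-proxy-filesystem-v1
      - name: csi-proxy-smb-pipe-v1
        mountPath: \\.\pipe\csi-proxy-smb-v1
volumes:
  # hostPath volumes hand the host's named pipes to the unprivileged container.
  - name: csi-proxy-fs-pipe-v1
    hostPath:
      path: \\.\pipe\csi-proxy-filesystem-v1
  - name: csi-proxy-smb-pipe-v1
    hostPath:
      path: \\.\pipe\csi-proxy-smb-v1
```

The privileged csi-proxy process on the host serves these pipes, which is why something has to run it on every Windows node, whether as a Windows service or as a HostProcess pod.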

alexander-ding avatar Jul 06 '23 12:07 alexander-ding

ghcr.io/kubernetes-sigs/sig-windows/csi-proxy:v1.1.2 includes the required binary, hostProcess: true gives it privilege.

Not sure why the HPC would need to support pipes or sockets specifically; it just runs like a normal process on the host and thus has native support.

I'm fairly certain this shouldn't work

It works just fine, with SMB at least.

The point of v2 is to completely eliminate the need for the separate privileged binary

Sure, a totally valid goal, but as it isn't available today, I gave a solution that works with what is available today.

davhdavh avatar Jul 09 '23 11:07 davhdavh

What you propose was already implemented by the Windows team a while ago in https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/hostprocess/csi-proxy/README.md. While that runs only CSI Proxy as a HostProcess Pod, our goal is to leverage the HostProcess containers feature to improve other parts of the entire solution (e.g. maintainability, parity between Linux & Windows implementations).

So while you're using HPC, your CSI Driver deployment still differs between Linux and Windows; this is the problem we'd like to solve.
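
As a rough illustration of that goal, the sketch below shows what a Windows CSI node driver DaemonSet could look like when the driver itself runs as a HostProcess container. It is an assumption-laden example rather than a manifest from any real driver: the names, labels, and image are placeholders, and a real deployment would also need the driver's own arguments and the usual registrar sidecars.

```yaml
# Hypothetical sketch: the CSI node driver itself runs as a HostProcess container.
# All names, labels and the image are placeholders, not taken from a real driver.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-csi-node-win            # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: example-csi-node-win
  template:
    metadata:
      labels:
        app: example-csi-node-win
    spec:
      nodeSelector:
        "kubernetes.io/os": windows
      securityContext:
        windowsOptions:
          hostProcess: true             # the driver process runs directly on the host
          runAsUserName: "NT AUTHORITY\\SYSTEM"
      hostNetwork: true                 # required for HostProcess pods
      containers:
        - name: csi-driver
          image: example.registry/example-csi-driver:dev   # placeholder image
          # No csi-proxy named-pipe volumes here: in this model the driver would
          # perform host operations in-process instead of calling a separate binary.
```

Compared with the csi-proxy DaemonSet earlier in the thread, the privileged workload here is the driver rather than the proxy, which is what would bring the Windows deployment closer to the Linux one.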

mauriciopoppe avatar Jul 10 '23 16:07 mauriciopoppe

Yeah, you're totally right. For some reason I thought you were suggesting running the node plugins/registrars as HPCs, instead of running the privileged binary as an HPC. My bad on this.

alexander-ding avatar Jul 11 '23 05:07 alexander-ding