cluster-api

Need proxy support in air-gapped environment

Open hanlins opened this issue 4 years ago • 28 comments

User Story

As an operator, I would like to add proxy configuration to CAPI providers for air-gapped environments.

Detailed Description

In an air-gapped environment, Cluster API provider pods cannot talk to the infrastructure provider directly. In this scenario, a proxy server is typically deployed to enable connectivity and audit the traffic that crosses the firewall. It would be ideal to have a mechanism to plumb the proxy server configuration into the Cluster API provider pods, so that they can communicate with the infrastructure.

Anything else you would like to add: One approach I can think of is to have something like this:

HTTP_PROXY=xxx clusterctl init

The implementation should be similar to https://github.com/kubernetes/kubernetes/pull/84559.


/kind feature

hanlins avatar May 07 '21 21:05 hanlins

For such a scenario we would also want the ability to configure https_proxy and no_proxy.

We'd need to flesh out the details here: define and agree on what an air-gapped environment is and which scenarios and behavior exactly we want to support end to end. For example, would this be a one-shot thing, or would we want CAPI components to watch a "proxy config" and react to changes there? I think this will probably deserve a proposal covering all the details.

enxebre avatar May 10 '21 10:05 enxebre

@hanlins I'm starting to think about this use case, and my main concern is that adding proxy settings can't be achieved by simple variable substitution, which is the only templating solution supported in clusterctl as of today. The only two options I can see here are:

  • rely on a different templating solution injected into the clusterctl library
  • use mutating webhooks

Also, the ongoing work on ManagedCluster might provide some help here, but whether it can is still TBD. I'm happy to chat about this.

fabriziopandini avatar May 10 '21 13:05 fabriziopandini

/milestone Next

vincepri avatar Jul 06 '21 17:07 vincepri

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 04 '21 18:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Nov 03 '21 18:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Dec 03 '21 19:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:


/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 03 '21 19:12 k8s-ci-robot

/reopen

We just encountered a customer that needs this, too.

It could be done through templating in cmd/clusterctl/client/repository.NewComponents with an option that contains the values for https_proxy, http_proxy, and no_proxy.

joejulian avatar Feb 01 '22 21:02 joejulian

@joejulian: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen


k8s-ci-robot avatar Feb 01 '22 21:02 k8s-ci-robot

/reopen

dlipovetsky avatar Feb 01 '22 22:02 dlipovetsky

@dlipovetsky: Reopened this issue.

In response to this:

/reopen


k8s-ci-robot avatar Feb 01 '22 22:02 k8s-ci-robot

/lifecycle frozen

dlipovetsky avatar Feb 02 '22 21:02 dlipovetsky

/assign @ykakarap

Can you please assess whether it would be possible to extend clusterctl to inject HTTP proxy env vars into the provider manifests?

sbueringer avatar Feb 11 '22 18:02 sbueringer

/milestone v1.2

fabriziopandini avatar Feb 11 '22 18:02 fabriziopandini

Hey, I left a message on the #cluster-api Slack channel to no avail :( Is it possible to get involved with the effort here? What are the criteria we're going to use to assess whether this is possible? I'd love to see this feature happen, so please let me know where I can help.

faiq avatar Feb 14 '22 17:02 faiq

Catching up on the issue. Will get back soon. :)

@faiq I will take a look at this and post my findings here.

ykakarap avatar Feb 15 '22 08:02 ykakarap

/triage accepted
/unassign @ykakarap

@joejulian could you share how you fixed this problem, as per https://github.com/kubernetes-sigs/cluster-api/issues/4585#issuecomment-1027310851, so someone can pick up the work in CAPI?

/help

fabriziopandini avatar Oct 03 '22 17:10 fabriziopandini

@fabriziopandini: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/triage accepted /unassign @ykakarap


k8s-ci-robot avatar Oct 03 '22 17:10 k8s-ci-robot

@fabriziopandini we modify the core-components.yaml file with this kustomize overlay:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: NA
spec:
  template:
    spec:
      containers:
        - name: manager
          env:
            - name: HTTP_PROXY
              value: ${HTTP_PROXY:=""}
            - name: HTTPS_PROXY
              value: ${HTTPS_PROXY:=""}
            - name: NO_PROXY
              value: ${NO_PROXY:=""}

faiq avatar Oct 03 '22 17:10 faiq

Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:

  1. Include these env variables in the manifest for the core provider.
  2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
  3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

dlipovetsky avatar Oct 03 '22 18:10 dlipovetsky

@fabriziopandini I don't remember what we did (and I don't work there anymore so I can't go back and check).

joejulian avatar Oct 04 '22 15:10 joejulian

Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:

1. Include these env variables in the manifest for the core provider.

2. Document how to add these env variables by patching the manifest, e.g. with kustomize.

3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

I think it's obvious I support 1. :)

  2. Seems odd that we'd rebuild this entire toolset around templating, but for this one bit we'd require using kustomize.
  3. How would this webhook be installed without the proxy config?

joejulian avatar Oct 04 '22 15:10 joejulian

I agree that adding env vars to the manifest is the simplest way forward. My only concern is that in the past we got push-back for this type of change from folks using GitOps and trying to use the YAML files directly (in fact, https://github.com/kubernetes-sigs/cluster-api/issues/3881 asks to remove all the variables we currently have).

fabriziopandini avatar Oct 05 '22 12:10 fabriziopandini

I've never been a fan of adding the complexity of templating to cluster-api a la ClusterClass, but the community felt the return was worth it. Embracing that change, I'm not sure now where the distinction lies between one form of templating and another. Is there a way to solve this that's more in line with ClusterClass, maybe?

joejulian avatar Oct 05 '22 16:10 joejulian

Q: 1. Include these env variables in the manifest for the core provider.

In an air-gapped environment, Cluster API provider pods cannot talk to the infrastructure provider directly.

Just for my understanding. For which connections do we need the http proxy configuration?

  1. communication from CAPI to infra provider APIs (AWS,Azure,...)
  2. communication from CAPI to workload clusters
  3. both

I'm just a bit confused because the original ask was for the infra provider, but core CAPI is not accessing it. And having it consistently in infra providers would require agreement with infra providers (maybe an addition to the contract)

sbueringer avatar Oct 05 '22 17:10 sbueringer

  4. communication from workload clusters to endpoints (registry, internet, ...)

chrischdi avatar Oct 06 '22 10:10 chrischdi

4. communication from workload clusters to endpoints (registry, internet, ...)

That should probably be from the controllers / management cluster to the registry/internet?

I think the issue is about setting the proxy for CAPI providers/controllers only (based on the issue description).

But based on the title it could be proxy support in general.

sbueringer avatar Oct 06 '22 13:10 sbueringer

I don't think you can add generalized proxy support. There's no standard.

joejulian avatar Oct 06 '22 21:10 joejulian

Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:

  1. Include these env variables in the manifest for the core provider.
  2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
  3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

Agreed; at a minimum we could provide some guidance docs.

/kind documentation

enxebre avatar Jun 30 '23 12:06 enxebre

/priority backlog

fabriziopandini avatar Apr 12 '24 14:04 fabriziopandini