source controller needs proxy to access github.com when running inside corporate network

Open nab-gha opened this issue 3 years ago • 24 comments

To get the source controller to access a github.com repository from within my client's corporate network, I needed to add the following environment variables to the source-controller deployment.

    - name: HTTPS_PROXY
      value: http://http.proxy.abc.com:8000
    - name: NO_PROXY
      value: 10.0.0.0/8

Maybe gotk install could be enhanced to generate this?

nab-gha avatar Sep 04 '20 09:09 nab-gha

You could create a kustomize patch after running gotk install --export and add those values to all controllers.
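
A minimal sketch of such a patch, not taken from this thread. It assumes the exported manifests were saved as gotk-components.yaml and that every controller container is named manager; both are assumptions to check against your own export. The proxy values are the ones from the original report.

```yaml
# kustomization.yaml (sketch; file name and container name are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml   # assumed output of `gotk install --export`
patches:
  # No name in the target, so the patch applies to every Deployment.
  - target:
      kind: Deployment
    patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all          # placeholder; the target selector decides what is patched
      spec:
        template:
          spec:
            containers:
              - name: manager   # assumed container name
                env:
                  - name: HTTPS_PROXY
                    value: http://http.proxy.abc.com:8000
                  - name: NO_PROXY
                    value: 10.0.0.0/8
```

Because the patch lives next to the exported manifests, it is reapplied whenever the manifests are regenerated, instead of being edited into the live Deployment by hand.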

stefanprodan avatar Sep 04 '20 09:09 stefanprodan

Another way to make this easier: https://github.com/fluxcd/toolkit/discussions/234

In which case --http-proxy could be an option to gotk create source git/helm

seaneagan avatar Sep 14 '20 20:09 seaneagan

Can https://github.com/fluxcd/flux/issues/2179 be re-implemented for Flux v2? Just setting http_proxy won't cut it when you want to check out over SSH (to avoid PATs and use deploy keys, which are tied to SSH).

For flux v1 I installed it with values like:

ssh:
  known_hosts: "[ssh.github.com]:443 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ=="
  config: |
    Host github.com
    ProxyCommand socat STDIO PROXY:proxy.fqdn.com:%h:%p,proxyport=8080
    User git
    Hostname ssh.github.com
    Port 443
    IdentityFile /etc/fluxd/ssh/identity

and I need something similar in Flux v2.

davidkarlsen avatar Jan 07 '21 08:01 davidkarlsen

We're also blocked from migrating to v2 because of this.

And yes, http_proxy won't cut it, as we need to pass the same ProxyCommand @davidkarlsen is referring to.

dvianello avatar Jan 27 '21 18:01 dvianello

Ability to configure HTTP/S and/or SSH client details is related to #93

hiddeco avatar Jan 28 '21 15:01 hiddeco

ProxyCommand socat STDIO PROXY:proxy.fqdn.com:%h:%p,proxyport=8080

It might be possible to support this by modifying the source-controller /etc/hosts to point to a socat side-car or Service.

stealthybox avatar Jan 28 '21 16:01 stealthybox

ProxyCommand socat STDIO PROXY:proxy.fqdn.com:%h:%p,proxyport=8080

It might be possible to support this by modifying the source-controller /etc/hosts to point to a socat side-car or Service.

I like this idea, nice and separated in a way. So I tried this:

  • run a container [1]
  • add a svc for it [2]
  • add hostentry for github.com [3] in src ctrl

but then src ctrl says:

{"level":"error","ts":"2021-01-28T21:37:58.369Z","logger":"controller.gitrepository","msg":"Reconciler error","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"test","namespace":"flux-system","error":"unable to clone 'ssh://[email protected]/acme/test-fluxv2', error: ssh: handshake failed: EOF"}
k get gitrepository test
NAME   URL                                                    READY   STATUS                                                                                                      AGE
test   ssh://[email protected]:443/acme/test-fluxv2.git   False   unable to clone 'ssh://[email protected]:443/acme/test-fluxv2.git', error: ssh: handshake failed: EOF   44m

[1]

args:
  - tcp-listen:8080,fork,reuseaddr
  - proxy:proxy.acme.org:github.com:443,proxyport=3128
command:
- socat

[2]

k get svc gh-proxy
NAME       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
gh-proxy   ClusterIP   10.201.37.75   <none>        22/TCP,443/TCP   60m

[3]

- hostnames:
  - github.com
  ip: 10.201.37.75

I think maybe the next hurdle is https://docs.github.com/en/github/authenticating-to-github/using-ssh-over-the-https-port

davidkarlsen avatar Jan 28 '21 21:01 davidkarlsen

Update: I had missed an essential detail. The host is named ssh.github.com when you connect on port 443. With that fixed, we have a winner 🎉

k describe gitrepository test|tail -1
  Normal  info    75s                  source-controller  Fetched revision: main/6cb9fb47c005e5941233e7f69cb720c28455a89c

sidecar cmd:

k exec  svc/gh-proxy -- ps xa|grep soca
      1 ?        Ss     0:00 socat tcp-listen:8080,fork,reuseaddr proxy:proxy.acme.com:ssh.github.com:443,proxyport=8080

git repo url

url: ssh://[email protected]/acme/test-fluxv2.git

and the host entry for github.com
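
Put together, the pieces above could look roughly like the following. This is a sketch under assumptions rather than the exact manifests used here: the alpine/socat image, the app: gh-proxy label, the flux-system namespace, and the mapping of Service ports 22 and 443 onto socat's port 8080 are all guesses that fit the output shown above.

```yaml
# Sketch only: image, labels, namespace, and port mapping are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gh-proxy
  namespace: flux-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gh-proxy
  template:
    metadata:
      labels:
        app: gh-proxy
    spec:
      containers:
        - name: socat
          image: alpine/socat   # assumed image; any image that ships socat should do
          command: ["socat"]
          args:
            # Listen locally and tunnel through the corporate HTTP proxy via CONNECT.
            - tcp-listen:8080,fork,reuseaddr
            - proxy:proxy.acme.com:ssh.github.com:443,proxyport=8080
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: gh-proxy
  namespace: flux-system
spec:
  selector:
    app: gh-proxy
  ports:
    - name: ssh
      port: 22
      targetPort: 8080   # assumed mapping onto the socat listener
    - name: https
      port: 443
      targetPort: 8080
```

The host entry from [3] then points github.com at this Service's ClusterIP, so the source-controller connects to socat while still using a github.com URL.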

davidkarlsen avatar Jan 28 '21 23:01 davidkarlsen

Thank you for proving this works at the pod infra level.

I like this solution.

Seems like it would be odd/complex to add this to GitRepository.spec as well as all the other Sources...

It's very coupled to the network details of the Cluster's environment -- very much an admin concern

Multiple tenants could transparently be using this proxy, and it would still make sense. (and be safe?)

User-facing Source configs between airgapped and cloud clusters can be the same or similar without being polluted with proxy details.

stealthybox avatar Jan 29 '21 01:01 stealthybox

Using an actual side-car removes a hop to the Service Pod and reduces config, because you don't need the svc IP, it's just localhost. This could stop unintentional cross-zone or inter-rack traffic.

The downside is that we don't run all of the flux controllers in the same process or Pod, so any controllers that need to talk directly to GitHub need their own socat side-car, whereas the Service Pod is reusable, even beyond Flux (ex: your Tekton jobs write artifacts or commit back to the repo -- they can use the proxy too)
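
A rough sketch of the side-car variant, written as a strategic merge patch for the source-controller Deployment. The image name and the choice of port 8080 are assumptions. Listening on 8080 avoids binding a privileged port in a non-root pod, but it means the GitRepository URL would have to name that port explicitly.

```yaml
# Sketch only: image and listener port are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: source-controller
  namespace: flux-system
spec:
  template:
    spec:
      hostAliases:
        # Resolve github.com to the side-car inside the same pod.
        - ip: "127.0.0.1"
          hostnames:
            - github.com
      containers:
        - name: socat
          image: alpine/socat   # assumed image
          command: ["socat"]
          args:
            - tcp-listen:8080,fork,reuseaddr
            - proxy:proxy.acme.com:ssh.github.com:443,proxyport=8080
```

Note that this alias redirects all traffic to github.com from that pod, including HTTPS, so it only suits setups where the controller reaches GitHub over SSH.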

stealthybox avatar Jan 29 '21 02:01 stealthybox

Seems like it would be odd/complex to add this to GitRepository.spec as well as all the other Sources...

Yes, I don't think it belongs in the spec, since it is controller-scoped (we redefine the host), so two different GitRepository CRs will get the same treatment, given the same target host.

It's very coupled to the network details of the Cluster's environment -- very much an admin concern

Indeed, ideally it belongs in something like Istio (or even more ideally it's dealt with at infra networking level outside of the cluster handling it transparently - sadly we're not there/I can't influence that decision as of now). Actually there are recent merges into Envoy to support this, but it's going to take time to land in Istio, if it ever does.

Multiple tenants could transparently be using this proxy, and it would still make sense. (and be safe?)

It should be safe, it's an opt-in new endpoint (opt-in by defining /etc/hosts entries), and all we are doing is providing an alternative path to an existing public endpoint on the internet. Additionally one can apply NetworkPolicies if wanted.
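
As an illustration of the NetworkPolicy option, here is a sketch that only lets source-controller pods reach the proxy. The labels app: gh-proxy and app: source-controller, the namespace, and port 8080 are assumptions, and the policy only has an effect if the cluster's network plugin enforces NetworkPolicies.

```yaml
# Sketch only: labels, namespace, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: gh-proxy-ingress
  namespace: flux-system
spec:
  podSelector:
    matchLabels:
      app: gh-proxy            # the socat proxy pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: source-controller   # only this controller may connect
      ports:
        - protocol: TCP
          port: 8080           # the port socat listens on inside the pod
```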

User-facing Source configs between airgapped and cloud clusters can be the same or similar without being polluted with proxy details.

Exactly. The last step is how to define hostAliases for the source-controller without hacking on the deployment after install and potentially losing the config on upgrades. One would then follow the same pattern for other sources, like HelmRepos; however, we can proxy these as an alternative.

davidkarlsen avatar Jan 29 '21 11:01 davidkarlsen

Using an actual side-car removes a hop to the Service Pod and reduces config, because you don't need the svc IP, it's just localhost. This could stop unintentional cross-zone or inter-rack traffic.

Isn't that fixable by applying NetworkPolicies?

The downside is that we don't run all of the flux controllers in the same process or Pod, so any controllers that need to talk directly to GitHub need their own socat side-car, whereas the Service Pod is reusable, even beyond Flux (ex: your Tekton jobs write artifacts or commit back to the repo -- they can use the proxy too)

How/for what reasons do the other controllers talk to github? (API?)

davidkarlsen avatar Jan 29 '21 11:01 davidkarlsen

How/for what reasons do the other controllers talk to github?

The image automation controller writes back to Git, see https://toolkit.fluxcd.io/guides/image-update/#configure-image-updates

stefanprodan avatar Jan 29 '21 11:01 stefanprodan

so... how do we proceed from here?

davidkarlsen avatar Feb 03 '21 23:02 davidkarlsen

Using Azure Devops repos also fails when behind a corporate proxy using the libgit2 implementation. Is the expectation that the libgit2 implementation would utilise an HTTPS_PROXY env value or should the source-controller set the proxy details itself somehow? Looking at the libgit2 code there is an option to specify the proxy details when fetching (https://github.com/libgit2/git2go/blob/master/remote.go#L143) however the source-controller doesn't set those options by the looks of it. Is this a feature that should be added to the libgit2 implementation in the source-controller and proxy details added to the GitRepository.spec?

racdev avatar Mar 16 '21 14:03 racdev

Any update on this?

takirala avatar Aug 06 '21 15:08 takirala

I ended up forking and adding the required code to specify a proxy to use when using libgit2 via a new proxy field added to the GitRepository resource schema. Works for us behind a corporate proxy. If I could use GitHub from work I would raise a PR but unfortunately I can't.

racdev avatar Aug 10 '21 08:08 racdev

To get the source controller to access a github.com repository from within my client's corporate network, I needed to add the following environment variables to the source-controller deployment.

- name: HTTPS_PROXY
  value: http://http.proxy.abc.com:8000
- name: NO_PROXY
  value: 10.0.0.0/8

Maybe gotk install could be enhanced to generate this?

Adding the proxies as environment variables to the source-controller Deployment solved the problem for me.

(The cluster sits behind a corporate proxy; I got a timeout, "context deadline exceeded", when reconciling the repo with the cluster.) Thank you!

ThemeIT avatar Jan 10 '22 08:01 ThemeIT

To get the source controller to access a github.com repository from within my client's corporate network, I needed to add the following environment variables to the source-controller deployment.

- name: HTTPS_PROXY
  value: http://http.proxy.abc.com:8000
- name: NO_PROXY
  value: 10.0.0.0/8

Maybe gotk install could be enhanced to generate this?

Adding the proxies as environment variables to the source-controller Deployment solved the problem for me.

(The cluster sits behind a corporate proxy; I got a timeout, "context deadline exceeded", when reconciling the repo with the cluster.) Thank you!

Hi ThemeIT, did you add the proxy entries to the source-controller deployment dynamically after it was running, or did you add them to the deployment.yaml file and then deploy Flux?

cheruvu1 avatar Jan 25 '22 19:01 cheruvu1

Closing this as PR#524 was merged and added proxy options from version v0.21.0 onwards.

Happy to reopen in case there is something still missing.

pjbgf avatar Mar 29 '22 12:03 pjbgf

@pjbgf PR#524 appears to only work for git access over HTTPS and not for proxying SSH with ProxyCommand/socat.

Am I misreading, or would that be a separate issue?

alphabet5 avatar Apr 19 '22 16:04 alphabet5

@pjbgf the setup does not suffice for using a socat side-car in a source-controller pod, where source-controller does not run as root and the filesystem is read-only. A co-located pod running a socat container requires source-controller to know its IP (to alias github.com), which makes that relationship hard to maintain reliably. How much effort would it be to support .ssh/config and ProxyCommand?

sofib-form3 avatar Jun 24 '22 08:06 sofib-form3

Reopening this due to the lack of documentation for proxying SSH connections.

@sofib-form3 @alphabet5 have you tried exporting the socks5 information via ALL_PROXY as in: ALL_PROXY=socks5://127.0.0.1:1080? In my initial tests, this seems to be supported by the Managed Transport when using libgit2 and I would assume the same to be the case for go-git.
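
For illustration, a sketch of how that variable could be set through a strategic merge patch on the source-controller Deployment. The container name manager and the namespace are assumptions, and the SOCKS5 address is the example value from the sentence above, not a real endpoint.

```yaml
# Sketch only: container name and namespace are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: source-controller
  namespace: flux-system
spec:
  template:
    spec:
      containers:
        - name: manager          # assumed container name
          env:
            - name: ALL_PROXY
              value: socks5://127.0.0.1:1080   # example address; replace with your SOCKS5 proxy
```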

@sofib-form3 on the socat perspective, I will give it a try and let you know. But in theory, you could override those settings (readonly fs and running as non root) in the source-controller YAML with your kustomization.yaml.

pjbgf avatar Jun 27 '22 18:06 pjbgf

Hi, any news on this issue? We also need SSH ProxyCommand or jump host functionality. This use case seems to be pretty common in big organizations.

DudyZ avatar Aug 07 '22 16:08 DudyZ

@sofib-form3 the socat target IP can be set via hostAliases, without requiring the controller pod to lose the readonly fs with:

      - op: add
        path: /spec/template/spec/hostAliases
        value:
          - ip: "127.0.0.1"
            hostnames:
            - "proxy.server"

I think this could be combined with var substitution or similar functionality from helm, in case your proxy service is remote and you need to gather that information querying some Kubernetes objects.
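
For context, a sketch of where the JSON patch above could live in a kustomization.yaml. The resource file name gotk-components.yaml is an assumption; the patch body is the snippet from this comment unchanged.

```yaml
# kustomization.yaml (sketch; the resource file name is an assumption)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
patches:
  - target:
      kind: Deployment
      name: source-controller
    patch: |
      - op: add
        path: /spec/template/spec/hostAliases
        value:
          - ip: "127.0.0.1"
            hostnames:
              - "proxy.server"
```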

@DudyZ Further to my comment above, proxyCommand is not supported in Flux. Instead, SSH proxy support is done via ALL_PROXY which has been documented as part of https://github.com/fluxcd/website/pull/1083.

I will be closing this as we have the behaviour documented, and outstanding features (e.g. https://github.com/fluxcd/flux2/issues/3007) being tracked on their own independent issues.

pjbgf avatar Aug 25 '22 15:08 pjbgf