TFS on-prem: Flux authentication error after upgrading

Open llanos1205 opened this issue 1 year ago • 6 comments

Describe the bug

We use an on-premises Azure DevOps instance and an EKS cluster. We installed Flux when the installer was at version 0.28, and it worked perfectly. Now we need to use OCI repositories, which requires upgrading Flux. After the upgrade, none of the Git sources work anymore; all of them fail with a 401 error.

We have this secret for the git source

apiVersion: v1
kind: Secret
metadata:
  name: flux-system
  namespace: flux-system
data:
  password: xxx
  username: xxxx
type: Opaque

the git source

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
 
  labels:
    kustomize.toolkit.fluxcd.io/name: flux-system
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: flux-system
  namespace: flux-system
spec:
  gitImplementation: libgit2
  interval: 1m0s
  ref:
    branch: main
  secretRef:
    name: flux-system
  timeout: 1m0s
  url: >-
    http://ourprivatedevopsurl.com:8081/tfs/path/to/repo/_git/repo

All I get is

failed to checkout and determine revision: unable to fetch-connect to remote 'http://ourprivatedevopsurl.com:8081/tfs/path/to/repo/_git/repo': unhandled HTTP error 401 Unauthorized

I see the newer source-controller image is ghcr.io/fluxcd/source-controller:v0.28.0, and the last working image for us (the one installed when we first installed Flux quite a while ago) is ghcr.io/fluxcd/source-controller:v0.24.4

Did a breaking change happen between those versions?

Steps to reproduce

1. Install Flux with client 0.28.0
2. Upgrade gotk-components with kustomize build https://github.com/fluxcd/flux2/manifests/install?ref=main | kubectl apply -f-

Expected behavior

The update should only add the OCI provider without breaking anything else; all Git sources should still work.

Screenshots and recordings

No response

OS / Distro

Ubuntu 20.04

Flux version

v0.33.0

Flux check

► checking prerequisites
✔ Kubernetes 1.23.3 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.23.1
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.24.2
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.20.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.27.1
✔ notification-controller: deployment ready
► fluxcd/notification-controller:v0.25.2
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.28.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta1
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1beta2
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta1
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta1
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta1
✔ receivers.notification.toolkit.fluxcd.io/v1beta1
✔ all checks passed

Git provider

On-premises Azure DevOps

Container Registry provider

No response

Additional context

No response

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

llanos1205 avatar Aug 31 '22 20:08 llanos1205

@llanos1205 I have been unable to reproduce this with Azure DevOps (hosted).

Can you check that your token hasn't been revoked or had its permission changed?

somtochiama avatar Sep 01 '22 00:09 somtochiama

We're using the same user and credentials in 2 other working clusters (both installed with the client at 0.28.0). I know it is not a credentials problem, because when I roll back the changes everything works again (but I lose the new OCI source).

llanos1205 avatar Sep 01 '22 03:09 llanos1205

@llanos1205 I can only reproduce the error message you are getting against DevOps in two ways:

  • Using http instead of https: this might just be a SaaS thing. But can you confirm you have authenticated against the http endpoint of an on-premises DevOps before?
  • The PAT has expired: the default expiration is 30 days if I am not mistaken.

pjbgf avatar Sep 01 '22 10:09 pjbgf

  • I thought the same about https and created an https endpoint, but the behavior is the same.
  • Currently I use the same credentials with 2 other clusters (both still on 0.28.0), and they still work without problems.

llanos1205 avatar Sep 01 '22 13:09 llanos1205

Found the solution, which seems strange to me. Up to this point, all previous clusters and implementations used a plain username and password in the flux-system secret (and the ones on 0.28 are still working). Now, only after upgrading to 0.33.0, the password MUST be a PAT; it no longer works with the original user password.

Should this be in the documentation?

llanos1205 avatar Sep 01 '22 15:09 llanos1205

@llanos1205 using a user password is not what we have in the Azure DevOps docs:

If you wish to use Git over HTTPS, then generate a personal access token and supply it as the password

stefanprodan avatar Sep 01 '22 15:09 stefanprodan
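
For readers landing on this thread, the resolution above can be sketched as a revised version of the secret from the original report. This is only an illustration under stated assumptions: the values are placeholders, `stringData` is used instead of `data` so they can be written without base64 encoding, and the thread does not say which scopes the personal access token needs.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: flux-system
  namespace: flux-system
type: Opaque
stringData:
  # Account name, as in the original secret (placeholder value).
  username: xxxx
  # Personal access token generated in Azure DevOps, supplied in place of
  # the account password (placeholder value, not a real token).
  password: <personal-access-token>
```

The GitRepository from the report would keep referencing this secret through `secretRef.name: flux-system`, so only the secret's `password` value needs to change.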