GitRepository source-controller fails to fetch files due to removal of a symlink
When using the FluxCD source-controller to fetch a GitRepository from our GitLab server, one file (in our case, a file targeted by a relative symlink) is missing from the artifact, even though the targeted file was not deleted. I can consistently reproduce the issue with source-controller 1.5.0+. The issue doesn't exist with source-controller 1.4.1.
Steps to reproduce
I have created a minimal reproduction case in this repository: https://gitlab.com/xavier.francois/bug-source-controller-symlink
To reproduce the issue on Flux, you need a repo whose main branch contains a file and a relative symlink pointing to that file. Then create a branch from that commit and add a commit that removes the symlink. If you create a GitRepository targeting the new commit, the targeted file won't appear in the artifact.
---
apiVersion: v1
kind: Namespace
metadata:
  name: test-source-controller
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: bug-source-controller
  namespace: test-source-controller
spec:
  interval: 1m0s
  url: https://gitlab.com/xavier.francois/bug-source-controller-symlink.git
  ref:
    commit: cd64a30a6c63299abeac585f45e8755e0473765d # commit id of delete-sylink branch, without symlink
    # commit: fa03052f8a5573d7215d86444f6128bf63a9dc8e # commit id of main, with symlink
Then, if you run a small script that creates a kind cluster, installs Flux, applies the GitRepository (saved as gitrepo.yaml), and lists the content extracted in the source-controller, you will see that the targeted file is no longer there:
kind delete clusters test-source-controller
kind create cluster --name test-source-controller
kubectl apply -f https://github.com/fluxcd/flux2/releases/download/v2.5.0/install.yaml
kubectl apply -f gitrepo.yaml
kubectl wait --for=condition=Ready gitrepo/bug-source-controller -n test-source-controller --timeout=300s
SOURCE_POD=$(kubectl get pods -n flux-system -l app=source-controller -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n flux-system $SOURCE_POD -- sh -c "cd /tmp && rm -rf commit-extract && mkdir commit-extract"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "cd /tmp/commit-extract && tar -xzf /data/gitrepository/test-source-controller/bug-source-controller/*.tar.gz"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "ls -liah /tmp/commit-extract"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "ls -liah /tmp/commit-extract/target.txt"
If you do the same thing with Flux 2.4.0 (source-controller 1.4.1), the issue doesn't arise:
kind delete clusters test-source-controller
kind create cluster --name test-source-controller
kubectl apply -f https://github.com/fluxcd/flux2/releases/download/v2.4.0/install.yaml
kubectl apply -f gitrepo.yaml
kubectl wait --for=condition=Ready gitrepo/bug-source-controller -n test-source-controller --timeout=300s
SOURCE_POD=$(kubectl get pods -n flux-system -l app=source-controller -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n flux-system $SOURCE_POD -- sh -c "cd /tmp && rm -rf commit-extract && mkdir commit-extract"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "cd /tmp/commit-extract && tar -xzf /data/gitrepository/test-source-controller/bug-source-controller/*.tar.gz"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "ls -liah /tmp/commit-extract"
kubectl exec -n flux-system $SOURCE_POD -- sh -c "ls -liah /tmp/commit-extract/target.txt"
Additional context
The issue could be related to go-git's handling of repositories with certain historical objects or refs that exist on the GitLab server.
My guess is that go-git or go-billy doesn't correctly handle dangling commits involving a relative symlink.
The changes between source-controller 1.4.1 and 1.5.0 that could have an impact are:
- Go-Git was upgraded from 5.12.0 to 5.13.2
- Go-Billy was upgraded from 5.5.0 to 5.6.2
The culprit is likely go-billy; this may be related to a known go-billy issue.
Workaround
Setting the GitRepository sparseCheckout parameter (available with Flux 2.6+) to the folders containing the targeted file works around the issue.
sparseCheckout:
- charts
- kustomize-units
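For context, here is a minimal sketch of where that field sits in a full manifest. The file name, resource name, namespace, URL, and commit placeholder are hypothetical; only the sparseCheckout entries come from the snippet above, and they must match directories that exist in your own repository.

```shell
#!/bin/sh
# Sketch: write a GitRepository manifest that uses the sparseCheckout workaround.
# All names below are placeholders (assumptions), not values from the reproduction repo.
set -eu

MANIFEST=gitrepo-sparse.yaml   # hypothetical output file

cat > "$MANIFEST" <<'EOF'
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-repo                # hypothetical
  namespace: flux-system       # hypothetical
spec:
  interval: 1m0s
  url: https://example.com/org/repo.git   # hypothetical
  ref:
    commit: <commit-sha>       # replace with the commit you want to pin
  # Only these directories are checked out (requires Flux 2.6+).
  sparseCheckout:
    - charts
    - kustomize-units
EOF

echo "wrote $MANIFEST"
```

After replacing the placeholders, the file can be applied with kubectl apply -f gitrepo-sparse.yaml.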
I am now able to reproduce the issue, so I have updated the issue title and content. It's probably related to this go-billy issue: https://github.com/go-git/go-billy/issues/135
The conditions to reproduce the issue are:
- Have a main branch with a target file and a relative symlink pointing to this target file
- Create a new branch from this main branch, and push a commit that removes the symlink
- Create a FluxCD GitRepository that targets the newly created commit; the target file won't be in the artifact
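The repository layout described by these conditions can be sketched locally with the git CLI. This is only an illustration: target.txt matches the file name used in the scripts above, while link.txt, the branch name remove-symlink, and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Sketch: build a local repository matching the three conditions above.
set -eu

REPO_DIR=$(mktemp -d)                    # hypothetical scratch location
cd "$REPO_DIR"

git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is "main"
git config user.email "repro@example.com"
git config user.name "repro"

# Condition 1: main contains a target file and a relative symlink to it.
echo "content" > target.txt
ln -s target.txt link.txt                # link.txt is a hypothetical name
git add target.txt link.txt
git commit -q -m "add target and relative symlink"

# Condition 2: a branch created from main removes only the symlink.
git checkout -q -b remove-symlink        # hypothetical branch name
git rm -q link.txt
git commit -q -m "remove symlink"

# Condition 3: this commit id is what spec.ref.commit would point at.
COMMIT=$(git rev-parse HEAD)
echo "commit without symlink: $COMMIT"

# The tree of that commit still lists the target file.
git ls-tree --name-only "$COMMIT"
```

Pushing both branches to a Git server and pointing a GitRepository at the printed commit id should give the same setup as the reproduction repository.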
From my understanding, when a FluxCD GitRepository references a commit, it first checks out the main branch and then checks out the specific commit. Because of the go-billy bug, which seems to remove the target instead of the symlink in this specific case, the target is absent from the resulting checkout. The issue doesn't arise when the GitRepository targets a branch, because the branch is checked out directly, without first checking out main and then switching as in the commit case.
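To show the result I would expect from that sequence, here is a sketch that replays "check out main, then switch to the commit" with the git CLI on a throwaway local repository. It does not call go-git or go-billy and does not reproduce the bug; the two-step order is my understanding of the controller, not confirmed behavior. link.txt, the branch name, and the directories are hypothetical.

```shell
#!/bin/sh
# Sketch: replay "main first, then the commit" with the git CLI as a reference.
set -eu

WORK=$(mktemp -d)                        # hypothetical scratch directory

# Build a small origin repository: target + relative symlink on main,
# then a branch whose only change is the removal of the symlink.
git init -q "$WORK/origin"
cd "$WORK/origin"
git symbolic-ref HEAD refs/heads/main
git config user.email "repro@example.com"
git config user.name "repro"
echo "content" > target.txt
ln -s target.txt link.txt                # link.txt is a hypothetical name
git add target.txt link.txt
git commit -q -m "target and symlink"
git checkout -q -b remove-symlink        # hypothetical branch name
git rm -q link.txt
git commit -q -m "remove symlink"
COMMIT=$(git rev-parse HEAD)

# Step 1: clone with main checked out, so the symlink exists in the work tree.
git clone -q --branch main "$WORK/origin" "$WORK/clone"
cd "$WORK/clone"

# Step 2: switch the same work tree to the commit without the symlink.
git checkout -q --detach "$COMMIT"

# Expected: the symlink is gone and the target file is still there.
if [ -f target.txt ] && [ ! -L link.txt ]; then
  echo "expected state: target.txt present, link.txt removed"
else
  echo "unexpected state: target.txt missing or link.txt still present"
fi
```

With source-controller 1.5.0+, the artifact corresponds to the second branch of that check: the target file is missing.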