image-reflector-controller
Image repository port pruned from reflection request after 1000+ images
Hello all,
I'm experiencing an issue where an ImageRepository fails to find the latest tags when there are more than 1000 image tags in the repo. We run our Docker repository over HTTPS on port 5000 (not 443/80). What seems to happen is that when the image reflector requests the latest tags with the n=1000 query parameter, the request does not contain the port.
All image repositories with fewer than 1000 tags reconcile without error, so I presume their requests still include the port.
We are running Nexus3 OSS repo manager (v 3.22) in case it makes a difference.
Image repository:
---
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageRepository
metadata:
  name: not-working-image-repository
  namespace: flux-system
spec:
  image: <repo>:5000/<not-working-image>
  interval: 1m0s
  secretRef:
    name: docker-registry-credentials-image-automator
Note that working image repos seem to differ in ONLY the image name.
Error:
{
"level": "error",
"ts": "2023-02-16T14:30:05.582Z",
"msg": "Reconciler error",
"controller": "imagerepository",
"controllerGroup": "image.toolkit.fluxcd.io",
"controllerKind": "ImageRepository",
"ImageRepository": {
"name": "not-working-image-repository",
"namespace": "flux-system"
},
"namespace": "flux-system",
"name": "not-working-image-repository",
"reconcileID": "8c492911-547a-4077-b438-fae9ec0f3b71",
"error": "GET https://<repo>/v2/<not-working-image>/tags/list?last=not-working-tag-b1ee0122-462089352&n=1000: unexpected status code 404 Not Found: \n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n <title>404 - Nexus Repository Manager</title><SNIP.. 404 html..>",
"stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235"
}
Image repository which works:
curl "https://<repo>:5000/v2/<working-image>/tags/list" | jq '.tags | length'
391
Image repository which does not work:
curl "https://<repo>:5000/v2/<not-working-image>/tags/list" | jq '.tags | length'
1039
Great work on Flux! We use it every day at work and it has been pretty solid for the last year!
Going through our code, this is likely an upstream bug, e.g. in github.com/google/go-containerregistry, as we do little more than call https://github.com/fluxcd/image-reflector-controller/blob/ff67fcd0b14a5b89d98fe9a6ba5e88542a74d467/controllers/imagerepository_controller.go#L358
While I do think this should be supported (and fixed), I think it is worth mentioning that solutions like GHCR do not allow querying over 1K tags in general.
Thinking more about this, have you checked whether Nexus3 OSS is by any chance providing a wrong "next" URL in the JSON output and/or response headers when you limit to 1000? I remember issues with Helm repository indexes with such solutions (Artifactory IIRC) which used to advertise (or still advertise) "wrong" URLs when a port was configured.
This is a good point. We are running Nexus3 behind a reverse proxy, which handles the port 5000 bits. When I limit the number of tags returned to 100 or so, I can see the Link header being sent with the repo URL minus the port.
< Link: <https://<repo>/v2/<not-working-image>/tags/list?n=100&last=<last-tag>>; rel="next"
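To illustrate why a header like this would lose the port, here is a minimal Python sketch of how a client might follow a paginated tag listing. It is not the controller's or go-containerregistry's actual code. The hostname registry.example.com, the image name, and the helper name next_page_url are assumptions made up for the example. The point is only that an absolute URL in the Link header replaces the host and port of the original request, while a relative one would keep them.

```python
import re
from urllib.parse import urljoin, urlsplit

# Matches the target of a rel="next" entry in a Link header (RFC 8288 style).
LINK_NEXT = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_page_url(request_url, link_header):
    """Return the URL of the next page, resolved against the request URL.

    Hypothetical helper for illustration. Returns None when the header
    carries no rel="next" entry.
    """
    match = LINK_NEXT.search(link_header)
    if match is None:
        return None
    # Standard URL resolution: an absolute target wins outright,
    # a relative target inherits scheme, host and port from the request.
    return urljoin(request_url, match.group(1))


# Hypothetical registry reachable only on port 5000.
request_url = "https://registry.example.com:5000/v2/app/tags/list?n=100"

# Absolute Link target without the port, as a proxy-unaware backend might send.
absolute = '<https://registry.example.com/v2/app/tags/list?n=100&last=tag-100>; rel="next"'
# Relative Link target, which leaves host and port to the client.
relative = '</v2/app/tags/list?n=100&last=tag-100>; rel="next"'

followed_absolute = next_page_url(request_url, absolute)
followed_relative = next_page_url(request_url, relative)

print(urlsplit(followed_absolute).netloc)  # host only, port is gone
print(urlsplit(followed_relative).netloc)  # host and port preserved
```

If this is what is happening here, the second request goes to port 443 instead of 5000, which would match the 404 from Nexus in the error above. Repositories with fewer tags than the page size never receive a rel="next" link, so they would never hit the problem.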