bug: operator pod restarting due to controller-runtime.source errors
Describe the issue
I'm running into an issue building a fluent-operator image using the Makefile, or using the go build ... invocation and packaging the resulting manager binary into an image. Either approach results in pod restarts when deploying the operator.
The errors look like the following:
kubectl logs -n fluent fluent-operator-55dd7bc945-kthgm
Example MultilineParser error using docker build image
2024-05-28T13:21:44Z ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "MultilineParser.fluentbit.fluent.io", "error": "no matches for kind \"MultilineParser\" in version \"fluentbit.fluent.io/v1alpha2\""}
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/source/source.go:143
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:154
k8s.io/apimachinery/pkg/util/wait.waitForWithContext
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:207
k8s.io/apimachinery/pkg/util/wait.poll
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:260
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:200
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/source/source.go:136
Example ClusterMultilineParser from go build... image
2024-05-28T14:22:17Z INFO Starting workers {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "worker count": 1}
2024-05-28T14:22:17Z ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "ClusterMultilineParser.fluentbit.fluent.io", "error": "no matches for kind \"ClusterMultilineParser\" in version \"fluentbit.fluent.io/v1alpha2\""}
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
/var/cache/melange/gomodcache/sigs.k8s.io/[email protected]/pkg/source/source.go:143
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
/var/cache/melange/gomodcache/k8s.io/[email protected]/pkg/util/wait/wait.go:154
k8s.io/apimachinery/pkg/util/wait.poll
/var/cache/melange/gomodcache/k8s.io/[email protected]/pkg/util/wait/poll.go:245
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
/var/cache/melange/gomodcache/k8s.io/[email protected]/pkg/util/wait/poll.go:200
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1
/var/cache/melange/gomodcache/sigs.k8s.io/[email protected]/pkg/source/source.go:136
2024-05-28T14:22:18Z INFO Starting workers {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "worker count": 1}
2024-05-28T14:22:18Z INFO Starting workers {"controller": "collector", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "Collector", "worker count": 1}
2024-05-28T14:22:18Z INFO Starting workers {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "worker count": 1}
Both build methods produce both the MultilineParser and ClusterMultilineParser errors, and the pod restarts. The operator does, however, create fluent-bit and fluentd pods given the appropriate manifest.
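To see at a glance which kinds the operator reports as missing, the log can be filtered down to the "kind" field of the controller-runtime.source errors. The helper below is only a sketch: missing_kinds is a hypothetical name, and it assumes the log lines keep the shape shown above.

```shell
# Hypothetical helper (not part of fluent-operator): reads operator log
# lines on stdin and prints each distinct kind that controller-runtime.source
# reports as missing. Assumes the log format shown in this issue.
missing_kinds() {
  grep 'controller-runtime.source' \
    | sed -n 's/.*{"kind": "\([^"]*\)".*/\1/p' \
    | sort -u
}
```

It could be fed from the pod log, for example by piping kubectl logs -n fluent fluent-operator-55dd7bc945-kthgm into missing_kinds, and the result compared against the output of kubectl get crd to check which CRDs are actually installed in the cluster.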
To Reproduce
build the image
cat VERSION
v2.8.0
docker build --platform linux/arm64 -f cmd/fluent-manager/Dockerfile . -t k3d-k3d.localhost:5005/fluent-operator:v2.8.0
docker push k3d-k3d.localhost:5005/fluent-operator:v2.8.0
helm install
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-operator --create-namespace -n fluent fluent/fluent-operator --set operator.container.repository=k3d-k3d.localhost:5005/fluent-operator --set operator.container.tag=v2.8.0
Expected behavior
The pod should not be restarting and throwing errors about CRDs.
Your Environment
- Fluent Operator version:
v2.8.0
- Container Runtime:
containerd://1.7.15-k3s1
- Operating system:
ID=wolfi
NAME="Wolfi"
PRETTY_NAME="Wolfi"
VERSION_ID="20230201"
HOME_URL="https://wolfi.dev"
- Kernel version:
Linux d186e0ca7a21 6.6.16-linuxkit #1 SMP Fri Feb 16 11:54:02 UTC 2024 aarch64 Linux
How did you install fluent operator?
Using helm:
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-operator --create-namespace -n fluent fluent/fluent-operator --set operator.container.repository=k3d-k3d.localhost:5005/fluent-operator --set operator.container.tag=v2.8.0
Additional context
Running with the kubesphere/fluent-operator:v2.8.0 image explicitly, or leaving the operator.container.repository value unset in the helm chart, results in no errors.
@jamonation have you tried the latest image, which is built on every PR?
@jamonation @benjaminhuo I'm also getting the same error. The workaround to move from
container:
  repository: "kubesphere/fluent-operator"
  tag: v2.1.0
  tag: v2.8.0
isn't working?
I'll try latest and report back @benjaminhuo, thanks for the help!
This is an issue for us too; we cannot seem to use 2.8 or 2.9.
I can't seem to use disableComponentControllers: "fluentd" and still make use of fluent-bit configurations. It results in:
2024-05-28T13:21:44Z ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start ......
Also related to:
https://github.com/fluent/fluent-operator/blob/v2.9.0/charts/fluent-operator/templates/fluent-operator-deployment.yaml#L101-L107
The logic has completely changed from v2.7.0:
https://github.com/fluent/fluent-operator/blob/v2.7.0/charts/fluent-operator/templates/fluent-operator-deployment.yaml#L102
2.9.0 -> the value is put in the same arg
2.7.0 -> the value is passed as a separate arg
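For anyone unsure what the difference means for the process, here is a small sketch of the two argument shapes. The flag name --example-flag and the helper show_args are made up for illustration; the real flag and templates are in the linked files, and this only mirrors the description above.

```shell
# Hypothetical helper: prints how many arguments it received, then each
# argument in brackets, so the two shapes can be compared side by side.
show_args() {
  printf '%d' "$#"
  for a in "$@"; do printf ' [%s]' "$a"; done
  printf '\n'
}

# Flag and value in the same arg (the 2.9.0 shape as described above).
show_args "--example-flag=fluentd"
# Flag and value as separate args (the 2.7.0 shape as described above).
show_args "--example-flag" "fluentd"
```

Whether the two shapes behave the same depends on how the binary parses its flags, which is worth checking if the value seems to be ignored.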