fluent-operator
Helm Chart: uninstall not completing cleanly
Describe the bug
After installing the fluent-operator (1.0.0) via Helm and then uninstalling it, some resources aren't removed cleanly and become stuck in the terminating state. I can understand that CRDs aren't deleted by Helm (see #132), but I still have to mess with the other resources in order to get things fully removed. This shouldn't be necessary.
To Reproduce
helm upgrade --install fluent-operator --create-namespace -n logging https://github.com/fluent/fluent-operator/releases/latest/download/fluent-operator.tgz -f fluent-operator-values_sandbox.yaml
With fluent-operator-values_sandbox.yaml containing:
#Set this to containerd or crio if you want to collect CRI format logs
containerRuntime: docker
Kubernetes: true
operator:
  resources:
    limits:
      cpu: 100m
      memory: 50Mi
    requests:
      cpu: 10m
      memory: 20Mi
fluentbit:
  # fluentbit resources. If you do want to specify resources, adjust them as necessary
  #You can adjust it based on the log volume.
  resources:
    limits:
      cpu: 500m
      memory: 200Mi
    requests:
      cpu: 10m
      memory: 25Mi
  #Set a limit of memory that Tail plugin can use when appending data to the Engine.
  #If the limit is reach, it will be paused; when the data is flushed it resumes.
  #if the inbound traffic is less than 2.4Mbps, setting memBufLimit to 5MB is enough
  #if the inbound traffic is less than 4.0Mbps, setting memBufLimit to 10MB is enough
  #if the inbound traffic is less than 13.64Mbps, setting memBufLimit to 50MB is enough
  input:
    tail:
      memBufLimit: 5MB
fluentd:
  enable: true
  replicas: 1
  forward:
    host: "logging.svc"
  watchedNamespaces:
    - kube-system
  resources:
    limits:
      cpu: 1000m
      memory: 256Mi
    requests:
      cpu: 20m
      memory: 64Mi
  #Configure the output plugin parameter in Fluentd.
  #You can set enable to true to output logs to the specified location.
  output:
    es:
      enable: true
      host: logs1.local.loc
      port: 9200
      logstashPrefix: k8s_logs_sb
      buffer:
        enable: true
        type: file
        path: /buffers/es
(but it occurs even with much simpler config with only fluent-bit enabled - with es output)
The following uninstall (without adding any fluent-operator related resources of our own between the install and the uninstall) doesn't complete cleanly, so I have to run something like the following:
helm uninstall fluent-operator -n logging
kubectl patch -n logging FluentBit fluent-bit -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl delete crd $(kubectl get crd -o jsonpath='{.items[?(@.spec.group=="fluentd.fluent.io")].metadata.name}')
kubectl delete crd $(kubectl get crd -o jsonpath='{.items[?(@.spec.group=="fluentbit.fluent.io")].metadata.name}')
kubectl delete namespace logging
Ignoring CRDs (as mentioned above), I still have to clear finalizers of the FluentBit type resource fluent-bit, otherwise it would remain stuck in the terminating state.
Expected behavior
The FluentBit resource fluent-bit would be cleanly removed during Helm uninstall.
Your Environment
- Fluent Operator version: Helm Chart 1.0.0
- Container Runtime: docker, Kubernetes 1.20.15
How did you install fluent operator?
helm upgrade --install fluent-operator --create-namespace -n logging https://github.com/fluent/fluent-operator/releases/latest/download/fluent-operator.tgz -f fluent-operator-values_sandbox.yaml
What happened?
No response
Your Error Log
N/A
Additional context
No response
This is caused by Helm. Helm deletes the CRs in no particular order, which can leave a CR stuck. fluent-operator does the relevant processing when a fluentd or fluentbit resource is deleted; you can refer to this code:
https://github.com/fluent/fluent-operator/blob/master/controllers/fluent_controller_finalizer.go
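To check whether a finalizer is what keeps the resource in the terminating state, its metadata can be inspected. This is only a rough sketch: the namespace `logging` and the resource name `fluent-bit` are taken from the report above, while `NS`, `NAME`, `DRY_RUN`, `run` and `inspect_stuck` are hypothetical names made up for illustration. By default it only prints the command; set `DRY_RUN=0` to run it against a cluster.

```shell
#!/bin/sh
# Sketch: show the deletion timestamp and finalizers of a FluentBit resource.
# Assumptions: namespace "logging" and resource "fluent-bit" (from the report).
NS="${NS:-logging}"
NAME="${NAME:-fluent-bit}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

inspect_stuck() {
  # A set deletionTimestamp together with a non-empty finalizers list means
  # the API server is still waiting for a controller to clear the finalizers.
  run kubectl get fluentbit "$NAME" -n "$NS" \
    -o 'jsonpath={.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}

inspect_stuck
```

If the operator has already been removed by the time the resource is deleted, nothing is left to clear that list, which would match the stuck state described in the report.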
So the problem is in the CR uninstall order? You may try the workaround with List: https://github.com/helm/helm/issues/8439#issuecomment-1068979423 https://github.com/kubernetes/kubernetes/blob/master/hack/testdata/list.yaml However I'm not sure if uninstall would respect that (in reverse order), perhaps not.
Other than waiting for https://github.com/helm/helm/issues/8439 to be implemented, perhaps a pre-delete Helm chart hook may help to at least make sure everything gets deleted.
Yes, it might be possible to add a preprocessing mechanism to determine the order of uninstallation of helm, which might be a new feature.
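Until something like that exists, the ordering could be enforced by hand: delete the CRs while the operator is still running, and only then uninstall the release. This is a sketch, not a tested fix. The release name `fluent-operator` and namespace `logging` come from the report; that `fluentbit` and `fluentd` are the resource names to delete is an assumption, and `RELEASE`, `NS`, `DRY_RUN`, `run` and `ordered_uninstall` are hypothetical names. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: uninstall in an explicit order so the operator can still
# process the finalizers of its custom resources.
# Assumptions: release "fluent-operator", namespace "logging" (from the report);
# resource names "fluentbit" and "fluentd" are assumed to match the chart's CRDs.
RELEASE="${RELEASE:-fluent-operator}"
NS="${NS:-logging}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

ordered_uninstall() {
  # 1. Delete the CRs first; --wait blocks until each object is really gone,
  #    i.e. until its finalizers have been cleared (or the timeout expires).
  run kubectl delete fluentbit --all -n "$NS" --wait=true --timeout=120s
  run kubectl delete fluentd --all -n "$NS" --wait=true --timeout=120s
  # 2. Only then remove the release, including the operator itself.
  run helm uninstall "$RELEASE" -n "$NS"
}

ordered_uninstall
```

Helm may report the resources deleted in step 1 as not found during the uninstall. A pre-delete hook would presumably run the same first step from inside the cluster.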
I have the same question. When I run "helm install fluent-operator fluent-operator -f fluent-operator/values.yaml", the fluent-bit, fluentd and fluent-operator pods all end up running. But when I run "helm uninstall fluent-operator", only the fluent-operator pods are removed; the fluent-bit and fluentd pods keep running. When I then run "helm install fluent-operator fluent-operator -f fluent-operator/values.yaml" again, the fluent-bit and fluentd pods get deleted. I think that's weird: all the fluent pods should be removed on the first helm uninstall.
I have the same issue, and the cause is the fluent-bit resource of the fluentbit CRD. To force its deletion you need to remove its finalizers section:
$ kubectl edit fluentbit fluent-bit -n fluent