kwatch
kwatch copied to clipboard
Ignore Failed Graceful Shutdown not working
Describe the bug Sounds the ignore failed graceful shutdown feature not working correctly since last versions. It was fine before before 0.9
Sounds some consequences of refactoring done in https://github.com/abahmed/kwatch/pull/280 in particular https://github.com/abahmed/kwatch/blob/main/filter/containerKillingFilter.go
To Reproduce Scale down some deployment of app unable to stop in the allowed grace period.
Expected behavior No alert if pods killed after grace period of a normal cluster behavior (scaling, rearrangement,..)
Actual behavior
- kubelet log :
I0621 09:54:32.178488 1975 kuberuntime_container.go:742] "Killing container with a grace period" pod="google-sync-app-master/hutch-65d56f4988-9vw5f" podUID=1caade92-2bb7-4542-a0e0-acdee0df6c47 containerName="hutch-container" containerID="containerd://4031df8df9362d4df69ace1af956eb7430fa0b4e819f4059c54c790e28a2bd61" gracePeriod=30
- kwatch log triggering notif :
{"level":"info", "msg":"sending event: {PodName:hutch-65d56f4988-9vw5f ContainerName:hutch-container Namespace:google-sync-app-master Reason:Error Events:[2024-06-21 09:54:32 +0000 UTC] Killing Stopping container hutch-container Logs:Docker Starting hutch in hutch-start.sh
ENVKEY_ENV: preprod
PING rabbitmq.rabbitmq.svc.cluster.local:15672
RabbitMQ is UP!
2024-06-19T08:30:08Z 19 INFO -- writing pid in /home/effilab/tmp/hutch.pid
2024-06-19T08:30:08Z 19 INFO -- hutch booted with pid 19
2024-06-19T08:30:08Z 19 INFO -- found rails project (.), booting app in preprod environment
Labels:map[app:google-sync-app-master pod-template-hash:65d56f4988 role:hutch]}"}
Version/Commit All fine before 0.9 Notification not triggered with the 0.9 but could be a consequence of others bug fixed in subsequent releases like 0.9 logs give these sorts of logs : {"level":"info","msg":"container only issue nginx tag-xy-6ff64687c7-zsmb4 tag-xy-6ff64687c7 Error 137","time":"2024-06-21T14:53:19Z"}