
TaskOOM event lost

Open lengrongfu opened this issue 2 years ago • 15 comments

Description

https://github.com/kubernetes/kubernetes/issues/119600

While troubleshooting why the termination Reason was not OOMKilled, I found that the container's TaskOOM event was lost.

Steps to reproduce the issue

  • e2e: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-containerd-node-e2e-1-7/1683184526936772608
  • logfile: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-containerd-node-e2e-1-7/1683184526936772608/artifacts/tmp-node-e2e-92a26bb7-cos-97-16919-353-1-system.log
  • container_id: 98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54
  • container_name: oomkill-multi-target-container

  1. The kernel triggers an OOM kill:
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=cri-containerd-98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54.scope,mems_allowed=0,oom_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod17efb6ec_09ab_4eec_8ad6_f953eb1a7466.slice,task_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod17efb6ec_09ab_4eec_8ad6_f953eb1a7466.slice/cri-containerd-98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54.scope,task=dd,pid=38287,uid=0
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 kernel: Memory cgroup out of memory: Killed process 38287 (dd) total-vm:21724kB, anon-rss:14744kB, file-rss:508kB, shmem-rss:0kB, UID:0 pgtables:68kB oom_score_adj:997
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 kernel: Tasks in /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod17efb6ec_09ab_4eec_8ad6_f953eb1a7466.slice/cri-containerd-98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54.scope are going to be killed due to memory.oom.group set
  2. Then the containerd shim exits:
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 containerd[910]: time="2023-07-23T18:56:46.329304474Z" level=debug msg="received exit event container_id:\"98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54\" id:\"98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54\" pid:38265 exit_status:137 exited_at:{seconds:1690138606 nanos:325950236}"
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 containerd[910]: time="2023-07-23T18:56:46.389869167Z" level=info msg="shim disconnected" id=98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54 namespace=k8s.io
Jul 23 18:56:46 tmp-node-e2e-92a26bb7-cos-97-16919-353-1 containerd[910]: time="2023-07-23T18:56:46.390142692Z" level=warning msg="cleaning up after shim disconnected" id=98bac6c4b73cfd4c647647aa80248208b67de059874d4ddfdd1aa43ea3967e54 namespace=k8s.io

  3. After that, no TaskOOM event is received before the container is deleted.

Describe the results you received and expected

Expected to receive a TaskOOM event; none was received.

What version of containerd are you using?

1.7

Any other relevant information

ubuntu

Show configuration if it is related to CRI plugin.

no

lengrongfu avatar Jul 30 '23 15:07 lengrongfu

@Iceber is this the issue you mentioned before? Let's say the init command is sh -c "oom-process". The "oom-process" child triggers the OOM kill, but the sh -c process doesn't. So containerd only records the sh -c exit code instead of the sub-process's.
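
A standalone way to see that exit-code behavior (a minimal sketch, not containerd code; the kill -9 here just stands in for the kernel OOM killer):

package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// The shell starts a child, the child is SIGKILLed (standing in for the
	// OOM killer), and the shell itself exits normally with status 137
	// (128 + SIGKILL) taken from the child's termination.
	cmd := exec.Command("sh", "-c", "sleep 30 & CHILD=$!; kill -9 $CHILD; wait $CHILD")
	err := cmd.Run()
	if exitErr, ok := err.(*exec.ExitError); ok {
		// Prints 137: the observer only sees the wrapper's exit code and
		// cannot tell which process actually died.
		fmt.Println("shell exit code:", exitErr.ExitCode())
	}
}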

fuweid avatar Jul 31 '23 16:07 fuweid

Discussed with @Iceber; we think the shim process exits too quickly, so the shim never receives the OOM event.

My idea: the OOM watch (s.ep.Add) is currently registered after the process is started (container.Start); could we move it before the start?

https://github.com/containerd/containerd/blob/40f26543bdc27cbe8b058ac082e91c5832bb1c41/runtime/v2/runc/task/service.go#L263-L273
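
A minimal sketch of that reordering (a hypothetical fragment of the linked Start handler, not an actual patch; s.ep, container.Cgroup, and container.Start are the names used there, the rest is illustrative and assumes the cgroup v2 epoller):

	// Sketch: register the container's cgroup with the shim's OOM epoller
	// before the workload is started, so an OOM that fires immediately after
	// start cannot race the watcher registration.
	if cg, ok := container.Cgroup().(*cgroupsv2.Manager); ok {
		if err := s.ep.Add(container.ID, cg); err != nil {
			log.G(ctx).WithError(err).Error("add cgroup to OOM monitor")
		}
	}

	// Start the workload only once the OOM watcher is already in place.
	p, err := container.Start(ctx, r)
	if err != nil {
		return nil, errdefs.ToGRPCf(err, "failed to start")
	}

This assumes the container's cgroup already exists before container.Start (for runc it is created during the create phase); cgroup v1 would need the corresponding cgroup.Cgroup case as in the existing code.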

lengrongfu avatar Aug 01 '23 07:08 lengrongfu

Discussed with @Iceber; we think the shim process exits too quickly, so the shim never receives the OOM event.

It is also possible that the cgroup was added to the OOM watcher only after the container was started; by then the process had already been OOM-killed, so the cgroup watcher lost the OOM event.

Iceber avatar Aug 01 '23 07:08 Iceber

/assign

lengrongfu avatar Aug 03 '23 05:08 lengrongfu

Discussed with @Iceber; we think the shim process exits too quickly, so the shim never receives the OOM event.

It is also possible that the cgroup was added to the OOM watcher only after the container was started; by then the process had already been OOM-killed, so the cgroup watcher lost the OOM event.

Could this be the same as https://github.com/containerd/containerd/issues/8180?

SergeyKanzhelev avatar Aug 22 '23 20:08 SergeyKanzhelev

Discussed with @Iceber; we think the shim process exits too quickly, so the shim never receives the OOM event.

It is also possible that the cgroup was added to the OOM watcher only after the container was started; by then the process had already been OOM-killed, so the cgroup watcher lost the OOM event.

Could this be the same as #8180?

It looks like the same problem.

lengrongfu avatar Aug 26 '23 15:08 lengrongfu

type TaskOOM struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	ContainerID string `protobuf:"bytes,1,opt,name=container_id,json=containerId,proto3" json:"container_id,omitempty"`
}

I found that the ContainerID value in the event is not the container ID; it is the SandboxID.

  1. Failed test log: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-containerd-node-e2e-1-7/1697683704735862784/artifacts/tmp-node-e2e-b9a730eb-ubuntu-gke-2204-1-24-v20220623-system.log
  • A TaskOOM event was received, but the container_id value 4bcba7fc4369fee4d7d2e6263b58fce52cfc928c6a31cfac4d43c838708991ce is wrong: it is the SandboxID.
$ Sep 01 19:12:33 tmp-node-e2e-b9a730eb-ubuntu-gke-2204-1-24-v20220623 containerd[1472]: time="2023-09-01T19:12:33.012922721Z" level=info msg="TaskOOM event container_id:\"4bcba7fc4369fee4d7d2e6263b58fce52cfc928c6a31cfac4d43c838708991ce\""

$ Sep 01 19:12:34 tmp-node-e2e-b9a730eb-ubuntu-gke-2204-1-24-v20220623 containerd[1472]: time="2023-09-01T19:12:34.733787645Z" level=info msg="StopPodSandbox for \"4bcba7fc4369fee4d7d2e6263b58fce52cfc928c6a31cfac4d43c838708991ce\""
  2. Successful test log: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-containerd-node-e2e-1-7/1696958640436547584/artifacts/tmp-node-e2e-d1698486-cos-97-16919-353-23-system.log
  • A TaskOOM event was received with the correct container_id.
$ Aug 30 19:08:12 tmp-node-e2e-d1698486-cos-97-16919-353-23 containerd[922]: time="2023-08-30T19:08:12.422392986Z" level=info msg="CreateContainer within sandbox \"5df73d21cde36f876f4692a23354ec6e8858851f8fcd2ad20844e7efb7d4d98c\" for &ContainerMetadata{Name:oomkill-single-target-container,Attempt:0,} returns container id \"7c19542bf60d2091b242cc5997afc8224601e0f3f4758385c3e257b27c214ac3\""

$ Aug 30 19:08:18 tmp-node-e2e-d1698486-cos-97-16919-353-23 containerd[922]: time="2023-08-30T19:08:18.386243530Z" level=info msg="TaskOOM event container_id:\"7c19542bf60d2091b242cc5997afc8224601e0f3f4758385c3e257b27c214ac3\""
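
For anyone who wants to check which ID the event carries on a live node, here is a small watcher using the containerd Go client (a diagnostic sketch; import paths follow the 1.7 client, and the socket path and k8s.io namespace are assumptions for a kubelet node):

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/containerd/containerd"
	eventstypes "github.com/containerd/containerd/api/events"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/typeurl/v2"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	ctx := namespaces.WithNamespace(context.Background(), "k8s.io")
	// Subscribe to OOM events only.
	ch, errs := client.Subscribe(ctx, `topic=="/tasks/oom"`)
	for {
		select {
		case env := <-ch:
			ev, err := typeurl.UnmarshalAny(env.Event)
			if err != nil {
				log.Println("unmarshal event:", err)
				continue
			}
			if oom, ok := ev.(*eventstypes.TaskOOM); ok {
				// Compare this ID against crictl ps / crictl pods output to
				// see whether it is a container ID or a sandbox ID.
				fmt.Println("TaskOOM container_id:", oom.ContainerID)
			}
		case err := <-errs:
			log.Fatal(err)
		}
	}
}

The same events can also be watched directly on the node with ctr --namespace k8s.io events.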

lengrongfu avatar Sep 05 '23 06:09 lengrongfu

Cross-linking a CRI-O workaround for this: https://github.com/kubernetes/kubernetes/pull/120460#issuecomment-1708794531

SergeyKanzhelev avatar Sep 06 '23 17:09 SergeyKanzhelev

We have also hit this; I can reproduce it relatively easily via:

ctr run --memory-limit $((500 * 1024 * 1024)) --mount type=tmpfs,options=size=1G,dst=/tester docker.io/library/ubuntu:latest foo /bin/sh -c 'cat /dev/urandom > /tester/foo'

tych0 avatar Nov 21 '23 22:11 tych0

Hi, @lengrongfu. I'm Dosu, and I'm helping the containerd team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The issue involves a missing TaskOOM event in a Kubernetes container, where the termination reason was not marked as OOMKilled.
  • Discussions suggest the shim process may exit too quickly, or the cgroup was added to the OOM watcher too late.
  • SergeyKanzhelev linked a similar issue (#8180) and mentioned a workaround in CRI-O.
  • Tych0 provided a method to reproduce the issue.
  • You identified a discrepancy where the TaskOOM event's ContainerID was incorrectly set to SandboxID.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the containerd repository by commenting on this issue.
  • If there is no further activity, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Jun 11 '25 16:06 dosubot[bot]

This is still happening on containerd 2.1.4, and based on the 2.1.5 release notes it doesn't seem to have been fixed.

patrickshan avatar Dec 03 '25 02:12 patrickshan

Can you reproduce it, @patrickshan?

ningmingxiao avatar Dec 03 '25 03:12 ningmingxiao

This PR may reduce the probability of it happening: https://github.com/containerd/containerd/pull/12614 @patrickshan

ningmingxiao avatar Dec 03 '25 04:12 ningmingxiao

We’ve observed this behavior occasionally in our cluster. After further investigation, we found there was an 8-minute delay between when the TaskOOM event occurred and when containerd actually received it, due to the node being under some load at the time. This appears to be a different issue from the one your PR addresses.

patrickshan avatar Dec 03 '25 05:12 patrickshan

func (s *service) oomEvent(id string) {
	send := make(chan struct{})
	go func() {
		s.shutdown.RegisterCallback(func(context.Context) error {
			// Wait oomEvent publish done.
			<-send
			return nil
		})
	}()
	// Suggested instrumentation: log before and after Publish to measure how
	// long it blocks (needs the time package imported).
	start := time.Now()
	log.G(s.context).WithField("id", id).Info("publishing TaskOOM event")
	err := s.publisher.Publish(s.context, runtime.TaskOOMEventTopic, &eventstypes.TaskOOM{
		ContainerID: id,
	})
	log.G(s.context).WithField("id", id).WithField("duration", time.Since(start)).Info("published TaskOOM event")
	if err != nil {
		log.G(s.context).WithError(err).Error("post event")
	}
	close(send)
}

You can add some logging before and after s.publisher.Publish (as sketched above) to see where the time is being spent, @patrickshan.

ningmingxiao avatar Dec 03 '25 06:12 ningmingxiao