cdk-github-runners

EC2 instance has not been terminated

Open pharindoko opened this issue 1 year ago • 4 comments

Hey,

I have a rare case where an EC2 instance stays up for days instead of being terminated.

  • The system status check shows that the instance is no longer reachable.
  • The GitHub workflow succeeded.
  • The Step Functions workflow succeeded.

AWS console: [image] [image]

System log: [image]

Has anyone else had the same issue?

Feb 29 '24 14:02 pharindoko

Anything of interest in /var/log/cfn-cmd.log, /var/log/cfn-cmd-init.log and the output of dmesg?

Feb 29 '24 14:02 kichik

> Anything of interest in /var/log/cfn-cmd.log, /var/log/cfn-cmd-init.log and the output of dmesg?

Not really. Could I see some of that in the CloudWatch logs? I'm unable to connect to the instance via SSM.

Feb 29 '24 23:02 pharindoko

Those logs do not go in CloudWatch.

It's strange that the step function completed but the system log doesn't show the aws stepfunctions send-task-success call. Whenever I see something like this, I immediately assume OOM. That's why I asked for dmesg. But it's very strange for that to happen so fast that the log would show only one heartbeat (there should be one every minute) and nothing else. Even the runner log itself seems truncated. Is it just as terse in CloudWatch?
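
To make the OOM check concrete, here is a minimal sketch of scanning kernel log text for OOM-killer activity. It assumes the usual kernel wording ("invoked oom-killer", "Out of memory: Killed process ..."), which varies between kernel versions; the function names are made up for illustration and are not part of cdk-github-runners.

```python
import re

# Patterns the Linux kernel typically logs when the OOM killer runs.
# Assumption: the exact wording differs between kernel versions.
OOM_INVOKED = re.compile(r"invoked oom-killer")
OOM_KILLED = re.compile(
    r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_events(log_text):
    """Return the log lines that look like OOM-killer activity."""
    hits = []
    for line in log_text.splitlines():
        if OOM_INVOKED.search(line) or OOM_KILLED.search(line):
            hits.append(line.strip())
    return hits


def killed_processes(log_text):
    """Return (pid, name) pairs for processes the OOM killer reports killing."""
    killed = []
    for line in log_text.splitlines():
        match = OOM_KILLED.search(line)
        if match:
            killed.append((int(match.group("pid")), match.group("name")))
    return killed
```

The input can be any text that contains kernel messages, for example the output of dmesg captured with subprocess.run(["dmesg"], capture_output=True, text=True).stdout on a machine that is still reachable. An empty result does not rule out OOM if the log itself was cut short.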

Mar 01 '24 01:03 kichik

I'll check what I can find in CloudWatch. It's a super rare case, but it has now appeared twice in one week. Maybe I need an additional watchdog. This is why I'm asking if anyone else has had this problem.
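
A watchdog along these lines could be sketched roughly as follows. This is only an illustration under stated assumptions, not something the library provides: tag_key is a hypothetical parameter naming whatever tag identifies runner instances in the account, and the 6-hour limit is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone


def find_stale_instances(instances, max_age, now=None):
    """Return IDs of running instances launched more than max_age ago.

    `instances` is a list of dicts with "InstanceId", "LaunchTime"
    (timezone-aware datetime) and "State" ({"Name": ...}), the shape
    boto3's describe_instances returns for each instance.
    """
    now = now or datetime.now(timezone.utc)
    return [
        inst["InstanceId"]
        for inst in instances
        if inst["State"]["Name"] == "running"
        and now - inst["LaunchTime"] > max_age
    ]


def terminate_stale_runners(tag_key, max_age=timedelta(hours=6)):
    """Terminate running instances that carry tag_key and exceed max_age.

    Assumption: tag_key is hypothetical; use the tag that marks runner
    instances in your own account. The default max_age is an example.
    """
    import boto3  # imported lazily so the pure helper needs no AWS SDK

    ec2 = boto3.client("ec2")
    instances = []
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": "tag-key", "Values": [tag_key]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    for page in pages:
        for reservation in page["Reservations"]:
            instances.extend(reservation["Instances"])

    stale = find_stale_instances(instances, max_age)
    if stale:
        ec2.terminate_instances(InstanceIds=stale)
    return stale
```

Running terminate_stale_runners on a schedule would clean up leftovers like the one described here. The age limit has to be longer than the longest legitimate job, otherwise the watchdog would kill runners that are still working.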

Mar 01 '24 06:03 pharindoko

Related to #537

Apr 24 '24 15:04 pharindoko