aws-node-termination-handler

panic: There was a problem checking for spot ITNs: Metadata request received http status code: 401

Open vikas-gautam opened this issue 3 years ago • 3 comments

Describe the bug

Getting this error from the node termination handler pod:

panic: There was a problem checking for spot ITNs: Metadata request received http status code: 401

Steps to reproduce

A step-by-step description on how to reproduce the problem.

Expected outcome

Expecting a successful response from the IMDS service so that the various events can be handled.

Application Logs

2022/08/08 11:42:55 INF Started watching for interruption events
2022/08/08 11:42:55 INF Kubernetes AWS Node Termination Handler has started successfully!
2022/08/08 11:42:55 INF Started watching for event cancellations
2022/08/08 11:42:55 INF Started monitoring for events event_type=SPOT_ITN
2022/08/08 11:42:55 INF Started monitoring for events event_type=SQS_TERMINATE
2022/08/08 11:43:01 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/08/08 11:43:05 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/08/08 11:43:09 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/08/08 11:43:13 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/08/08 11:43:13 WRN Stopping NTH - Duplicate Error Threshold hit.
panic: There was a problem checking for spot ITNs: Metadata request received http status code: 401

goroutine 53 [running]:
main.main.func3(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/node-termination-handler/cmd/node-termination-handler.go:210 +0x649
created by main.main
	/node-termination-handler/cmd/node-termination-handler.go:192 +0xd51

Environment: Non prod

  • NTH App Version: v1.16.5
  • NTH Mode (IMDS/Queue processor): Queue processor
  • OS/Arch: Amazon Linux 2
  • Kubernetes version: 1.21
  • Installation method: applied yaml using kubectl

vikas-gautam avatar Aug 08 '22 11:08 vikas-gautam

Hello, could you share the configs you used to install NTH?

I would not expect to see the log "There was a problem checking for spot ITNs" in Queue Processor mode, because it means NTH is polling IMDS for spot_itn events instead of receiving them through SQS.

brycahta avatar Aug 11 '22 20:08 brycahta

We had the same issue. I had configured NTH to use Queue Processor mode by setting enableSqsTerminationDraining = true, with zero spot instances.

To solve the issue, I had to explicitly set enableSpotInterruptionDraining = false.

It looks like, even if enableSqsTerminationDraining = true, the node termination handler still considers the value of enableSpotInterruptionDraining.
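
The behavior described in this comment is consistent with the two settings acting as independent switches rather than one overriding the other. The Go sketch below illustrates that reading. `config` and `enabledMonitors` are hypothetical names, and the default of `true` for spot interruption draining is an assumption inferred from this comment, not confirmed against NTH's source. The event type names come from the logs in the original report.

```go
package main

import "fmt"

// config mirrors the two settings named in this thread.
// Hypothetical struct for illustration; not NTH's actual configuration type.
type config struct {
	EnableSpotInterruptionDraining bool
	EnableSqsTerminationDraining   bool
}

// enabledMonitors returns the event types that would be monitored if each
// setting is evaluated on its own (the assumption this sketch illustrates).
func enabledMonitors(c config) []string {
	var monitors []string
	if c.EnableSpotInterruptionDraining {
		monitors = append(monitors, "SPOT_ITN") // would poll IMDS
	}
	if c.EnableSqsTerminationDraining {
		monitors = append(monitors, "SQS_TERMINATE") // would read from SQS
	}
	return monitors
}

func main() {
	// SQS draining turned on; spot draining left at its assumed default (true).
	implicit := config{EnableSpotInterruptionDraining: true, EnableSqsTerminationDraining: true}
	fmt.Println(enabledMonitors(implicit)) // [SPOT_ITN SQS_TERMINATE]

	// Spot draining explicitly disabled, as in the workaround above.
	explicit := config{EnableSpotInterruptionDraining: false, EnableSqsTerminationDraining: true}
	fmt.Println(enabledMonitors(explicit)) // [SQS_TERMINATE]
}
```

Under that assumption, the first configuration matches the original report's logs, where both SPOT_ITN and SQS_TERMINATE monitoring start, and the second matches the workaround.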

zchandrahasan avatar Aug 25 '22 16:08 zchandrahasan

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this issue to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.

github-actions[bot] avatar Sep 24 '22 17:09 github-actions[bot]

This issue was closed because it has become stale with no activity.

github-actions[bot] avatar Sep 29 '22 17:09 github-actions[bot]