napalm-logs consumer gets stuck when "host" key is missing from msg_dict

Open moogzy opened this issue 5 months ago • 1 comments

Whilst integrating IOS-XR, our consumers got stuck and reported the following stack trace:

Process Process-26:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3/dist-packages/napalm_logs/server.py", line 304, in start
    host=msg_dict["host"],
         ~~~~~~~~^^^^^^^^
KeyError: 'host'

The logging hostnameprefix option was not configured on our XR node, so it fell back to the default mode of sending logs without the hostname included. This format is supported by the IOS-XR log parser as updated in https://github.com/napalm-automation/napalm-logs/pull/171.

Unfortunately, the server.py code accesses msg_dict["host"] directly in a few places. The "host" key is absent for those default-style IOS-XR messages, so the lookup raises KeyError and the consumer process fails and gets stuck at that point.
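To illustrate the failure mode, here is a minimal sketch that reproduces the same KeyError outside napalm-logs. The function name build_publish_kwargs and the fields in msg_without_host are hypothetical stand-ins, not the actual server.py code or the real msg_dict layout; only the direct msg_dict["host"] access is taken from the traceback.

```python
def build_publish_kwargs(msg_dict):
    """Hypothetical stand-in for the direct-access pattern in the traceback."""
    # Direct subscripting raises KeyError when the parser did not set "host".
    return {"host": msg_dict["host"]}


# Illustrative dict shape only (assumption): a parsed message with no hostname.
msg_without_host = {"tag": "SOME_TAG", "message": "example text"}

try:
    build_publish_kwargs(msg_without_host)
    failure = None
except KeyError as err:
    # Inside a multiprocessing worker, an uncaught KeyError like this one
    # terminates the process, which matches the "Process Process-26" traceback.
    failure = repr(err)

print(failure)
```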

The problematic lines in napalm-logs that I've found so far:

  • https://github.com/napalm-automation/napalm-logs/blob/develop/napalm_logs/server.py#L304 (this is the line we crashed on today)
  • https://github.com/napalm-automation/napalm-logs/blob/develop/napalm_logs/server.py#L330

Suggested fix:

  1. Add safe access for the value and return "Unknown" for the host value when necessary. That way we are aligning with how unknown hosts are handled in the napalm-logs code.

OR

  2. Explicitly encode hostname: unknown into the IOS-XR logs parser.

I prefer option 1 as it is vendor-agnostic and the safer approach for this code. Option 2 means every other vendor parser would need to encode unknown as needed, rather than server.py handling the missing key safely and preventing the consumer from getting stuck.
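Option 1 could be sketched roughly as follows. This is not a patch against server.py: the helper name get_host and the UNKNOWN_HOST placeholder value are assumptions for illustration, and the exact string napalm-logs uses for unknown hosts should be checked against the code base before adopting it.

```python
# Assumed placeholder; confirm the value napalm-logs already uses elsewhere.
UNKNOWN_HOST = "unknown"


def get_host(msg_dict, default=UNKNOWN_HOST):
    """Return the parsed hostname, or a placeholder when it is missing or empty.

    Hypothetical helper: callers would use get_host(msg_dict) wherever
    msg_dict["host"] is currently accessed directly.
    """
    host = msg_dict.get("host")
    return host if host else default


print(get_host({"host": "xr-node-1"}))
print(get_host({"message": "no hostname prefix configured"}))
```

Centralising the lookup in one helper keeps the fallback value consistent across every call site and leaves the vendor parsers untouched.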

moogzy, May 12 '25 12:05