
[backend] Frequent Metadata Writer Pod Restarts

Open JerT33 opened this issue 1 month ago • 1 comments

Environment

KFP version: v2.5+

Steps to reproduce

Run multiple replicas of the metadata writer. The replicas perform the same tasks, so they race to delete the same debug files, which raises unhandled errors (here).

These unhandled errors cause frequent pod restarts.

(This occurs for any error other than ReadTimeoutError.)

Expected result

The errors should be handled and the pod should not restart.
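
To illustrate what "handled" could mean here, below is a minimal sketch of a cleanup loop that tolerates files already removed by another replica. It is not the actual metadata writer code: `prune_debug_files` and its `keep` parameter are hypothetical names. The only thing assumed from the report is that debug paths are kept in a deque and deleted with `os.remove`.

```python
import collections
import os
import tempfile


def prune_debug_files(debug_paths, keep=0):
    """Remove the oldest debug files until at most `keep` paths remain.

    Hypothetical helper, not part of KFP. Returns the number of files
    that this call actually deleted.
    """
    removed = 0
    while len(debug_paths) > keep:
        path = debug_paths.popleft()
        try:
            os.remove(path)
            removed += 1
        except FileNotFoundError:
            # Another replica (or an earlier pass) already deleted the file.
            pass
        except OSError as e:
            # Any other filesystem error is reported but does not crash the loop.
            print(f"Could not remove debug file {path}: {e}")
    return removed
```

With something along these lines, a missing file is skipped rather than propagating an exception that terminates the process and restarts the pod.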


Impacted by this bug? Give it a 👍.

JerT33 avatar Nov 21 '25 17:11 JerT33

I can confirm the behaviour mentioned above.

Our metadata-writer instance keeps restarting during housekeeping:

'''
Traceback (most recent call last):
  File "/kfp/metadata_writer/metadata_writer.py", line 185, in <module>
    os.remove(debug_paths.popleft())
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pod_s7xxx'
'''

AleMScof avatar Dec 01 '25 15:12 AleMScof