[backend] Frequent Metadata Writer Pod Restarts
Environment
KFP version: v2.5+
Steps to reproduce
Run multiple replica sets of the metadata writer. The replicas perform the same tasks, which causes unhandled errors when they remove debug files (here).
These unhandled errors cause frequent pod restarts.
(This occurs for any error other than ReadTimeoutError.)
Expected result
The errors should be handled and the pod should not restart.
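A minimal sketch of the kind of handling meant here, assuming the restarts come from one replica trying to delete a debug file that another replica has already removed. The helper name `remove_debug_file` and the logger name are hypothetical and are not taken from `metadata_writer.py`:

```python
import logging
import os

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("metadata_writer_sketch")


def remove_debug_file(path: str) -> bool:
    """Remove a debug file without letting a failure crash the process.

    Returns True if this call removed the file, False otherwise.
    """
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Assumed scenario: another replica already removed the file.
        logger.info("Debug file %s was already removed", path)
        return False
    except OSError as exc:
        # Any other filesystem error is logged instead of propagating.
        logger.warning("Could not remove debug file %s: %s", path, exc)
        return False
```

With a helper like this, a second replica that loses the race gets `False` back and continues, rather than raising and taking the pod down.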
Impacted by this bug? Give it a 👍.
I can confirm the behaviour mentioned above.
Our metadata-writer instance keeps restarting when doing housekeeping
'''Traceback (most recent call last):
File "/kfp/metadata_writer/metadata_writer.py", line 185, in