data-infra
Bug: v3 archiver ticker fails to cleanly handle invalid download configs
Describe the bug
Currently the v3 archiver's ticker loads download configs on start and then every 5 minutes (code). If the ticker encounters an error while reading in download configs, an exception is raised and the pod crashes, which can lead to downtime (e.g. if the new download config is simply unparseable).
To Reproduce
Upload a JSON artifact that does not conform to the download config pydantic model to the test bucket. The ticker will fail on its next load and a Grafana alert will fire.
Expected behavior
Instead of crashing, the ticker should continue to use the existing download configs in memory. Then, Grafana can alert us when the config age is unacceptably high (this metric/alert already exists).
Additional context
n/a
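A rough sketch of the expected behavior, not the archiver's actual code: the names `ConfigCache`, `parse_configs`, and `fetch`, and the `url` field check, are all hypothetical, and the simple validation here stands in for the real pydantic model. The idea is that a failed refresh is logged and the previous configs and load timestamp are left untouched, so the existing config-age metric keeps growing until a valid config arrives.

```python
import json
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional

logger = logging.getLogger("ticker")


def parse_configs(raw: str) -> List[dict]:
    """Hypothetical stand-in for validating against the pydantic model."""
    data = json.loads(raw)  # raises json.JSONDecodeError on unparseable input
    if not isinstance(data, list):
        raise ValueError("expected a list of download configs")
    for item in data:
        # Assumed required field, for illustration only.
        if not isinstance(item, dict) or "url" not in item:
            raise ValueError(f"invalid download config: {item!r}")
    return data


@dataclass
class ConfigCache:
    """Holds the last successfully loaded configs and when they were loaded."""

    configs: List[dict] = field(default_factory=list)
    loaded_at: Optional[float] = None

    def refresh(self, fetch: Callable[[], str]) -> bool:
        """Try to load new configs; on any failure keep the previous ones."""
        try:
            new_configs = parse_configs(fetch())
        except Exception:
            # Log with traceback instead of crashing the pod.
            logger.exception("failed to load download configs; keeping previous set")
            return False
        # Only swap state after the new configs have fully validated.
        self.configs = new_configs
        self.loaded_at = time.time()
        return True

    def age_seconds(self) -> Optional[float]:
        """Age of the in-memory configs, suitable for exporting as a metric."""
        if self.loaded_at is None:
            return None
        return time.time() - self.loaded_at
```

The ticker would call `refresh` on start and on each 5-minute tick. One open design question is the very first load: with nothing in memory to fall back on, failing fast at startup may still be the better choice.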