luigi
KubernetesJobTask fails because waiting state reason is 'PodInitializing'
I've been trying to use the Kubernetes Job wrapper, but the task fails even though the Job itself runs fine once it has been spun up.
Runtime error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/luigi/worker.py", line 193, in run
    new_deps = self._run_get_new_deps()
  File "/usr/local/lib/python3.10/site-packages/luigi/worker.py", line 133, in _run_get_new_deps
    task_gen = self.task.run()
  File "/usr/local/lib/python3.10/site-packages/luigi/contrib/kubernetes.py", line 391, in run
    self.__track_job()
  File "/usr/local/lib/python3.10/site-packages/luigi/contrib/kubernetes.py", line 224, in __track_job
    while not self.__verify_job_has_started():
  File "/usr/local/lib/python3.10/site-packages/luigi/contrib/kubernetes.py", line 304, in __verify_job_has_started
    assert wr == 'ContainerCreating', "Pod %s %s. Logs: `kubectl logs pod/%s`" % (
AssertionError: Pod fetch-20220720152731-be35e0c148e747aa-wjprm PodInitializing. Logs: `kubectl logs pod/fetch-20220720152731-be35e0c148e747aa-wjprm`
It looks like PodInitializing is one of the reasons a pod's container can be in the waiting state, although I could not find any official documentation stating so (the closest I found is the kube-state-metrics pod metrics doc: https://github.com/kubernetes/kube-state-metrics/blob/4090e8b7aa39afcfe4d5e62d3f3c7262e09409b9/docs/pod-metrics.md). However, the only waiting reason the check treats as a non-failure is ContainerCreating:
https://github.com/spotify/luigi/blob/afa6ba30b1acd45eba2a273f20a0e81f6e8da48b/luigi/contrib/kubernetes.py#L304
Seems like an easy fix, I'd be happy to do it.
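Roughly what I have in mind, as a standalone sketch rather than the actual patch: accept PodInitializing alongside ContainerCreating as a "still starting" waiting reason. The names STARTUP_WAITING_REASONS and check_waiting_reason are hypothetical and do not exist in luigi; only the error message format is taken from the traceback above.

```python
# Waiting reasons that mean the pod is still starting up, not failing.
# 'ContainerCreating' is the only one accepted today; 'PodInitializing'
# is the proposed addition. (Hypothetical constant, not part of luigi.)
STARTUP_WAITING_REASONS = frozenset({"ContainerCreating", "PodInitializing"})


def check_waiting_reason(pod_name, reason):
    """Raise AssertionError unless the waiting reason is a normal startup state.

    Hypothetical helper for illustration; the real check lives inside
    __verify_job_has_started in luigi/contrib/kubernetes.py.
    """
    if reason not in STARTUP_WAITING_REASONS:
        raise AssertionError(
            "Pod %s %s. Logs: `kubectl logs pod/%s`" % (pod_name, reason, pod_name)
        )


if __name__ == "__main__":
    # Both startup reasons pass silently; any other reason raises.
    check_waiting_reason("fetch-example", "ContainerCreating")
    check_waiting_reason("fetch-example", "PodInitializing")
```

With this, a pod that is still running its init containers would just be polled again instead of failing the task, while other waiting reasons would still raise as before.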