dask-kubernetes
Add status to `DaskJob` CRD
Would it be possible to track the status of a job in the top-level `DaskJob` CR? This would have the advantage of hiding the "implementation details" of the job from the user. That is, instead of having to know about the state of the job-runner pod that gets created as part of a `DaskJob`, one could look only at the `DaskJob` resource to get that information.
A particular use case here is the Flyte plugin I am currently working on. Most Flyte (backend) plugins work by creating a k8s resource, which they then continuously poll for updates. The plugin machinery does not allow a plugin to reach out to arbitrary k8s resources: only the initially created resource is updated and passed into the plugin from the outside.
The exact interface definition is here, where the `resource` argument in `Plugin.GetTaskPhase()` would correspond to the `DaskJob` CR created in `Plugin.BuildResource()`.
Similar job objects also set a `Status` field on their custom resources, e.g.:
Happy to contribute something in case we can align on a format of the status. My initial proposal would be:
type JobStatus string

const (
    JobStatusPending   JobStatus = "PENDING"
    JobStatusRunning   JobStatus = "RUNNING"
    JobStatusStopped   JobStatus = "STOPPED"
    JobStatusSucceeded JobStatus = "SUCCEEDED"
    JobStatusFailed    JobStatus = "FAILED"
)

type DaskJobStatus struct {
    DaskClusterName string       `json:"daskClusterName,omitempty"`
    JobStatus       JobStatus    `json:"jobStatus,omitempty"`
    StartTime       *metav1.Time `json:"startTime,omitempty"`
    EndTime         *metav1.Time `json:"endTime,omitempty"`
}
This comment has a more detailed explanation of why the current state is blocking the Flyte plugin development.
cc @hamersaw
Yeah, this would be great. We already do this for the `DaskCluster` resource, so adding it for the `DaskJob` makes sense too.
If you're interested in contributing it, that would be fantastic.