
Bubble up more information about Pod status in Model/Notebook/Server APIs

Open · nstogner opened this issue 2 years ago · 1 comment

A common case is a Pod that requires a GPU, but the scheduler/autoscaler is unable to place it on a Node that has one. We shouldn't require users to go digging through Pod statuses and events to find this information. I think it should be shown by kubectl get models (and get notebooks, get modelservers) under the CONDITION column. For reference:

k get models
NAME                READY   CONDITION
facebook-opt-125m   True    BuiltAndPushed
my-model            True    BuiltAndPushed
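
The idea of surfacing a scheduling failure in the CONDITION column could be sketched roughly as follows. This is only an illustration, not the project's controller code: it reads a Pod object as a plain dict shaped like the Kubernetes Pod JSON and looks at the standard `PodScheduled` condition. The function names, the `Pod` prefix on the resulting condition string, and the example message text are assumptions made for this sketch.

```python
from typing import Optional, Tuple


def summarize_pod_scheduling(pod: dict) -> Optional[Tuple[str, str]]:
    """Return (reason, message) if the Pod reports PodScheduled=False, else None."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            return cond.get("reason", "Unknown"), cond.get("message", "")
    return None


def condition_column(pod: dict, default: str) -> str:
    """Pick the value to show under CONDITION (hypothetical naming scheme)."""
    result = summarize_pod_scheduling(pod)
    if result is None:
        return default
    reason, _message = result
    # Assumption: prefix the Pod-level reason, e.g. "PodUnschedulable".
    return f"Pod{reason}"


# Illustrative Pod status; the message text is made up for this example.
pending_pod = {
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/3 nodes are available: 3 Insufficient nvidia.com/gpu.",
            }
        ],
    }
}

print(condition_column(pending_pod, default="BuiltAndPushed"))
```

With something along these lines, a Model whose Pod cannot be placed on a GPU Node might show a scheduling-related value under CONDITION instead of requiring the user to inspect the Pod directly.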

nstogner · Jun 27 '23 19:06

Information about whether something is pending could probably come from the Job API. Perhaps a kubectl plugin would be better suited for further drill-down (i.e. events and Pod statuses), in order to avoid overactive reconcile loops and API updates.
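
As a rough sketch of what reading "pending" from the Job API might look like, the function below compares the `active` and `ready` counters in a Job's status, again using a plain dict shaped like the Kubernetes Job JSON. It is an assumption-laden illustration: the function name is hypothetical, `status.ready` may be absent on older clusters, and the comparison cannot tell an unschedulable Pod apart from one that is still starting.

```python
from typing import Optional


def job_has_unready_pods(job: dict) -> Optional[bool]:
    """Report whether a Job has active Pods that are not yet Ready.

    Returns None when the status does not carry enough information.
    """
    status = job.get("status", {})
    active = status.get("active", 0)
    if active == 0:
        # No non-terminal Pods, so nothing can be pending.
        return False
    ready = status.get("ready")
    if ready is None:
        # Assumption: without the ready counter we cannot decide.
        return None
    return ready < active


print(job_has_unready_pods({"status": {"active": 1, "ready": 0}}))
```

Because this only reads counters already present on the Job, the controller would not need to watch individual Pods or events, which fits the concern about overactive reconcile loops.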

nstogner · Jun 30 '23 12:06