Add Option to Set Failed State From UI

Open ColeMurray opened this issue 1 year ago • 4 comments

Description

When a job exits ungracefully, it currently remains in a zombie running state. The UI currently has no way to manually mark such a run as failed, so a developer has to issue commands from the command line to update the state.

Reproduction / Example

As a user, I would like to click a "Mark Failed" button on a given flow run that issues a set_state command to the API, marking the flow run as failed.
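
For illustration, here is a rough sketch of the request such a button might issue. It only builds the HTTP request and does not send it. The base URL, the `flow_runs/{id}/set_state` path, and the payload shape are assumptions modeled on the Prefect 2 REST API and should be checked against the server version in use; `build_mark_failed_request` is a hypothetical helper name, not part of Prefect.

```python
import json
import urllib.request

# Assumption: a Prefect 2 API server is reachable at this base URL.
API_URL = "http://127.0.0.1:4200/api"


def build_mark_failed_request(
    flow_run_id: str,
    api_url: str = API_URL,
    message: str = "Manually marked as failed",
) -> urllib.request.Request:
    """Build (but do not send) a POST request proposing a Failed state.

    Hypothetical helper: the endpoint path and body layout are assumptions
    about the server's set_state route, not a confirmed contract.
    """
    payload = {
        # The proposed new state for the flow run.
        "state": {"type": "FAILED", "name": "Failed", "message": message},
        # Ask the server to accept the state even if its orchestration
        # rules would otherwise reject the transition.
        "force": True,
    }
    return urllib.request.Request(
        url=f"{api_url}/flow_runs/{flow_run_id}/set_state",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (sending is left commented out so nothing hits the network):
# req = build_mark_failed_request("<flow-run-id>")
# urllib.request.urlopen(req)
```

Note that this only records a new state on the server. It does not by itself stop a process that is still running.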

ColeMurray avatar Jul 19 '22 03:07 ColeMurray

Thanks for the request @ColeMurray - this is a request we've discussed internally and have been wary of, because we need to make clear the difference between marking a state and expecting changes in run behavior. If you have a minute to give more details of your use case, that would be helpful!

zhen0 avatar Jul 19 '22 20:07 zhen0

Another user comment/request from slack: https://prefect-community.slack.com/archives/C0192RWGJQH/p1659533174627099

Is there or will there be a way to control the states of the flows or tasks in the UI (pausing a flow mid-run or even killing a flow's run instance)? I believe that would be very helpful, as we could control existing flows instead of needing to delete flows and create new ones.

zhen0 avatar Aug 03 '22 17:08 zhen0

A user on the same slack thread requests mass flow cancellation: https://prefect-community.slack.com/archives/C0192RWGJQH/p1659945900763339?thread_ts=1659533174.627099&cid=C0192RWGJQH

I think it will be helpful if we can mass cancel flow runs in the UI. For example, if I have a deployment scheduled to run every 5 minutes, and after the first run I realize the flow will fail, I would want to cancel the rest of the runs.

space-age-pete avatar Aug 08 '22 15:08 space-age-pete

@space-age-pete note this gets at the nuance we are concerned about. We do not support cancellation of running flows at this time. If the flow has crashed and was not successfully marked as such, we can allow it to be marked as crashed, but if you do so while the flow is still running, it will not exit until it attempts to send a new state. Scheduled flow runs can still be cancelled.

@ColeMurray could you share an example of ungraceful exit that you're encountering? Undetected crashes are bugs. We've got some ideas for improving this in general, but I'm curious what cases are common for you.

zanieb avatar Aug 08 '22 15:08 zanieb

@madkinsz In my case, this was caused by docker shutting down my container due to resource overuse: https://github.com/PrefectHQ/prefect/issues/6024

A previous failure was discussed here: https://github.com/PrefectHQ/prefect/issues/5780, related to the interpreter crashing.

ColeMurray avatar Aug 12 '22 05:08 ColeMurray