Scheduler infinite loop after failed/canceled job

Open andygrove opened this issue 3 years ago • 4 comments

Describe the bug: See the PR description at https://github.com/apache/arrow-ballista/pull/340

To Reproduce: See the PR description at https://github.com/apache/arrow-ballista/pull/340

andygrove, Oct 11 '22 03:10

Ah, this is a known issue. I actually added the failed-job check in the pop_next_task() loop. Without that check, the scheduler loop would try to schedule the pending tasks from the failed job, which would be worse!

@yahoNanJing Please take a look and have a fix.

mingmwang, Oct 12 '22 16:10
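
For readers following along, here is a minimal sketch of the interaction described above. It is not Ballista's actual code: apart from the name pop_next_task, which comes from the comment, every type and function here (Job, JobStatus, schedule_all) is a hypothetical stand-in. It assumes the hang arises when the outer loop keeps running as long as any pending task exists, even though the tasks of a failed job are never handed out. The sketch ends the loop on "nothing schedulable" instead.

```rust
use std::collections::VecDeque;

// Hypothetical job states; the real scheduler has more.
#[derive(Clone, Copy, PartialEq)]
enum JobStatus {
    Running,
    Failed,
}

// Hypothetical job record: an id, a status, and a queue of pending task ids.
struct Job {
    id: u32,
    status: JobStatus,
    pending: VecDeque<u32>,
}

// Stand-in for pop_next_task(): returns the next (job id, task id) pair,
// skipping any job that has failed. Tasks of failed jobs are left untouched.
fn pop_next_task(jobs: &mut [Job]) -> Option<(u32, u32)> {
    for job in jobs.iter_mut() {
        if job.status == JobStatus::Failed {
            continue; // the failed-job check
        }
        if let Some(task) = job.pending.pop_front() {
            return Some((job.id, task));
        }
    }
    None
}

// Scheduling loop that terminates when no schedulable task remains.
// A loop conditioned on "any job still has pending tasks" would never exit
// here, because the failed job's tasks are skipped but never removed.
fn schedule_all(jobs: &mut [Job]) -> Vec<(u32, u32)> {
    let mut scheduled = Vec::new();
    while let Some(assignment) = pop_next_task(jobs) {
        scheduled.push(assignment);
    }
    scheduled
}
```

In this sketch the failed job's tasks stay in its queue. Whether they should be dropped when a job fails is a separate design choice that the thread does not settle.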

Thanks @andygrove and @tfeda for reporting this issue. I'll try to fix it.

yahoNanJing, Oct 13 '22 02:10

Any update? This directly causes the UI to become unavailable.

smallzhongfeng, Aug 31 '23 07:08