sky cancel not reliable for program using multiprocessing and ray
We have a script that uses ray to schedule multiple workers on the VM, and each worker can run some multiprocessing ILP solver. When I sky cancel the job, it is unreliable to kill the jobs. I have to manually log into the cluster and kill the processes and sometimes, I have to ray stop and sky launch again.
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
This is still relevant and a related issue is #2340
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
This issue was closed because it has been stalled for 10 days with no activity.
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
This issue was closed because it has been stalled for 10 days with no activity.