crawlee
Graceful abort of the runtime process (emulation of Apify platform feature)
Motivation
The Apify platform provides a nice feature to gracefully abort an actor run. Instead of killing the process right away, a user can choose to abort gracefully, which makes the Apify platform:
- Immediately emit the aborting (and persistState?) event
- Wait about 10 seconds
- Kill the process from the outside
Crawlee already supports handling these events, but unfortunately, outside of the Apify platform, there is no easy way to trigger the aborting event. Supporting this feature would complete the powerful "resumability" feature of actors/crawlers, which enables you to seamlessly exit and then resurrect a run at any point.
Describe the feature
To be honest, I don't have a good idea of how to implement this for Crawlee/CLI. One way I imagine this is that sending a SIGINT signal via CTRL + C (or by other means) would not kill the process immediately but would instead follow the same flow as the Apify platform (aborting event -> 10 seconds -> exit).
You can implement this behavior from inside the called process like this (with the added feature of exiting immediately on a second SIGINT). This behaves as expected on macOS, but I haven't tested it on Windows.
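    // Assumes Crawlee v3, where `log` and the global event manager come from the `crawlee` package.
    import { Configuration, log } from 'crawlee';

    const events = Configuration.getGlobalConfig().getEventManager();

    let terminationInitiated = false;
    process.on('SIGINT', () => {
        if (!terminationInitiated) {
            terminationInitiated = true;
            log.warning('Received process termination command. Will persist state and terminate in 10 seconds. Press CTRL + C for immediate exit.');
            // Force the exit once the grace period is over.
            setTimeout(() => process.exit(), 10_000);
            // Notify listeners so they can react as they would on the Apify platform.
            events.emit('aborting');
        } else {
            log.warning('Received second process termination command. Will terminate immediately.');
            process.exit(0);
        }
    });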
The node REPL process has a similar behavior. When you start it via node and then press CTRL + C, it doesn't exit but prints (To exit, press Ctrl+C again or Ctrl+D or type .exit).
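To make the two-stage flow easier to reason about and to test without sending real signals, the same logic could be wrapped in a small factory. This is only a sketch: createGracefulAbortHandler and all of its options (emitAborting, exit, schedule, warn, gracePeriodMs) are hypothetical names, not part of Crawlee or Node.js.

```javascript
// Hypothetical helper, not a Crawlee API: builds a signal handler that aborts
// gracefully on the first signal and exits immediately on the second one.
function createGracefulAbortHandler({
    emitAborting,                               // called once, e.g. () => events.emit('aborting')
    exit = (code) => process.exit(code),        // injectable so tests do not kill the process
    schedule = (fn, ms) => setTimeout(fn, ms),  // injectable timer
    warn = console.warn,
    gracePeriodMs = 10_000,
}) {
    let terminationInitiated = false;

    return function onSignal() {
        if (terminationInitiated) {
            warn('Received second termination signal. Terminating immediately.');
            exit(0);
            return;
        }
        terminationInitiated = true;
        warn(`Received termination signal. Exiting in ${gracePeriodMs} ms at the latest.`);
        // Arm the deadline before notifying listeners.
        const timer = schedule(() => exit(0), gracePeriodMs);
        // Let the process exit earlier if nothing else keeps the event loop alive.
        if (timer && typeof timer.unref === 'function') timer.unref();
        emitAborting();
    };
}
```

It would be registered with something like process.on('SIGINT', createGracefulAbortHandler({ emitAborting: () => events.emit('aborting') })). Unlike the snippet above, the timer is unref'd here, so a crawler that finishes persisting its state early does not have to wait out the full 10 seconds.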
Constraints
- The question is whether we want to make a breaking change and replace the current default SIGINT behavior, or implement this with a different command.
- It might be tricky to support all major OSs properly.
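As a rough illustration of the cross-platform concern, the handler could be attached to a different set of signals depending on the OS. The sketch below relies on the behavior described in the Node.js documentation for signal events: SIGINT from the terminal is supported on all platforms, SIGBREAK is delivered on Windows when CTRL + Break is pressed, and SIGTERM is not supported on Windows. The helper names abortSignalsFor and registerAbortHandler are hypothetical, and which signals should count as a graceful-abort request is an assumption, not a decision made in this issue.

```javascript
// Hypothetical helper: picks the signals to treat as a graceful-abort request.
function abortSignalsFor(platform = process.platform) {
    // CTRL + C in a terminal produces SIGINT on all platforms.
    const signals = ['SIGINT'];
    if (platform === 'win32') {
        // On Windows, CTRL + Break delivers SIGBREAK.
        signals.push('SIGBREAK');
    } else {
        // SIGTERM is not supported on Windows, so only add it elsewhere.
        signals.push('SIGTERM');
    }
    return signals;
}

// Attaches the same handler to every selected signal and returns the list used.
// `target` defaults to the real process but can be any object with an `on` method.
function registerAbortHandler(handler, target = process, platform = process.platform) {
    const signals = abortSignalsFor(platform);
    for (const signal of signals) target.on(signal, handler);
    return signals;
}
```

Note that installing a listener for SIGINT or SIGTERM removes Node.js' default behavior for that signal (the process no longer exits on its own), which is exactly the breaking change the first constraint is about.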