rust-clippy icon indicating copy to clipboard operation
rust-clippy copied to clipboard

new lint: `zombie_processes`

Open y21 opened this issue 1 year ago • 15 comments

Closes #10754

Lint description should explain this PR, so I think there's nothing else to say here ^^

changelog: new lint: [zombie_processes]

y21 avatar Sep 10 '23 14:09 y21

r? @Centri3

(rustbot has picked a reviewer for you, use r? to override)

rustbot avatar Sep 10 '23 14:09 rustbot

:umbrella: The latest upstream changes (presumably #11413) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Sep 12 '23 16:09 bors

Out of curiosity, does aborting the program also finally release these "zombies" in any way? If so, I think we could check for an unconditional panic!, exit, abort, etc., after it.

I ask because I was creating a modloader for a game, it'd be loaded then reopen the program with a different entry point before aborting the current program (to run our own code after the loader finishes loading DLLs, or something. No idea why I didn't use function hooking, probably wanted to feel clever ^^). Not really the greatest example, but if the parent process closing allows these zombies to terminate properly, I think it makes sense to check for this.

Regardless though, this is a nice lint and I'll review it soon.

Centri3 avatar Sep 13 '23 23:09 Centri3

On waitpid's man page there's this note:

If a parent process terminates, then its "zombie" children (if any) are adopted by init(8), which automatically performs a wait to remove the zombies.

So yes, I guess that releases them :) I also did a quick experiment and I can see that this does indeed happen:

Experiment details

main.rs:

use std::process::Command;

fn main() {
  for _ in 0..5 {
    let _ = Command::new("echo").spawn().unwrap(); // <- no wait()
  }
  println!("Zombies spawned. Press enter to release them");
  std::io::stdin().read_line(&mut String::new()).unwrap();
}

This doesn't have an explicit exit() call, but this is all at the end of a main function and I think that'd have the same effect.

$ rustc main.rs
$ ./main
Zombies spawned. Press enter to release them

# at this point, run `ps aux` in another session and look for 5 `[echo] <defunct>` entries (zombies). 
#
# <enter>
# 
# now that the process has exited, run `ps aux` again and see that there are no <defunct> entries

(This seems kinda specific to linux, idk if this is needed at all on Windows...)

If so, I think we could check for an unconditional panic!, exit, abort, etc., after it.

By this I assume you mean that we shouldn't lint if there is one of these after spawning the process. I agree with exit and abort, but I'm not sure about panic!, since you can catch those and so it's not always guaranteed that the process terminates (which means that the zombie processes might remain after the panic)

y21 avatar Sep 14 '23 16:09 y21

TIL :)

Platform-specific lints are fine. For example, non_octal_unix_permissions is basically a nop on windows (or other platforms) since you can't actually access the method it checks. There's another similar to this in #11062 too. wait-ing is also usually what the programmer intended even if they're on Windows (I can't really think of many reasons you wouldn't)

Catching a panic is only possible on -Cpanic=unwind and should really only be reserved to FFI (or otherwise wherever unwinding is silly, like inside the panic handler). You use this to stop unwinding, to instead abort the program immediately. In the case somebody actually uses this as a try-catch, this is probably a reasonable FN (and we should probably have a lint for this as well)

We could probably extend this to anything that is unconditionally ! (which catch_unwind wouldn't be)

Centri3 avatar Sep 14 '23 16:09 Centri3

Yeah I agree catching panics really shouldnt be done for error handling à la try-catch, but panics are also per thread, so you can panic on one thread and the others (the rest of the program) will live on, and I would say a large number of rust programs involve threads in some way. Tokio also uses catch_unwind so that stuff like webservers don't go down when a panic occurs. So imo a panic! is quite often not something that can stop the process (and therefore not always something that removes zombies), but if you think we should still do that anyway, I'm also fine with that.

Another question re. the proposed behavior: would this look at just the next statement after the spawn or how far are we willing to lookahead for an exit? All the way to the end of the enclosing block? Would it consider exit() in an if too? I imagine this would become kinda complex if we want to look across multiple statements, since some statements can diverge or exit only in an "unlikely" case

Catching a panic is only possible on -Cpanic=unwind

Also to me this makes it sound like that's a rare thing to have but it's the default value ^^, so you have to manually set panic=abort to get it to not unwind

We could probably extend this to anything that is unconditionally ! (which catch_unwind wouldn't be)

Not sure on this, since expressions like return, continue, loop {} etc. are also ! but they don't necessarily stop the process (code could be getting into an intentional infinite loop that processes inputs in a loop forever or something like that)

y21 avatar Sep 14 '23 18:09 y21

Yeah I was wrong, always forget ! extends to anything that is never constructed ^^

Also forgot panic! is specific to a thread; I think that's best to be propagated to other threads but yeah, we should still lint it in case of panic!s because of that

Couple thoughts on the behavior:

  1. If it's the final call expression in the main function, it shouldn't lint since returning from main will (ignoring #[start]) effectively immediately exit the process.
  2. It shouldn't consider conditional exits, even something like if true
  3. Any call between the exit and the spawn should be pessimistically considered always diverging, i.e., it'll never return

Stuff like loop expressions should ofc be considered infinitely looping. I think this doesn't have to extend to for loops though, as you'd have to intentionally make an iterator that never advances for that to loop forever.

Centri3 avatar Sep 18 '23 13:09 Centri3

:umbrella: The latest upstream changes (presumably #11556) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Sep 26 '23 16:09 bors

Had to rebase because the PR was getting too far behind. The last commit should contain the changes for (most of) the review comments. Also had to split the test up into two files since some of them have suggestions now

y21 avatar Jan 30 '24 23:01 y21

:umbrella: The latest upstream changes (presumably #12306) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Feb 19 '24 09:02 bors

I'll put this on my todo list for the next week :D

@y21 would you mind rebasing on master once again?

r? @xFrednet

xFrednet avatar Mar 31 '24 11:03 xFrednet

:umbrella: The latest upstream changes (presumably #12107) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar May 12 '24 16:05 bors

Sorry for the wait, am a bit busy again, I'll try to get to the rest of this in 2-3 weeks

@rustbot author

y21 avatar May 24 '24 22:05 y21

Hey @y21 this is another careful ping. Could you look at the last comment? The changes should be simple IIRC

xFrednet avatar Jun 30 '24 20:06 xFrednet

After having thought about this more for a while, I'm not sure if some of the logic in here is really necessary/worth the complexity.

For example, checking if the .spawn() call is the last expression in main is a fair bit of code for what I think is fairly unlikely to occur in practice.

The logic implies that you .spawn() a process and don't do anything with it (if you do, then it won't be the last statement of the program), which on its own doesn't make a lot of sense, because (without waiting) the host process will likely exit before the child process gets to do anything.

It could work if the spawned process finishes quickly enough, but I'd even go as far as saying that it probably should warn on that, exactly because it isn't waiting on the subprocess and the current process might exit prematurely depending on how long it takes. (Then the problem doesn't really have to do with zombie processes anymore, but it would still encourage adding a .wait())

y21 avatar Jun 30 '24 21:06 y21

This sounds reasonable to me, feel free to remove that part. If this really comes up as a comment, we can always add it as a limitation to the lint description. :)

xFrednet avatar Jul 02 '24 17:07 xFrednet

:umbrella: The latest upstream changes (presumably #12873) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jul 04 '24 05:07 bors

@xFrednet The FCP thread hasn't gotten comments but 4 upvote reactions and it's been about a month -- do you think we're good? I guess it's also waiting on me to do a final rebase.

y21 avatar Aug 18 '24 18:08 y21

Ahh, I totally forgot about this Pr, yes we can count the FCP as accepted. You can Roses = me after a rebase :D

xFrednet avatar Aug 18 '24 21:08 xFrednet