heim icon indicating copy to clipboard operation
heim copied to clipboard

Why iterate process when calling heim_process::get

Open roblabla opened this issue 3 years ago • 2 comments

On windows, heim_process::get(pid) will iterate on all pids to check if the pid exists:

  • Get calls pid_exists: https://github.com/heim-rs/heim/blob/master/heim-process/src/sys/windows/process/mod.rs#L237).
  • pid_exists calls pids if the process is not STILL_ACTIVE: https://github.com/heim-rs/heim/blob/master/heim-process/src/sys/windows/pids.rs#L15

This ends up being a significant performance bottleneck in my application in certain scenarios - epsecially if I try to get information about processes that may already be dead.

To give a bit more info: I get notified when the process start, and try to get the CWD as fast as possible. This is inherently racy and sometimes the process will die before I get the chance to acquire it (it happens often for short-lived processes like ls in bash for windows). When this happens, Heim will sometimes end up enumerating the pids, which slows down my app and makes it even less likely I'll get notified in time of future processes.

In practice, I don't quite understand the logic behind this enumeration. In my mind, we can do this two ways:

  • Since we have an exit code, that means the process already died and checking the pids is just extra work that doesn't actually tell us much.
  • In addition, actually checking the exit code doesn't actually feel super useful: If we managed to acquire a ProcessHandle, then the process is still available for querying information. Maybe we shouldn't error out at all, even if the process is already dead, so long as we can acquire a ProcessHandle for it?

roblabla avatar May 27 '21 09:05 roblabla

Hey, @roblabla!

It was done to provide better guarantees for pid_exists function. Since current implementation is pretty much a copy of psutil (Python one) library, you can find more details on it here: https://github.com/giampaolo/psutil/pull/1094

svartalf avatar May 27 '21 14:05 svartalf

PSUtil doesn't call pid_exists when getting a process by pid though. E.G. in psutil, the following:

process = Process(512)
cwd = process.cwd()

Will never call pid_exists as far as I can tell. pid_exists is only called in the wait function, and in the send_signal function, but only for OpenBSD.

In heim, this equivalent code will call pid_exists (and thus may call pids):

let process = heim_process::get(512);
let cwd = process.cwd()

This feels unnecessary. It doesn't actually improve reliability since the process may die between the get and the cwd functions, and cwd (and all other functions on Process) already handle the case where the process dies on us by returning an error.

When the user tries to get a Process, we shouldn't check if the pid exists (beyond raising an error if getting the create_time fails) IMO.

roblabla avatar May 27 '21 16:05 roblabla