Hanging Chromium processes

Open aymericbeaumet opened this issue 4 years ago • 16 comments

What versions are you running?

$ go list -m github.com/chromedp/chromedp
github.com/chromedp/chromedp v0.6.5
$ chromium --version
Chromium 89.0.4388.0
$ go version
go version go1.15.7 darwin/amd64

What did you do? Include clear steps.

I'm running this simple program:

package main

import (
	"context"

	"github.com/chromedp/chromedp"
)

func main() {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	if err := chromedp.Run(
		ctx,
		chromedp.Navigate("https://github.com"),
	); err != nil {
		panic(err)
	}
}

What did you expect to see?

The Chromium processes should be killed when the Go process stops.

What did you see instead?

The number of Chromium processes grows each time I run the Go program.

aymericbeaumet avatar Feb 12 '21 23:02 aymericbeaumet

I can second this behavior. I have written a screenshot service based on chromedp (https://github.com/mkalus/goggler) and experience the same problem when running the Docker image.

Try the following:

docker run -d --rm -p8080:8080 --name goggler ronix/goggler
# before:
docker exec goggler ps -Af
# get image
curl -o /dev/null 'http://localhost:8080/?url=https://www.google.com/'
# after
docker exec goggler ps -Af

The output of the last command will look something like this:

UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 13:58 ?        00:00:00 /opt/google/chrome/goggler
root          28       1  0 13:59 ?        00:00:00 [cat] <defunct>
root          29       1  0 13:59 ?        00:00:00 [cat] <defunct>
root          31       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          32       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          44       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          50       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          72       0  0 14:01 ?        00:00:00 ps -Af

I cannot get rid of the zombies without killing the parent process, which of course shuts down the container.

mkalus avatar Feb 16 '21 14:02 mkalus

@mkalus I've written a blog post explaining how to mitigate the issue: https://aymericbeaumet.com/prevent-chromedp-chromium-zombie-processes-from-stacking.

TLDR

func main() {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer func() {
		cancel()
		// Prevent Chromium processes from hanging
		if _, err := exec.Command("pkill", "-g", "0", "Chromium").Output(); err != nil {
			log.Println("[warn] Failed to kill Chromium processes", err)
		}
	}()

	// ...
}

aymericbeaumet avatar Feb 17 '21 20:02 aymericbeaumet

I have read your blog post and tried your code, but in my case (within the Docker container), the zombie processes cannot be killed without killing the main process (PID 1). Moreover, killing every Chromium process would do harm: multiple goroutines might each have spawned their own Chromium processes, and killing them all would cause errors in the goroutines that are still working.

As a consequence, I am looking for another solution, hopefully one which can be done in the code.

mkalus avatar Feb 17 '21 21:02 mkalus

Thanks to you @aymericbeaumet, I have had another look and found a solution described in https://github.com/chromedp/docker-headless-shell

I need to initialize my container using dumb-init or tini to get rid of zombie processes. Thanks for pushing me to think again ;-)

mkalus avatar Feb 17 '21 22:02 mkalus

Issue #774 should've been given as a comment here.

ghost avatar Mar 21 '21 06:03 ghost

So I ran into the same issue today. I didn't really like the pkill approach for my own use case, so I started to look into the code, specifically at how chromedp discovers the browser executable path.

While looking at this, I realized that on macOS, when you install Chromium through brew, the Chromium cask also deploys a small wrapper at /usr/local/bin/chromium. It is this wrapper that chromedp's findExecPath function discovers. The way the wrapper is written, it starts a shell with Chromium as a child process. When the context is done, the shell is terminated, leaving an orphaned Chromium process on the system.

I just submitted this PR to the brew project, which fixes this behaviour. I hope it will be accepted, as it solves the issue at its source.

If you are running into this issue with the same setup, this is likely the root cause and you don't have to use pkill; the problem lies neither in chromedp nor in Chromium.

Until then, I ended up implementing my own findExecPath and feeding the browser path in at context creation. I'm wondering what the project thinks about this, and whether something like it should be implemented in the main library:

import (
	"os"
	"os/exec"
	"path/filepath"
	"runtime"
)

func findExecPath() string {
	var p []string
	switch runtime.GOOS {
	case "darwin":
		p = []string{
			// Mac
			"/Applications/Chromium.app/Contents/MacOS/Chromium",
			"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
		}
	case "windows":
		p = []string{
			// Windows
			"chrome",
			"chrome.exe", // in case PATHEXT is misconfigured
			`C:\Program Files (x86)\Google\Chrome\Application\chrome.exe`,
			`C:\Program Files\Google\Chrome\Application\chrome.exe`,
			filepath.Join(os.Getenv("USERPROFILE"), `AppData\Local\Google\Chrome\Application\chrome.exe`),
		}
	default:
		p = []string{
			// Unix-like
			"headless_shell",
			"headless-shell",
			"chromium",
			"chromium-browser",
			"google-chrome",
			"google-chrome-stable",
			"google-chrome-beta",
			"google-chrome-unstable",
			"/usr/bin/google-chrome",
		}
	}

	for _, path := range p {
		found, err := exec.LookPath(path)
		if err == nil {
			return found
		}
	}
	// Fall back to something simple and sensible, to give a useful error
	// message.
	return "google-chrome"
}

I can provide a PR if that's something you would be interested in. Thanks for this fantastic library!
Cheers

fabio42 avatar May 01 '21 00:05 fabio42

@fabio42 Good job! A PR is always welcome! Even if homebrew fixes its issue, this change will reduce the number of calls to exec.LookPath. The concern is that it may break some use cases (for example, a Windows user who only has Chromium installed). But I think that's okay, since a user can always specify the browser path with chromedp.ExecPath.

And please note that this is just one of the reasons why browser processes do not terminate. The root cause is different from that of the zombies in a container.

Thank you!

ZekeLu avatar May 01 '21 02:05 ZekeLu

Thank you for your feedback @ZekeLu. I just opened PR #811 for this change.

I agree that the container issue does look different. My understanding is that this behavior is expected: the container seems to treat ENTRYPOINT/CMD as the init process by default (https://docs.docker.com/config/containers/multi-service_container/). As @mkalus mentioned, the simplest way to address it is to use the --init option provided by Docker.

fabio42 avatar May 03 '21 19:05 fabio42

Hello,

Some users have reported memory leaks due to accumulating Chrome processes.

While I'm not entirely certain about the root cause, I'm hoping someone can shed some light on this.

Context: Linux container, Chrome v115.x, chromedp v0.9.1, amd64 platform.

  1. Zombies:

    We use tini as PID 1 to reap zombie processes. Furthermore, chromedp waits for the command to complete, so having zombie processes is strange.

    However, and please correct me if I'm mistaken, I believe there might be an exception when the context concludes. In this case, we might not wait for the command to finish (be killed), leading to indefinitely hanging processes.

    For instance, adding cmd.Wait here could do the trick:

    select {
    case <-ctx.Done():
    	// TODO: do we care about this error in any scenario? if the
    	// user cancelled the context and killed chrome, this will most
    	// likely just be "signal: killed", which isn't interesting.
    	go cmd.Wait()

    	return nil, ctx.Err()
    case <-c.allocated: // for this browser's root context
    }
    
  2. Hanging Processes:

    This might correlate with the previous point. Some users have observed hanging processes that aren't zombies.

On a side note, I'm questioning whether cmd.SysProcAttr.Pdeathsig = syscall.SIGKILL is sufficient in a Linux setting. For context, an (older) blog post: https://medium.com/@felixge/killing-a-child-process-and-all-of-its-children-in-go-54079af94773 suggests an alternative approach:

 cmd := exec.Command("/bin/sh", "-c", "watch date > date.txt")
 cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
 // ...
 syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)

In both scenarios, chrome_crashpad and cat processes are the culprits.

gulien avatar Aug 05 '23 08:08 gulien

Hi @gulien, as of now, starting and closing browser instances frequently has some known issues:

  • zombie processes left in the system (this issue)
  • leaked files. See:
    • #1105
    • #1332

It also consumes more resources compared to opening and closing a tab in an existing browser instance.

So, for now, I would recommend using a single browser instance. chromedp.NewContext shows how to use a single browser instance for multiple tasks.

ZekeLu avatar Aug 08 '23 03:08 ZekeLu

Thanks @ZekeLu! The recommended way used to be the other way around lol

gulien avatar Aug 10 '23 10:08 gulien