chromedp
chromedp-runner temp directory is not always deleted
What versions are you running?
github.com/chromedp/chromedp v0.5.3
Chromium 83.0.4092.0
go version go1.14.1 darwin/amd64
$ go list -m github.com/chromedp/chromedp
$ google-chrome --version
$ go version
What did you do? Include clear steps.
- Passed a new context to chromedp and executed actions; relevant invocations:
func GetNumberOfPages(url string) (int, error) {
	ctx, _ := chromedp.NewContext(context.Background())
	defer chromedp.Cancel(ctx)
	var res string
	err := chromedp.Run(ctx,
		chromedp.Navigate(url),
		chromedp.InnerHTML(`:root`, &res, chromedp.NodeVisible),
	)
	if err != nil {
		panic(err)
	}
	re := regexp.MustCompile(`– (\d+) of (\d+) pages"\ssrc="`)
	number, err := strconv.Atoi(strings.Fields(strings.TrimSpace(re.FindString(res)))[3])
	return number, err
}
func GeneratePDF(url string, dest string, width float64, height float64) error {
	paper_width := (width / 96.0) + 2.0
	paper_height := (height / 96.0) + 2.0
	// create context
	ctx, _ := chromedp.NewContext(context.Background())
	defer chromedp.Cancel(ctx)
	var pdfReader io.Reader
	err := chromedp.Run(ctx, chromedp.Tasks{
		chromedp.Navigate(url),
		chromedp.WaitReady("svg"),
		chromedp.ActionFunc(func(ctx context.Context) error {
			buf, _, err := page.PrintToPDF().
				WithPaperWidth(paper_width).
				WithPaperHeight(paper_height).
				WithMarginTop(1.0).
				WithMarginBottom(1.0).
				WithMarginLeft(1.0).
				WithMarginRight(1.0).
				WithPageRanges("1").
				Do(ctx)
			if err != nil {
				return err
			}
			pdfReader = bytes.NewBuffer(buf)
			return nil
		}),
	})
	if err != nil {
		return err
	}
	destFile, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer destFile.Close()
	_, err = io.Copy(destFile, pdfReader)
	if err != nil {
		return err
	}
	return nil
}
Both of these functions are run in a single-thread context (no goroutines); not sure if that affects anything.
What did you expect to see?
The GeneratePDF function is run multiple times in a for loop, and when I watch my temp directory I see a chromedp-runner* temp directory being recreated many times. I would expect that at the end of execution no chromedp-runner* temp directories remain, because the entire cleanup should be handled by the defer chromedp.Cancel(ctx) call (I have tried both defer cancel() and defer chromedp.Cancel(ctx) with the same results).
What did you see instead?
One or more chromedp-runner* temp directories get left behind with the following structure:
❯ tree /private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner028516848
/private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner028516848
└── Default
    └── Cache
        └── index-dir
            └── the-real-index
❯ tree /private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner069103546
/private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner069103546
└── Default
    └── Session\ Storage
        └── CURRENT
❯ tree /private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner760816317
/private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner760816317
└── Default
    └── Session\ Storage
        └── 000003.log
❯ tree /private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner765244031
/private/var/folders/mc/b5bnhqzj05b67zv7chskx2zw0000gp/T/chromedp-runner765244031
└── Default
    └── Session\ Storage
        └── CURRENT
These don't take up a lot of space, but it would be nice if they would not get left behind.
I verified that there were no stray chrome / Chromium processes still running, so I don't believe these are leftovers from orphaned processes left running in the background.
I think I've resolved the issue in my local use-case by simply making all my chromedp invocations share/reuse a single context rather than creating new ones each time. This single context (and associated temp folder) seems to be getting cleaned up properly.
Can you provide a self-contained main.go file to reproduce this issue?
I'm no longer able to reproduce the issue, at least in the original way that I observed it back in March. The temp folders still accumulate while the program is running but all of the temp folders seem to get cleared away when the program exits. The temp folders are not cleared if the program is interrupted (e.g. via Ctrl + C), although I'm not sure if this is expected/intended behavior.
I believe this was indirectly fixed via 3976e2ae9cebe027f6c2113e446627861aa5acef but I'm not sure.
Test code:
package main

import (
	"context"

	"github.com/chromedp/chromedp"
)

func main() {
	for i := 1; i <= 10; i++ {
		ctx, _ := chromedp.NewContext(context.Background())
		defer chromedp.Cancel(ctx)
		chromedp.Run(ctx, chromedp.Tasks{
			chromedp.Navigate("https://google.com"),
			chromedp.WaitReady("html"),
		})
	}
}
Is there a way to disable these temp folders from being created at all? We run this library on tens of thousands of agents deployed around the world and cannot always guarantee a graceful shutdown.
I don't think Chrome can run without a user data directory. That said, you can specify the user data directory with chromedp.UserDataDir(); then chromedp won't create one (but Chrome will write to the specified directory, and chromedp won't touch it). I'm not sure whether this will make things better for you.
Since the issue is no longer reproducible, closing.
Please file a new issue with a self-contained reproducer if it happens again.
Hi @ZekeLu,
This is happening to me: when running constantly, it ends up filling more than 50 to 100 GB in the tmp directory, leaving my VPS without any memory left and breaking a lot of things. This doesn't happen on my other server running the same thing, and I'm not sure why.
If you google "chromedp-runner", there are other people complaining about the same thing.
@marcelo321 Thank you for the feedback! Have you tried 0.8.4? It ships the commit fb22a3c9e832e0d18aa3838298552563576a46c9, which will hopefully address the issue.
@ZekeLu I hate that I don't get notifications on replies like this :/
So I am importing it like this:
import (
	"bufio"
	"sync"
	"etc.."
	"github.com/chromedp/chromedp"
)
and I already ran the go get command to update it. Should I be set? I deleted the binary of my program and built it again, in addition to running go get url to update chromedp. Should I do anything else? Sorry for the newbie question. Is "github.com/chromedp/chromedp" imported OK?
Run go list -m github.com/chromedp/chromedp in the root of your module to get the version of chromedp. If you have run go get, then it should have been upgraded to the latest version.
If you can reproduce the issue, please file a new issue with concrete information on how to reproduce it.