gowebdav icon indicating copy to clipboard operation
gowebdav copied to clipboard

memleak during multiple file upload when authentication is required

Open ilyalavrinov opened this issue 6 years ago • 8 comments

Hello Collaborators,

Describe the bug A short description of what you think the bug is.

Software

  • OS: Linux Mint 19 Tara
  • Golang: go version go1.10.1 linux/amd64
  • Version: latest studio-b12/gowebdav (loaded via go get)

To Reproduce I've started to build a small tool which would allow file syncing via WebDAV. It's not fully ready but at least it allows to scan a folder and create a list of files which should be uploaded. The flow of the program now is the following:

  1. Create list of files for uploading
  2. Create N goroutines, each has its own gowebdav.Client
  3. Filenames for uploading are sent to these goroutines via channels (fan-out), then each file is uploaded independently via Client.WriteStream. The file is closed via defer.
  4. I used WebDAV API for Yandex.Disk, but I believe the same problem could be reproduced with any server with configured Basic authentication.

What I observe is enormous RAM usage by my tool (e.g. even if a folder contains files for 1G, it can easily take up to several gigabytes of RAM). I've tried to narrow the scope of the problem and here's what I've got:

  • GODEBUG=gctrace=1 shows that upon each gc cycle it could collect 0 space, and the space constantly grows
  • pprof.WriteHeapProfile at the end gives me a profile which shows that there's some inuse_space related to TeeReader from requests, though acquired by persistConn inside of http.Transport. It's not several gigabytes, but up to several hundred megabytes.
  • I tried to use Client.SetTransport() providing my http.Transport with reduced IdleConnTimeout, MaxIdleConns, DisableKeepAlive and other similar things like explicit call . Nothing changed.
  • Finally, I reverted gowebdav to revision prior to the fix for issue #14 - no additional memory has been taken (memory footprint was just about several megabytes according to gctrace)

I believe there's something wrong with current usage of TeeReader introduced in #b45378c. It looks like persistConn inside of a Transport somehow doesn't allow to collect the unused objects (i.e even if a new file gets uploaded, the data for the old one doesn't get cleared). Unfortunately I'm pretty new to Go and couldn't resolve the issue by myself.

Expected Well, no memleak

Additional context Just in case you're curious how do I use gowebdav exactly, here is the tool I'm using.

Please let me know if you need any additional info.

ilyalavrinov avatar Sep 21 '18 07:09 ilyalavrinov

thanks for reporting. i'll take a look.

chripo avatar Sep 24 '18 07:09 chripo

i took a look inside your code and ours. everything seems to be fine, or i can't spot the error atm.

we don't use a global state or caching or something else, therefore GO should collect all memory. and it does it, but in it's on own way.

if some memory gets collected it's still available to the the application, so the application can use it again without acquire more from the system, which is much cheaper.

you can force it by calling FreeOsMemory

chripo avatar Sep 27 '18 09:09 chripo

Thanks! I'll take a look at "runtime/debug" to dig deeper into this issue and probably collect some more data. However it still seems a bit strange that:

  • in revision b45378c of gowebdav I do not observe so huge memory usage (with the same version of my app)
  • the GC does not give back the memory to the system even if I tried to clear idle connections and other background entities.

Anyway, I'll play a bit with tools from "runtime/debug", double-check that the issue is present in the latest version of gowebdav and maybe narrow down the issue with some simple example + test config for local nginx/apache. I'll come back in a few days if I get any results.

ilyalavrinov avatar Sep 28 '18 07:09 ilyalavrinov

Yes the memory consumption should be doubled since #14 because of teeing the body, which sucks badly. We still need to refactor this case and do some strategy for servers which prefere the auth in this way, see #15.

defere bb.Reset() may speed up GC.

chripo avatar Sep 28 '18 07:09 chripo

@admirallarimda did you took a deeper look?

chripo avatar Oct 21 '18 08:10 chripo

Unfortunately, not yet. I'll definitely try it out some day, but currently I cannot predict when I would have time for this. If you think everything looks fine, feel free to close the issue. Once I dig deeper into the issue and prove that there's a real problem, I'll reopen it or send a PR if I manage to fix it somehow.

ilyalavrinov avatar Oct 22 '18 07:10 ilyalavrinov

https://github.com/studio-b12/gowebdav/blob/425530b55ef4712a38177d133693c7601824c8dd/requests.go#L16

TeeReader requires as much memory as the requested file size. This code needs to be improved to avoid out of memory when uploading large-size files.

ochanism avatar Oct 22 '18 10:10 ochanism

I also found the memory leak reported in this issue. As a solution, after this line: https://github.com/studio-b12/gowebdav/blob/321978fa735d5f8c3eaeca2c705a1ada7d34abbe/requests.go#L45 I've read the body of the response as it wasn't collected by GC. This would resolve the memory leak and also a not reported goroutine leak.

_, err = ioutil.ReadAll(rs.Body)
if err != nil {
	return nil, err
}

fghezzo avatar Jan 21 '20 13:01 fghezzo