Very high memory usage (OOM) for complex builds that leverage caching
We've been running into extremely high memory usage for complex builds that make heavy use of cached results.
Specifically, these builds are invoked using Earthly. Here's one of the builds in question: https://github.com/earthly/earthly/blob/main/tests/Earthfile#L17
And here is an example of memory usage during one of the problematic builds. As the graph shows, BuildKit is consuming 35GB+ of memory.
After some digging and pprof exploration, I identified a few areas in the code that could be improved with regard to memory usage.
Here are some sites in the code that are consuming a lot of memory:
(pprof) top10
Showing nodes accounting for 9.60GB, 90.17% of 10.64GB total
Dropped 387 nodes (cum <= 0.05GB)
Showing top 10 nodes out of 81
      flat  flat%   sum%        cum   cum%
    3.72GB 34.97% 34.97%     7.14GB 67.07%  github.com/moby/buildkit/solver.(*cacheManager).newKeyWithID (inline)
    3.42GB 32.10% 67.07%     3.42GB 32.10%  github.com/moby/buildkit/solver.newKey (inline)
    1.13GB 10.59% 77.66%     1.13GB 10.64%  github.com/moby/buildkit/solver.(*edge).probeCache
(output truncated)
Here are the locations of the first and second items:
(pprof) list newKeyWithID
Total: 10.64GB
ROUTINE ======================== github.com/moby/buildkit/solver.(*cacheManager).newKeyWithID in solver/cachemanager.go
3.72GB 7.14GB (flat, cum) 67.07% of Total
         .          .    344:func (c *cacheManager) newKeyWithID(id string, dgst digest.Digest, output Index) *CacheKey {
         .     3.42GB    345:    k := newKey()
         .          .    346:    k.digest = dgst
         .          .    347:    k.output = output
         .          .    348:    k.ID = id
    3.72GB     3.72GB    349:    k.ids[c] = id
         .          .    350:    return k
         .          .    351:}
         .          .    352:
And here is the third:
(pprof) list probeCache
Total: 10.64GB
ROUTINE ======================== github.com/moby/buildkit/solver.(*edge).probeCache in solver/edge.go
1.13GB 1.13GB (flat, cum) 10.64% of Total
         .          .    198:func (e *edge) probeCache(d *dep, depKeys []CacheKeyWithSelector) bool {
         .          .    199:    if len(depKeys) == 0 {
         .          .    200:        return false
         .          .    201:    }
         .          .    202:    if e.op.IgnoreCache() {
         .          .    203:        return false
         .          .    204:    }
         .     5.49MB    205:    keys, err := e.op.Cache().Query(depKeys, d.index, e.cacheMap.Digest, e.edge.Index)
         .          .    206:    if err != nil {
         .          .    207:        e.err = errors.Wrap(err, "error on cache query")
         .          .    208:    }
         .          .    209:    found := false
         .          .    210:    for _, k := range keys {
         .          .    211:        k.vtx = e.edge.Vertex.Digest()
         .          .    212:        if _, ok := d.keyMap[k.ID]; !ok {
    1.13GB     1.13GB    213:            d.keyMap[k.ID] = k
         .          .    214:            found = true
         .          .    215:        }
         .          .    216:    }
         .          .    217:    return found
I noticed that newKeyWithID creates a new struct holding digest and ID strings. This method is called hundreds of thousands (in some cases, millions) of times during our large builds, and each digest/ID pair amounts to about 100 bytes. As an experiment, I modified the code so that keys are cached and reused rather than recreated on every invocation. Here is the draft PR: https://github.com/moby/buildkit/pull/4447
With the above patch, memory consumption is reduced to the point that the newKeyWithID and newKey methods no longer appear in the top 10. The change did not cause any unexpected failures, and our tests passed without OOM issues.
The PR also contains a rough implementation of key pruning, as these values will have to be cleaned up periodically. Admittedly, there is a trade-off between memory use and code complexity.
probeCache is still using a lot of memory to store values in keyMap, but perhaps that can be looked at separately.
Here is the post-patch pprof output:
Showing top 10 nodes out of 131
       flat  flat%   sum%        cum  cum%
  1115.54MB 39.33% 39.33%  1115.54MB 39.33%  github.com/moby/buildkit/solver.(*edge).probeCache
   136.56MB  4.82% 44.15%   137.56MB  4.85%  github.com/moby/buildkit/solver/pb.(*OpMetadata).Unmarshal
   110.86MB  3.91% 48.06%   265.97MB  9.38%  github.com/moby/buildkit/solver/pb.(*Definition).Unmarshal
(output truncated)
Before the patch (left) vs. after the patch (right):
My question, ultimately, is: does the code need to create new structs for each newKeyWithID invocation or can they be reused?
Another suggestion would be to intern digests: store pointers to shared strings rather than separate string copies. This could further reduce memory usage.