
cmd/compile: generics add significant build time and build size overhead

Open xaurx opened this issue 1 year ago • 13 comments

Go version

go 1.20.7 linux/amd64 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/user/Library/Caches/go-build'
GOENV='/Users/user/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.21.3/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.21.3/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.3'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/03/7b5nrjcj4b31k3mvrqswnxxm0000gn/T/go-build1457705092=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

We have a large DB-like project written in Go. It includes a custom B-tree implementation, with around 10 B-tree instances working with different key/value structs via interface{}. To save some memory and improve performance, we recently switched to generics, so the B-tree is now a struct parameterized by its key and value types.
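For readers without access to the code, the kind of change described can be sketched roughly as follows. This is a hypothetical, heavily simplified illustration and not the reporter's implementation: a sorted slice stands in for the real B-tree, and the names `AnyTree`, `Tree`, `Put`, and `Get` are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// AnyTree is the pre-generics shape (hypothetical): keys and values are
// boxed in interface{} and ordered by a caller-supplied comparison.
type AnyTree struct {
	less func(a, b interface{}) bool
	keys []interface{}
	vals []interface{}
}

// Put inserts or overwrites k, keeping keys sorted.
func (t *AnyTree) Put(k, v interface{}) {
	i := sort.Search(len(t.keys), func(i int) bool { return !t.less(t.keys[i], k) })
	if i < len(t.keys) && !t.less(k, t.keys[i]) {
		t.vals[i] = v // key already present
		return
	}
	t.keys = append(t.keys, nil)
	copy(t.keys[i+1:], t.keys[i:])
	t.keys[i] = k
	t.vals = append(t.vals, nil)
	copy(t.vals[i+1:], t.vals[i:])
	t.vals[i] = v
}

// Tree is the generic shape (hypothetical): the same structure,
// parameterized by key and value types, so nothing is boxed.
type Tree[K, V any] struct {
	less func(a, b K) bool
	keys []K
	vals []V
}

// Put inserts or overwrites k, keeping keys sorted.
func (t *Tree[K, V]) Put(k K, v V) {
	i := sort.Search(len(t.keys), func(i int) bool { return !t.less(t.keys[i], k) })
	if i < len(t.keys) && !t.less(k, t.keys[i]) {
		t.vals[i] = v // key already present
		return
	}
	var zk K
	var zv V
	t.keys = append(t.keys, zk)
	copy(t.keys[i+1:], t.keys[i:])
	t.keys[i] = k
	t.vals = append(t.vals, zv)
	copy(t.vals[i+1:], t.vals[i:])
	t.vals[i] = v
}

// Get returns the value stored under k, if any.
func (t *Tree[K, V]) Get(k K) (V, bool) {
	i := sort.Search(len(t.keys), func(i int) bool { return !t.less(t.keys[i], k) })
	if i < len(t.keys) && !t.less(k, t.keys[i]) {
		return t.vals[i], true
	}
	var zv V
	return zv, false
}

func main() {
	// Two distinct instantiations of the same generic type.
	byID := &Tree[int, string]{less: func(a, b int) bool { return a < b }}
	byName := &Tree[string, int]{less: func(a, b string) bool { return a < b }}
	byID.Put(2, "b")
	byID.Put(1, "a")
	byName.Put("x", 10)
	fmt.Println(byID.Get(1))
	fmt.Println(byName.Get("x"))
}
```

With the interface{} version there is one `AnyTree` for the compiler to handle regardless of how many key/value structs are stored in it. With the generic version, each distinct key/value combination is a separate instantiation.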

What did you see happen?

We found the following important issues after switching to generics:

  1. Cold CI build time has increased from 6 to 16 minutes
  2. CLI tool executable size has increased from 48 to 58 MB.
  3. go build cache has increased from 760MB to 4.4GB (after clean build)

All of these seem rather bad, and I hope someone will be interested in looking into it.

What did you expect to see?

Much smaller changes in build time and go build cache size. Executable size is less important, though the increase is very surprising as well.

xaurx avatar Feb 08 '24 15:02 xaurx

Can you show us your code? Without that, it will be hard for us to look into it.

randall77 avatar Feb 08 '24 15:02 randall77

It's possible to show code privately under NDA. Works for you?

xaurx avatar Feb 08 '24 15:02 xaurx

No, I'm not signing an NDA.

randall77 avatar Feb 08 '24 15:02 randall77

Let me check. Maybe I can collect/provide some info which would be helpful?

xaurx avatar Feb 08 '24 16:02 xaurx

It would be interesting to see the output of go tool nm on the binaries before and after.

randall77 avatar Feb 08 '24 16:02 randall77

OK. Can we send them to the email address in your GitHub profile?

xaurx avatar Feb 08 '24 16:02 xaurx

Sure.

randall77 avatar Feb 08 '24 16:02 randall77

Likely closely related to #51957 .

randall77 avatar Feb 09 '24 00:02 randall77

@mdempsky This issue would at least be partially fixed by moving away from self-contained object files. Thin object files might help with the cpu, and almost certainly would help with the build cache size. (They have a large package that then gets imported up the dependency tree several steps.)

randall77 avatar Feb 09 '24 01:02 randall77

Like @randall77 mentions, we're aware the compiler's intermediate files are larger than ideal, and unfortunately generics further aggravate that weakness. We have ideas about how to address that and intend to implement them, but they're not trivial and will take time to implement and deploy.

You also mention the size changes were due to switching to generics, which means your application code must have changed. That makes interpreting your benchmarks harder. For example, you say your CI now takes almost 3x as long, and suggest that's because generics are slower. But from your description, there's also the possibility that changing the application has simply created 3x more work for the compiler to do.

I expect it's some of both. Certainly the compiler could be faster, but other projects are using generics successfully. For now, I think your best options are:

  1. Experiment with refactoring how you're using generics to reduce the overheads to acceptable levels.
  2. Find examples of generics code where overhead grows superlinearly with source complexity. We try to use O(N) algorithms throughout the compiler, but sometimes we make mistakes and an O(N^2) algorithm sneaks in.

mdempsky avatar Feb 14 '24 22:02 mdempsky

@xaurx could you clarify exactly what "Cold CI build time" means? Is that the time it takes merely to run go build on your project on a fresh worker?

cespare avatar Feb 14 '24 22:02 cespare

@cespare yes, it's a fresh worker using the same code cache, i.e. not downloading it again. To reproduce locally I used go clean -cache between the runs:

w/o generics:

$ go clean -cache
$ time make
  129.74s user 23.57s system 400% cpu 38.242 total
$ du -sh go-build
760M    /Users/user/Library/Caches/go-build
$ go tool nm tool.before| wc -l
73861

w/ generics:

$ go clean -cache
$ time make
  817.00s user 89.57s system 558% cpu 2:42.33 total
$ du -sh go-build
4.4G    /Users/user/Library/Caches/go-build
$ go tool nm tool.after | wc -l
80238

As you can see, after adding generics the clean build is 4.2x slower (wall-clock time) and the go-build cache takes 5.7x more disk space.
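The ratios can be rechecked directly from the figures in the two runs above. The snippet below is just that arithmetic; the only assumption is that `du -h` reports 1024-based units, so 4.4G is treated as 4.4 × 1024 MiB.

```go
package main

import "fmt"

// ratio reports how many times larger after is than before.
func ratio(after, before float64) float64 { return after / before }

func main() {
	// Figures copied from the two runs above.
	wallBefore, wallAfter := 38.242, 2*60+42.33 // seconds, "total" column
	userBefore, userAfter := 129.74, 817.00     // seconds, "user" column
	cacheBefore, cacheAfter := 760.0, 4.4*1024  // MiB, assuming 1024-based du -h
	symBefore, symAfter := 73861.0, 80238.0     // lines of go tool nm output

	fmt.Printf("wall-clock: %.2fx\n", ratio(wallAfter, wallBefore))
	fmt.Printf("user CPU:   %.2fx\n", ratio(userAfter, userBefore))
	fmt.Printf("cache size: %.2fx\n", ratio(cacheAfter, cacheBefore))
	fmt.Printf("symbols:    %.2fx\n", ratio(symAfter, symBefore))
}
```

Wall-clock time comes out at roughly 4.2x and user CPU time at roughly 6.3x, while the symbol count grows by only about 9%. The cache ratio lands between about 5.8x and 5.9x depending on the unit convention, close to the figure quoted.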

xaurx avatar Feb 15 '24 09:02 xaurx

@mdempsky see the exact commands I used to reproduce this in the message above, i.e. it's not about some other changes in the code. Any fresh build after go clean -cache takes 4.2x more time.

Please note that any package which imports the one using generics now takes significantly more time to build. Since we import it in many places, it adds up and the overall build takes much longer. Additionally, I tried compiling with GC_FLAGS="-S", and the generics version generates 15.5x more output and many more symbols. So my guess is that most of it gets optimized out by the linker, which is why nm doesn't reveal a huge difference, but the compiler still spends a lot of time on it...

xaurx avatar Feb 15 '24 09:02 xaurx