Batch mode for better performance on large codebases
When working on a large codebase that uses mockgen heavily (in my case, ~90 packages call mockgen), a lot of processing time is wasted on repeatedly parsing the same package code. To improve this, I propose implementing a batch mode that takes multiple packages and generates mocks for all of them in a single run. This lets mockgen parse the code only once, significantly improving performance.
The easiest way to achieve this in practice would be to accept some configuration file, e.g. YAML. I implemented it already to test the performance impact, see:
- https://github.com/uber-go/mock/pull/244
I can change the config file format etc. as needed.
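To illustrate why a single batch run can help, here is a minimal sketch of the "parse once, generate many" idea using only the standard library. It is not the code from the PR: parseCache, load, interfaceNames, and the sample source are hypothetical names made up for this example, and go/parser stands in for whatever loading mockgen actually does.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// parseCache memoizes parsed files so that several mock targets referring
// to the same source pay the parsing cost only once.
// Hypothetical helper for illustration; not mockgen's implementation.
type parseCache struct {
	fset   *token.FileSet
	files  map[string]*ast.File
	parses int // number of real parser invocations
}

func newParseCache() *parseCache {
	return &parseCache{fset: token.NewFileSet(), files: map[string]*ast.File{}}
}

// load returns the cached AST for name, parsing src on the first request only.
func (c *parseCache) load(name, src string) (*ast.File, error) {
	if f, ok := c.files[name]; ok {
		return f, nil
	}
	f, err := parser.ParseFile(c.fset, name, src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	c.parses++
	c.files[name] = f
	return f, nil
}

// interfaceNames lists the interface types declared in f.
func interfaceNames(f *ast.File) []string {
	var names []string
	for _, decl := range f.Decls {
		gd, ok := decl.(*ast.GenDecl)
		if !ok || gd.Tok != token.TYPE {
			continue
		}
		for _, spec := range gd.Specs {
			ts := spec.(*ast.TypeSpec)
			if _, ok := ts.Type.(*ast.InterfaceType); ok {
				names = append(names, ts.Name.Name)
			}
		}
	}
	return names
}

// sharedSrc is a made-up package that several targets mock.
const sharedSrc = `package store

type Reader interface{ Read(key string) (string, error) }
type Writer interface{ Write(key, value string) error }
`

func main() {
	cache := newParseCache()
	// Three batch targets that all depend on the same source file.
	for _, target := range []string{"mock_reader.go", "mock_writer.go", "mock_all.go"} {
		f, err := cache.load("store.go", sharedSrc)
		if err != nil {
			panic(err)
		}
		fmt.Println(target, interfaceNames(f))
	}
	fmt.Println("parser invocations:", cache.parses)
}
```

With one process per target, each invocation would start with an empty cache and parse the shared source again. In a single batch process, the second and third targets reuse the first parse.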
On my codebase, generating mocks by making individual mockgen calls for every package:
$ /usr/bin/time ./mockgen.sh # each line of the script is a mockgen invocation
74.72 real 346.09 user 147.04 sys
and using batch mode:
$ /usr/bin/time mockgen -batch mockgen.yaml
7.89 real 26.55 user 17.26 sys
so it's roughly a 10x reduction in wall-clock time (74.72s down to 7.89s, about 9.5x).
Note that I didn't parallelize the calls to generateTarget. I experimented with a worker pool, but saw barely any improvement in performance, since the majority of the time is spent parsing package code. I can add the worker pool implementation back to the PR if desired.
Here it is for reference:
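	// Start one worker per available CPU; each worker pulls targets off the channel.
	jobs := make(chan *genTarget)
	var wg sync.WaitGroup
	for range runtime.GOMAXPROCS(0) {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				job.err = generateTarget(job)
			}
		}()
	}
	// Feed all targets to the workers, then wait for them to drain the channel.
	for i := range targets {
		jobs <- &targets[i]
	}
	close(jobs)
	wg.Wait()
	// Report every failure before exiting, instead of stopping at the first one.
	failed := false
	for i := range targets {
		if err := targets[i].err; err != nil {
			failed = true
			msg := err.Error()
			log.Print(strings.ToUpper(msg[:1]) + msg[1:])
		}
	}
	if failed {
		os.Exit(1)
	}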
(In this version generateTarget returns an error instead of calling log.Fatalf, so that we don't kill other goroutines and end up with semi-generated garbage files.)
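For anyone who wants to try the pattern in isolation, here is a self-contained sketch of the same worker pool with errors collected rather than fatal. genTarget and generateTarget are stubs invented for this example (the real ones live in the PR), runAll is a hypothetical wrapper, and errors.Join (Go 1.20+) is my substitution for the logging loop above.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

// genTarget is a stand-in for the type of the same name in the snippet above.
type genTarget struct {
	name string
	err  error
}

// generateTarget is a stub: it fails for targets without a name.
// The real function would write a mock file for the target.
func generateTarget(t *genTarget) error {
	if t.name == "" {
		return errors.New("target has no name")
	}
	return nil
}

// runAll processes every target on a fixed-size worker pool and returns
// all failures joined into a single error (nil if everything succeeded).
func runAll(targets []genTarget, workers int) error {
	if workers < 1 {
		workers = 1 // at least one worker, otherwise the sends below would block forever
	}
	jobs := make(chan *genTarget)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				// Each target is handled by exactly one worker, so writing
				// job.err here does not race with other goroutines.
				job.err = generateTarget(job)
			}
		}()
	}
	for i := range targets {
		jobs <- &targets[i]
	}
	close(jobs)
	wg.Wait()

	var errs []error
	for i := range targets {
		if err := targets[i].err; err != nil {
			errs = append(errs, fmt.Errorf("target %d: %w", i, err))
		}
	}
	return errors.Join(errs...)
}

func main() {
	targets := []genTarget{{name: "a"}, {name: ""}, {name: "c"}}
	if err := runAll(targets, runtime.GOMAXPROCS(0)); err != nil {
		fmt.Println(err)
	}
}
```

Because a failing target only records its error, the other workers keep running and finish writing their files. The caller decides afterwards whether to exit with a non-zero status.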
I want to offer an alternative solution. What if mockgen provided a mockery-compatible template that could be used directly with mockery? I understand this is kind of a radical shift for this project, but I have designed mockery to support projects such as mockgen. The lesson you learned with regard to loading all packages at once will have to be learned by every project that parses syntax. Why not use the mockery code generation framework to take advantage of these performance improvements (and more that will come in the future) instead of reimplementing them yourself? You'll still get mockgen mocks, but the syntax parsing will shift to mockery.
PS I assume you read my blog post about mockery v3. If not, the timing of this GH issue is extremely coincidental 😄
Thanks for the suggestion. I haven't seen your blog post, but I'll check it out. If mockery can generate mockgen-compatible mocks, I'll give it a shot.
It should be able to, but you'd have to port all of the g.p("string") logic here into a proper Go text/template. I had to do this for the mockery project and it took some time but it worked in the end. The same should be possible here.
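As a rough picture of what such a port involves, the sketch below renders the same toy mock twice: once with imperative print-style calls (in the spirit of g.p) and once with a text/template. Everything here is an assumption for illustration: the iface and method types are not mockery's template data model, and the generated bodies are placeholders rather than real gomock output.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// method and iface are hypothetical template inputs, invented for this example.
type method struct {
	Name    string
	Params  string // e.g. "key string"
	Results string // e.g. "(string, error)"
}

type iface struct {
	Name    string
	Methods []method
}

// renderImperative builds the output with a sequence of print calls.
func renderImperative(i iface) string {
	var b strings.Builder
	fmt.Fprintf(&b, "type Mock%s struct{}\n", i.Name)
	for _, m := range i.Methods {
		fmt.Fprintf(&b, "\nfunc (m *Mock%s) %s(%s) %s { panic(\"not implemented\") }\n",
			i.Name, m.Name, m.Params, m.Results)
	}
	return b.String()
}

// mockTmpl expresses the same output declaratively. $.Name reaches the
// interface name from inside the range over methods.
const mockTmpl = `type Mock{{.Name}} struct{}
{{range .Methods}}
func (m *Mock{{$.Name}}) {{.Name}}({{.Params}}) {{.Results}} { panic("not implemented") }
{{end}}`

// renderTemplate parses and executes mockTmpl against i.
func renderTemplate(i iface) (string, error) {
	t, err := template.New("mock").Parse(mockTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, i); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	in := iface{
		Name: "Store",
		Methods: []method{
			{Name: "Get", Params: "key string", Results: "(string, error)"},
			{Name: "Delete", Params: "key string", Results: "error"},
		},
	}
	out, err := renderTemplate(in)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	fmt.Println("matches imperative version:", out == renderImperative(in))
}
```

Checking that both renderers produce byte-identical output, as main does here, is one way to port print-style generation logic to a template incrementally.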
@LandonTClipp It took a bit of work, but I got working templates for gomock :). I thought it'd be great if they could be available by default in mockery, plus I also needed a few small features, so I created a PR to mockery: https://github.com/vektra/mockery/pull/1030
@sywhang I don't know what gomock maintainers think about having templates here or directly in mockery. In any case, I think using these templates is a better approach than what I did in https://github.com/uber-go/mock/pull/244 (and even faster, if you check the description of my PR to mockery). I'll leave that PR open for now, though, in case maintainers are still interested in having a "native" batch mode, or perhaps would like to take advantage of some code structure improvements I made in that PR.
That's awesome @kszafran! Just another testament to how powerful the mockery framework is.
We will need some input from the gomock maintainers on how they want to handle this proposal. A performance improvement of 30x, as noted in the mockery PR, is substantial enough to be worth serious consideration, in my opinion.
As the core maintainer of mockery, I'm of course happy to include gomock either directly by hosting it on the mockery repository or remotely through the remote template feature. Both have pros and cons. Which direction we go (including possibly not moving forward with mockery integration at all) depends almost entirely on the perspective of gomock maintainers.
What I don't want to happen is for the mockery project to host gomock templates without a wholesale archival of mockgen. Anything less would result in two independently maintained implementations, which is far from ideal. Hosting the templates in uber-go/mock would allow Uber to retain sole maintainership of the generated code.