Default compile takes 62 minutes, no knobs to exclude protobuf-lite or support for unneeded languages from protoc?
Hi!
It has come to my attention that a default compile (meaning plain cmake -S . -B build) of Protobuf (commit 24830cf07e559e912fa2622d3487a679ef0df9a9 of 2025-02-28 and GCC 13) takes 62 minutes on a Lenovo ThinkPad X220, and that much of that time could be saved if only the build system allowed users to (1) compile the protobuf compiler "protoc" with fewer languages included (e.g. when all I need is C++ and none of C#/Java/Kotlin/ObjectiveC/PHP/Python/Ruby/Rust) and (2) exclude protobuf-lite. 62 minutes!
This was in context of libprotobuf-mutator where switch -DLIB_PROTO_MUTATOR_DOWNLOAD_PROTOBUF=ON would download and compile an almost complete Protobuf originally e.g. because Protobuf in many mainline distros is so far behind Protobuf upstream, including Ubuntu (20.04/22.04/)24.04. I should note that currently 14 projects in OSS-Fuzz are doing that because they had no choice: compile almost all of Protobuf without genuine need to.
For a case where (nothing but) libprotobuf and a protoc compiler with C++ support are needed (as with libprotobuf-mutator), the only two build switches reducing compile time that I found were -Dprotobuf_BUILD_TESTS:BOOL=OFF -Dprotobuf_BUILD_LIBUPB:BOOL=OFF, but they break the build (see #20538) and do not cut off much. I was demoing a quick hack to reduce compile time in https://github.com/google/libprotobuf-mutator/pull/267/files earlier, but it was not the right place, not in upstreamable form, not a true fix.
Is reducing the build to fewer unneeded parts at the source something that you are considering?
Best, Sebastian
CC @vitalybuka
@protocolbuffers any thoughts?
Closing for no reply…
@hartwork Can you provide more info about this? Over an hour seems very long for this build even on the hardware you're talking about.
Note that the cmake build doesn't build other languages' runtimes, it only builds the code generators for each. The language generators really are pretty small and so is libupb, so likely there shouldn't be too much benefit from trimming them from your build.
In terms of 'lite', the Full C++Proto runtime depends on the Lite C++Proto runtime, and protoc uses the Full C++ runtime, so work compiling LiteC++ is not duplicative. It may be there's duplicative linking or something there?
If you have any info on where your build times are being spent we may be able to help advise.
You might also try adding --parallel 8 or using ccache/sccache for incremental builds. Both of those can speed up cmake build dramatically.
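A minimal sketch of what that suggestion could look like on the command line. The compiler-launcher variables and `--parallel` are standard CMake; the function name and the `CMAKE` override (handy for a dry run with `CMAKE=echo`) are made up for illustration, and ccache is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch: configure with ccache as the compiler launcher, then build in parallel.
# Assumes CMake >= 3.13 (for -S/-B and --parallel) and ccache on PATH.
# Hypothetical helper; set CMAKE=echo to only print the two command lines.
configure_and_build() {
    local src_dir="${1:-.}" build_dir="${2:-build}" jobs="${3:-8}"
    "${CMAKE:-cmake}" -S "${src_dir}" -B "${build_dir}" \
        -DCMAKE_C_COMPILER_LAUNCHER=ccache \
        -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
    && "${CMAKE:-cmake}" --build "${build_dir}" --parallel "${jobs}"
}
```

Running `configure_and_build . build 8` from a Protobuf checkout would configure and build with eight jobs. ccache only pays off on the second and later builds, since the first one has to populate the cache.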
> @hartwork Can you provide more info about this? Over an hour seems very long for this build even on the hardware you're talking about.
@esrauchg I am using the same hardware just now, but when trying to get you a build duration time for today, the build fails already at CMake stage, new issue #24205, both on Debian sid and Gentoo.
> Note that the cmake build doesn't build other languages' runtimes, it only builds the code generators for each. The language generators really are pretty small and so is libupb, so likely there shouldn't be too much benefit from trimming them from your build.
With my patching at https://github.com/google/libprotobuf-mutator/pull/267 I was literally cutting build time in half back then.
> In terms of 'lite', the Full C++Proto runtime depends on the Lite C++Proto runtime, and protoc uses the Full C++ runtime, so work compiling LiteC++ is not duplicative. It may be there's duplicative linking or something there?
> If you have any info on where your build times are being spent we may be able to help advise.
The 62 minutes are for a full build of the default build configuration. The patches there show what I was patching away. I'm not sure how you would call it.
> You might also try adding --parallel 8 or using ccache/sccache for incremental builds. Both of those can speed up cmake build dramatically.
@mkruskal-google maybe, but that's only working around the core issue: that I'm building more than I need. I would give you a time with -j$(nproc) but then issue #24205 currently keeps me from successfully building anything for main or v33.
Update: I learned from #22209 about -Dprotobuf_BUILD_TESTS=OFF now to work around the build error and would like to offer these four data points from the reference hardware for the same original commit:
- 24830cf07e559e912fa2622d3487a679ef0df9a9 + -Dprotobuf_BUILD_TESTS=OFF + Debian sid + GCC 13 + Abseil 20240722.0 = 21m49.280s
- 24830cf07e559e912fa2622d3487a679ef0df9a9 + -Dprotobuf_BUILD_TESTS=OFF + Debian sid + GCC 15 + Abseil 20240722.0 = 23m40.501s
- 24830cf07e559e912fa2622d3487a679ef0df9a9 + -Dprotobuf_BUILD_TESTS=OFF + Gentoo + GCC 14 + Abseil 20250512.1 = 33m28.106s
- 24830cf07e559e912fa2622d3487a679ef0df9a9 + -Dprotobuf_BUILD_TESTS=OFF + Gentoo + GCC 15 + Abseil 20250512.1 = 30m41.150s
The version of Abseil is likely different to back in March, guaranteed different in the Gentoo case, the linker could be a different one also. I'm not sure what else could be different, how much build time went to the tests (if I built them back then) or why the build would be twice as fast as the original 62 minutes now. But then 33 minutes is still quite a bit.
https://github.com/protocolbuffers/protobuf/issues/24205 is because we build tests by default and whatever version of Abseil you have installed didn't install test utils. I think disabling tests by default could be a reasonable change as they are extremely expensive to build (as you've pointed out with data). But it would be fairly dangerous on our end as we'd have to be extremely careful to explicitly enable them in all our test builds to continue testing with cmake in our CI
> #24205 is because we build tests by default and whatever version of Abseil you have installed didn't install test utils.
@mkruskal-google that's all fair, but a dedicated error message and detecting that case would be nice. Currently the user ends up in low level error soup where nothing but Google will be of much help.
> I think disabling tests by default could be a reasonable change as they are extremely expensive to build (as you've pointed out with data). But it would be fairly dangerous on our end as we'd have to be extremely careful to explicitly enable them in all our test builds to continue testing with cmake in our CI.
I believe "make test" should fail right away if enable_testing has not been called. That should make the change safe: if someone forgot to enable the tests, make test would fail and thus alert about the need for adjustment in that CI. To demo that very case:
# cd "$(mktemp -d)"
# echo $'cmake_minimum_required(VERSION 3.10)\nproject(demo123)' | tee CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(demo123)
# cmake . && make test ; echo $?
-- The C compiler identification is GNU 14.3.1
-- [..]
-- Build files have been written to: /tmp/tmp.IjQWJsNGip
make: *** No rule to make target 'test'. Stop.
2
> @hartwork Can you provide more info about this?
@esrauchg I now compiled with (simplified) /usr/bin/time gcc (GNU Time) instead of plain gcc to give you runtimes in seconds for all the invocations of gcc (and g++) made. Here's how I did that:
cd "$(mktemp -d)"
git init
git fetch --depth 1 https://github.com/protocolbuffers/protobuf 24830cf07e559e912fa2622d3487a679ef0df9a9
git checkout FETCH_HEAD
echo $'#!/usr/bin/env bash\nexec /usr/bin/time -f "%e %C" -a -o "$(dirname "$0")"/time.log g++-14 "$@"' > time-g++.sh
echo $'#!/usr/bin/env bash\nexec /usr/bin/time -f "%e %C" -a -o "$(dirname "$0")"/time.log gcc-14 "$@"' > time-gcc.sh
chmod a+x time-gcc.sh time-g++.sh
time sh -c 'cmake -DCMAKE_C_COMPILER="${PWD}/time-gcc.sh" -DCMAKE_CXX_COMPILER="${PWD}/time-g++.sh" -Dprotobuf_BUILD_TESTS=OFF -S . -B build && make -C build VERBOSE=1' ; echo $?
The resulting time.log is this:
Download: time.log
The top 10 longest compile time files (for commit 24830cf07e559e912fa2622d3487a679ef0df9a9) are:
# sort -g -r < time.log | head | sed "s|${PWD}|.|g" | awk '{print $1 " " $NF}'
20.48 ./src/google/protobuf/descriptor.cc
19.88 ./src/google/protobuf/descriptor.cc
19.87 ./src/google/protobuf/descriptor.cc
16.80 ./src/google/protobuf/compiler/cpp/message.cc
14.20 ./src/google/protobuf/compiler/cpp/file.cc
13.23 ./src/google/protobuf/text_format.cc
11.84 ./src/google/protobuf/compiler/cpp/helpers.cc
11.79 ./src/google/protobuf/compiler/command_line_interface.cc
10.23 ./src/google/protobuf/util/message_differencer.cc
9.98 ./src/google/protobuf/generated_message_tctable_lite.cc
PS: It seems like file src/google/protobuf/descriptor.cc was indeed compiled 3 times.
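A small sketch of how repeated compilations could be spotted in time.log without reading the top-10 list by eye. It assumes the same log layout as above (seconds first, source file as the last argument); the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# List source files that were compiled more than once according to a GNU Time log.
# Assumes each line is "<seconds> <compiler command line>" with the source file
# as the last field, the same assumption the awk '{print $NF}' call above makes.
# Output columns: <number of compilations> <summed seconds> <file>
list_repeated_sources() {
    local log="${1:-time.log}"
    awk '$NF ~ /\.(c|cc|cpp)$/ { count[$NF]++; total[$NF] += $1 }
         END {
             for (f in count)
                 if (count[f] > 1)
                     printf "%d %.2f %s\n", count[f], total[f], f
         }' "${log}" | sort -k2,2nr
}
```

Calling `list_repeated_sources time.log` prints one line per file that shows up more than once, most expensive first. Link steps are skipped because their last argument is not a source file.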
> @mkruskal-google that's all fair, but a dedicated error message and detecting that case would be nice. Currently the user ends up in low level error soup where nothing but Google will be of much help.
> I believe "make test" should fail right away if enable_testing has not been called. That should make the change safe: if someone forgot to enable the tests, make test would fail and thus alert about the need for adjustment in that CI. To demo that very case:
I think this is a valid point and worth a separate issue. I wasn't aware that make test did the sensible thing, so that does make me feel a lot better about it. I think we'd be open to a PR for this as well, as we usually have trouble prioritizing CMake improvements.
> PS: It seems like file src/google/protobuf/descriptor.cc was indeed compiled 3 times.
Those results are extremely surprising. It looks like protoc-gen-upbdefs and protoc-gen-upb are rebuilding descriptor.cc from scratch. I would hope cmake reuses the results of previous builds and only actually runs the compiler over it once though. My expectation is that most of the duplicated work would come from linking not from compiling. Those plugins also both have dependencies on libprotobuf, so this seems completely incorrect and like it could lead to ODR violations...
I'm also not able to reproduce that... When I build with default flags from main I only see a single descriptor.cc.o produced for libprotobuf.
It looks like whatever cmake issue you've found has been fixed since https://github.com/protocolbuffers/protobuf/commit/24830cf07e559e912fa2622d3487a679ef0df9a9. I vaguely remember fixing this a few months ago by adding a dependency to libprotobuf instead of rebuilding everything in these generators.
> I think this is a valid point and worth a separate issue. I wasn't aware that make test did the sensible thing, so that does make me feel a lot better about it. I think we'd be open to a PR for this as well, as we usually have trouble prioritizing CMake improvements.
@mkruskal-google I'm reading that as you would welcome a pull request that flips the default. I created #24373 now but the CI is very unfriendly to contributions from (and debugging in) forks, I'm not sure I'll be able to get this to a green state.
> I'm also not able to reproduce that... When I build with default flags from main I only see a single descriptor.cc.o produced for libprotobuf.
> It looks like whatever cmake issue you've found has been fixed since 24830cf. I vaguely remember fixing this a few months ago by adding a dependency to libprotobuf instead of rebuilding everything in these generators.
@mkruskal-google I confirm that for main (at c657b59bf68529e88d7f38200f14c57df27cfbed) file src/google/protobuf/descriptor.cc no longer shows up in time.log three times 👍
PS: Here's my time.log for main if of interest. It's GCC 14 + Abseil 20250512.1 on Gentoo.
> @mkruskal-google I'm reading that as you would welcome a pull request that flips the default. I created #24373 now but the CI is very unfriendly to contributions from (and debugging in) forks, I'm not sure I'll be able to get this to a green state.
Yea, while it's a lot better than it was a few years ago, I agree this workflow isn't ideal. Specifically it's not possible to test changes to our workflow files from forks due to complications between github limitations and some of our internal constraints. I should be able to take over that PR and tweak it a bit, it's a good starting point
> PS: Here's my time.log for main if of interest. It's GCC 14 + Abseil 20250512.1 on Gentoo.
Are there any other obvious bottlenecks there? It looks like we are doing duplicate work for libprotobuf-lite, which is surprising but probably not that expensive. We could probably turn that into a proper dependency like it is in Bazel...
> Are there any other obvious bottlenecks there?
@mkruskal-google users could literally save 45% compile time — about 12 minutes on the hardware above — by not compiling language support for non-C++ languages in. For proof:
# grep -o "compiler/[^/ ]\+/" time.log | sort -u | sed -r 's,^compiler/(.+)/$,\1,' | grep -v cpp | tr '\n' '|' | sed -e 's,^,/(,' -e 's,|$,)/,' # list other languages
/(csharp|java|kotlin|objectivec|php|python|ruby|rust)/
# awk '{print $1}' time.log | sed 's,^,+ ,' | xargs echo 0 | bc # total time
1592.13
# grep -E '/(csharp|java|kotlin|objectivec|php|python|ruby|rust)/' time.log | awk '{print $1}' | sed 's,^,+ ,' | xargs echo 0 | tee /dev/stderr | bc # compile time on non-C++ languages
0 + 5.75 + 7.11 + 7.75 + 9.02 + 5.74 + 7.19 + 6.63 + 7.13 + 7.17 + 9.06 + 7.01 + 7.10 + 6.60 + 6.08 + 6.52 + 6.72 + 4.98 + 6.55 + 6.33 + 7.06 + 7.85 + 6.99 + 7.77 + 7.21 + 6.17 + 6.39 + 8.07 + 8.18 + 7.43 + 6.97 + 10.26 + 10.54 + 12.95 + 7.68 + 7.16 + 7.87 + 5.16 + 7.15 + 7.64 + 7.26 + 6.12 + 6.00 + 7.48 + 8.06 + 7.05 + 6.85 + 7.70 + 7.16 + 6.01 + 6.00 + 6.07 + 6.79 + 8.51 + 6.76 + 6.50 + 7.39 + 6.45 + 7.45 + 6.38 + 7.77 + 8.94 + 7.05 + 6.77 + 5.24 + 2.03 + 6.72 + 8.79 + 6.69 + 6.84 + 7.27 + 6.89 + 2.65 + 5.19 + 10.83 + 10.10 + 6.68 + 9.52 + 9.29 + 0.60 + 7.65 + 7.46 + 7.31 + 7.17 + 7.20 + 7.20 + 7.21 + 8.90 + 6.63 + 8.24 + 6.82 + 8.69 + 9.47 + 10.87 + 9.27 + 7.47 + 8.50 + 2.14 + 5.77 + 4.89 + 7.43
717.08
That's why I put "exclude [..] support for unneeded languages" in the title. Why compile code that you don't need at runtime later 😃
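The same numbers can be broken down per generator in one pass. This is a sketch under the assumptions already used above: each time.log line starts with the seconds and ends with the source file, and generators live in directories named compiler/<language>/. The function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sum compile seconds per directory below .../compiler/ and print each
# generator's share of the total compile time recorded in the log.
# Assumes "<seconds> <command line>" lines with the source file as last field.
# Output columns: <generator> <seconds> <share of total>
sum_by_generator() {
    local log="${1:-time.log}"
    awk '{
             total += $1
             if (match($NF, "compiler/[^/]+/")) {
                 # drop the leading "compiler/" (9 chars) and the trailing "/"
                 lang = substr($NF, RSTART + 9, RLENGTH - 10)
                 secs[lang] += $1
             }
         }
         END {
             for (lang in secs)
                 printf "%s %.2f %.1f%%\n", lang, secs[lang], 100 * secs[lang] / total
         }' "${log}" | sort -k2,2nr
}
```

`sum_by_generator time.log` lists cpp alongside the other languages, so the C++ generator's cost can be compared directly. Files that sit directly in compiler/ (such as command_line_interface.cc) count toward the total only.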
That is surprising, I suspect there's something else going on there... Our language generators are not that complex and I would expect the C++ runtime to dominate the build.
But anyway, our official "protoc binary" includes all of these generators, and is what we release for every language. So that's definitely going to be the default build. In Bazel we do also have a concept of protoc_minimal, but that doesn't include the C++ generator either so I'm not sure how useful it would be. If we were to extend that to cmake we'd have to also include configs for building all the generators as plugins, which is not something we're ready to support.
It does sound like your build time is already down from 62 minutes to ~27 minutes. I'm not convinced the added complexity of stripping out all the non-C++ generators and supporting a "C++-only protoc" is worth shaving that down further.
Let me add two things for completeness:
- My personal hammer of choice for cutting build time down for this would be the C++ pre-processor (not plugins), if it's just about cutting things away (rather than offering extensibility). It would allow making the cuts I did at https://github.com/google/libprotobuf-mutator/pull/267/files conditional.
- I'll see to get you a compile time number with tests from the same machine to see how far it's come from 62 minutes with tests. That seems like an open question still. May take me a day or two.
The sum of the language plugins that you mentioned really should be a very small portion of the compilation, because together they are both a small amount of .cc code and should also be more straightforward to compile than the C++Proto runtime is.
The Ruby generator looks to be only a single .cc file that is 340 lines long; it's pretty straightforward code, with no deps beyond the C++Proto runtime.
If you're still iterating on the bug report here, it feels like there's something there: your past grep shows an awful lot of ~10s times. I think it's already suspiciously slow if any one of the individual generators you mentioned were taking ~1 minute in isolation, even on a decade-old laptop (or alternatively, a suspiciously high ratio compared to your "only C++" case, since including them shouldn't come close to doubling the compile workload).
@esrauchg I don't know the code, but the numbers do not feel off to me, and I'm used to long compile times from using Gentoo. Maybe the symptom does not carry over to modern hardware by a linear factor. Any ideas for what we could try to get to the bottom of the "is this mysterious or normal" question?
My intuition is only because of the magnitude of the code and nothing about hardware expectations: I did just double check and I do think that just by raw volume of code the sum of all of those language generators should be only maybe 15 or maybe 20% of the total compilation units / amount of source to compile when building protoc (without going deep on checking the expanded headers and everything).
This is only abstract first principles, and clearly observed reality is different, but I would have expected that removing those language generators from the build, i.e. 15% of the compilation units which are doing less complicated stuff (a lot fewer template expansions, etc.), should save only more like 10% of your time instead of 45%.
Your grep above for all the languages shows a long list of numbers, is really most of that just Java or just Rust generators?
> Your grep above for all the languages shows a long list of numbers, is really most of that just Java or just Rust generators?
Java and Rust combined seems to be >50% (i.e. more than ~360 seconds of 717.08 seconds) of the non-C++ time:
# grep -E '/(java|rust)/' time.log | awk '{print $1}' | sed 's,^,+ ,' | xargs echo 0 | tee /dev/stderr | bc
0 + 6.55 + 6.33 + 7.06 + 7.85 + 6.99 + 7.77 + 7.21 + 6.17 + 6.39 + 8.07 + 8.18 + 7.43 + 6.97 + 10.26 + 10.54 + 12.95 + 7.68 + 7.16 + 7.87 + 5.16 + 7.15 + 7.64 + 7.26 + 6.12 + 6.00 + 7.48 + 8.06 + 7.05 + 6.85 + 7.70 + 7.16 + 6.01 + 6.00 + 6.07 + 6.79 + 0.60 + 7.65 + 7.46 + 7.31 + 7.17 + 7.20 + 7.20 + 7.21 + 8.90 + 6.63 + 8.24 + 6.82 + 8.69 + 9.47 + 10.87 + 9.27 + 7.47 + 8.50 + 2.14 + 5.77 + 4.89 + 7.43
414.82
Is that what you were wondering about?
PS: For recent numbers with test compilation included, I have:
- 44 minutes on Lenovo ThinkPad X220 (2012)
- 14 minutes on Lenovo ThinkPad T14 Gen 5 (2025)
(both without parallelization, effectively -j1)
My reproducer was this Dockerfile:
FROM debian:trixie
RUN apt-get update \
&& \
apt-get install --no-install-recommends -y -V \
build-essential \
ca-certificates \
cmake \
git \
libgmock-dev \
libgtest-dev \
make \
time \
wget
RUN ABSEIL_VERSION=20240722.0 \
&& \
wget https://github.com/abseil/abseil-cpp/releases/download/${ABSEIL_VERSION}/abseil-cpp-${ABSEIL_VERSION}.tar.gz \
&& \
tar xf abseil-cpp-${ABSEIL_VERSION}.tar.gz \
&& \
cd abseil-cpp-${ABSEIL_VERSION} \
&& \
cmake -DABSL_BUILD_TEST_HELPERS=ON -DABSL_USE_EXTERNAL_GOOGLETEST=ON . \
&& \
make -j$(nproc) \
&& \
make install
RUN git -c init.defaultBranch=main init \
&& \
git fetch --depth 1 https://github.com/protocolbuffers/protobuf 24830cf07e559e912fa2622d3487a679ef0df9a9 \
&& \
git -c advice.detachedHead=false checkout FETCH_HEAD
RUN time sh -c 'cmake -Dprotobuf_BUILD_TESTS=ON -S . -B build && make -C build' ; false # to display time output
An option that would solve this situation is using Unity builds. We made some improvements that should help you. That's the best we can do here; we won't be able to provide any further support on this one. Thanks for sending us some great PRs in helping us to improve Protobuf!
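For readers who want to try the unity-build suggestion, a minimal sketch follows. CMAKE_UNITY_BUILD is a standard CMake variable (CMake 3.16 and later); whether Protobuf at a given commit compiles cleanly in unity mode is an assumption that this thread does not verify. The function name and the `CMAKE` override (for a dry run with `CMAKE=echo`) are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: configure a unity ("jumbo") build and compile it.
# Assumes CMake >= 3.16; not verified against any particular Protobuf commit.
# Hypothetical helper; set CMAKE=echo to only print the two command lines.
configure_unity_build() {
    local src_dir="${1:-.}" build_dir="${2:-build}"
    "${CMAKE:-cmake}" -S "${src_dir}" -B "${build_dir}" \
        -DCMAKE_UNITY_BUILD=ON \
        -Dprotobuf_BUILD_TESTS=OFF \
    && "${CMAKE:-cmake}" --build "${build_dir}"
}
```

Unity builds merge several source files into one translation unit, so shared headers are parsed fewer times. That trades lower total compile time for higher memory use per compiler process and less fine-grained incremental rebuilds.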