More granular parallelized compilation of Solidity contracts
Describe the feature
I've been tasked with exploring the topic of parallelized compilation in solc, especially in the context of improving the speed of compilation via IR. We wanted to know what the obstacles to this kind of parallelized compilation are and if we can do anything to make it easier for frameworks. I'm creating this issue mostly to let you know about my findings. There is a simple way to parallelize compilation in a very granular way, but unfortunately it comes with downsides, so I can only suggest providing it as an optional feature.
I know that Hardhat already does parallel compilation in a limited way, by identifying clusters of sources that are interconnected via imports and compiling them separately. There's a way to parallelize compilation even within such clusters: take the Standard JSON input containing all the sources and split it into a series of inputs where each one uses settings.outputSelection to request output only for a single contract. The compiler will perform compilation and optimization only for the contract you selected. It will still analyze all the sources, but the later stages of the pipeline are orders of magnitude slower than analysis so it should not matter that much.
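To make the splitting step concrete, here is a minimal Python sketch. It is not the parasolc implementation. The function names and the default output list are hypothetical, and it assumes the caller already knows the (source unit, contract name) pairs, for example from an earlier analysis-only run. The only compiler interface it relies on is `solc --standard-json`, which reads the input from stdin.

```python
from __future__ import annotations

import copy
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_standard_json(
    input_json: dict,
    contracts: list[tuple[str, str]],
    outputs: tuple[str, ...] = ("evm.bytecode.object",),
) -> list[dict]:
    """Return one Standard JSON input per contract.

    Every input keeps all the sources and settings; only
    settings.outputSelection is narrowed to a single contract.
    `contracts` is a list of (source unit name, contract name) pairs
    supplied by the caller (assumption: discovered beforehand).
    """
    inputs = []
    for source_name, contract_name in contracts:
        single = copy.deepcopy(input_json)
        single.setdefault("settings", {})["outputSelection"] = {
            source_name: {contract_name: list(outputs)}
        }
        inputs.append(single)
    return inputs


def compile_one(single_input: dict, solc: str = "solc") -> dict:
    """Run one solc process on one narrowed input and parse its JSON output."""
    result = subprocess.run(
        [solc, "--standard-json"],
        input=json.dumps(single_input),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def compile_in_parallel(inputs: list[dict], jobs: int = 4) -> list[dict]:
    """Compile all narrowed inputs concurrently.

    Threads are enough here because the real work happens in the solc
    child processes, not in Python.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(compile_one, inputs))
```

A framework would then merge the `contracts` sections of the individual outputs back into a single result. That step is omitted here because the details (errors, duplicated warnings, which outputs are requested) depend on the tool.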
To benchmark it, I created a proof of concept script (parasolc) that can be passed in place of a solc binary to Foundry (forge --use). Here's also the full report with my findings: The parasolc experiment. My comment about such a feature in Foundry also has more detail: https://github.com/foundry-rs/foundry/issues/166#issuecomment-2133290512.
Unfortunately the overhead of doing it this way is very high. The projects I benchmarked require 3-4 times as much total work as sequential compilation. Still, while expensive, this method does provide an actual improvement in terms of wall-clock time spent on compilation. It appears that with enough cores you can still come out ahead despite the overhead. While this is far from what I was hoping to present to you here, and does not seem like a good choice for the default compilation mode, it's still a trade-off that may make sense in some situations. It may work better for some projects than others, depending on how they are structured and how interdependent their contracts are. The method is simple enough that it might make sense to offer it as an optional feature.
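As a rough illustration of the "enough cores" point, here is a back-of-the-envelope estimate in Python. It is an idealized model, not a measurement: it assumes the per-contract jobs spread perfectly evenly across cores, which real projects will not achieve, since the slowest single-contract job sets a lower bound on wall-clock time. The function name is hypothetical.

```python
def estimated_speedup(overhead_factor: float, cores: int) -> float:
    """Idealized wall-clock speedup over sequential compilation.

    overhead_factor: total work of the split approach relative to one
        sequential run (the benchmarks above suggest roughly 3-4).
    cores: number of jobs that can truly run in parallel.

    Sequential time is T, split time is overhead_factor * T / cores,
    so the ratio is cores / overhead_factor.
    """
    return cores / overhead_factor


# Under this model the break-even point is cores == overhead_factor:
# with a 4x overhead, 2 cores are slower than sequential compilation,
# 4 cores break even, and 8 cores give at most a 2x improvement.
```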
Search terms
solc parallelization parasolc
Thanks @cameel, we are starting to plan out enhancements to our compilation pipeline as part of the next version of Hardhat so this is a really useful analysis. I will make sure the team takes a look and posts questions here.