Nim 2.2 requires 50% more time to compile a large project than Nim 2.0
Nim Version
v2.0.14 and https://github.com/nim-lang/Nim/commit/1227799b849f5387273ae8d6fac0129a3d64af1b
Description
Using https://github.com/status-im/nimbus-eth2/ as the reference repository:
Nim v2.0.14:
- `./env.sh nim check beacon_chain/nimbus_beacon_node`: `330805 lines; 87.857s; 8.488GiB peakmem`
- `./env.sh nim check tests/all_tests`: `375037 lines; 109.009s; 8.591GiB peakmem`
`upstream/version-2-2` (as of today, `1227799b849f5387273ae8d6fac0129a3d64af1b`):
- `./env.sh nim check beacon_chain/nimbus_beacon_node`: `333000 lines; 116.915s; 8.44GiB peakmem`
- `./env.sh nim check tests/all_tests`: `377232 lines; 153.576s; 8.549GiB peakmem`
Two main issues:
- between 2.0.14 and version-2-2, it got 50% slower; and
- it's consuming 8.5GiB RAM to compile, on both.
The former is a specific regression, while the latter is an ongoing issue that appears to be less tied to a specific Nim version.
It would be useful to understand to what extent the performance decrease is required, as well as to reduce peak memory usage of the Nim compiler.
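For anyone who wants to cross-check the numbers independently of the compiler's own summary line, here is a minimal sketch that measures wall-clock time and peak resident memory of a child process. It assumes a Unix-like system (Python's `resource` module is not available on Windows), and the helper name `measure` is made up for this example. The figure it reports need not match Nim's `peakmem` exactly, since the two are measured differently.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and return (wall-clock seconds, peak RSS of children in GiB)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    # ru_maxrss for RUSAGE_CHILDREN is the largest resident set size among
    # the child processes waited for so far. It is reported in kilobytes on
    # Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    scale = 1 if sys.platform == "darwin" else 1024
    return elapsed, peak * scale / 2**30


# Example call, using the command from this issue:
#   measure(["./env.sh", "nim", "check", "beacon_chain/nimbus_beacon_node"])
```

Running each command in a fresh Python process keeps the peak value from being carried over from an earlier, larger child.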
Those are 30% and 40%, but still pretty bad. At a couple of points the compiler bootstrap time also jumped up. According to GitHub CI, the version-2-0 compiler takes 7.2 seconds to bootstrap and the version-2-2 compiler takes 9 (also evidenced by the difference with the csources compiler). So the compiler itself might be a useful target to profile.
Edit:
One jump seems to be in #23403, looking at the bootstrap time of commits before and after it (in the PR comments).
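For reference, the percentages can be recomputed from the timings quoted in this thread. This is just the arithmetic, with the figures copied from the issue description and the CI bootstrap times mentioned above; the function name is only for illustration.

```python
def slowdown_pct(before_s, after_s):
    """Relative increase in run time, in percent."""
    return (after_s - before_s) / before_s * 100.0


# (v2.0.14 or version-2-0 seconds, version-2-2 seconds), as quoted in this thread
timings = {
    "nim check nimbus_beacon_node": (87.857, 116.915),
    "nim check all_tests": (109.009, 153.576),
    "compiler bootstrap (CI)": (7.2, 9.0),
}

for name, (old, new) in timings.items():
    print(f"{name}: {slowdown_pct(old, new):.1f}% slower")
# nim check nimbus_beacon_node: 33.1% slower
# nim check all_tests: 40.9% slower
# compiler bootstrap (CI): 25.0% slower
```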
There was a time when the Nim compiler mangled types and identifiers with simple numbers. Right now only unique identifiers are mangled with simple numbers, while type identifiers get mangled like this:
tyObject_FooType__IqAutnmqt7v7Ut3uVgNLiQ
tyArray__nHXaesL0DJZHyVS07ARPRA
tyObject_IOError__cIQJ2XNApXjB0aIzNpQIlg
tyObject_FieldDefect__sjzGe6i9adMIOnQ1sLKo2Dw
tyObject_IndexDefect__LxjeuNDg5u7gjxtJKWuqew
I think refactoring the code to use a numeric suffix such as _1 or _2 instead of 23 characters would save a lot of memory. I also think generating 1, 2 is much faster than generating 23 random-looking characters, which are probably cryptographically hashed.
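To make the size argument concrete, here is a rough sketch comparing the two naming schemes. It is not how the Nim compiler actually does it: MD5 is only a stand-in hash, `SUFFIX_LEN`, `hashed_name` and `CounterNamer` are names invented for this example, and the type signatures are dummy strings.

```python
import hashlib

SUFFIX_LEN = 22  # illustrative length, not taken from the compiler source


def hashed_name(kind, signature):
    """Name a type by a truncated hash of its signature (stand-in scheme)."""
    digest = hashlib.md5(signature.encode("utf-8")).hexdigest()
    return f"{kind}__{digest[:SUFFIX_LEN]}"


class CounterNamer:
    """Name a type by the order in which its signature was first seen."""

    def __init__(self):
        self._ids = {}

    def name(self, kind, signature):
        # The same signature always maps to the same number within one run.
        idx = self._ids.setdefault(signature, len(self._ids) + 1)
        return f"{kind}_{idx}"


signatures = [f"object Foo{i}" for i in range(1000)]  # dummy signatures
namer = CounterNamer()
hashed_total = sum(len(hashed_name("tyObject", s)) for s in signatures)
counter_total = sum(len(namer.name("tyObject", s)) for s in signatures)
print(f"hashed: {hashed_total} chars, counter: {counter_total} chars")
```

One trade-off to keep in mind: a hash of the signature gives the same name regardless of the order in which types are encountered, while a counter depends on that order, so the numeric scheme would need some care wherever names have to agree across separately generated files.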
One more note is about these function names, generated by the Nim compiler:
atmnimbusminuseth2atsvendoratsnimbusminusbuildminussystematsvendoratsNimatslibatsstdatssynciodotnim_DatInit000
atmnimbusminuseth2atsvendoratsnimbusminusbuildminussystematsvendoratsNimatslibatssystemdotnim_Init000
One data point about the memory usage growth is from https://github.com/status-im/nimbus-eth2/pull/4513 which as of January 2023 notes:
# Windows GitHub Actions CI runners, as of this writing, have around 8GB of RAM
# and compiling all_tests requires around 5.5GB of that. It often fails via OOM
# on the two cores available. Usefully, the part of the process requiring those
# gigabytes of RAM is `nim c --compileOnly`, which intrinsically serializes. As
# a result, only slightly increase build times by using fake dependencies, when
# running `make test`, to ensure the `all_tests` target builds alone when being
# built as part of `test`, while not also spuriously otherwise depending on the
# not-actually-related Makefile goals.
#
# This works because `nim c --compileOnly` is fast but RAM-heavy, while the
# rest of the build process, such as LTO, requires less RAM but is slow and
# still is parallelized.
Memory usage has almost doubled since then, and while the nimbus-eth2 codebase has grown, it hasn't doubled in size or complexity.
The 2.0+ compiler is bootstrapped with --mm:orc, while earlier versions were bootstrapped with --mm:refc; maybe it's a consequence of that.
Edit: Tested in #24955, only drops max usage to around 7.2 GB for both
@Araq closed this as completed yesterday
The speed was restored with https://github.com/nim-lang/Nim/pull/24930, but the memory consumption is still a problem.
IMO, this either needs to be reopened or we need a separate issue for addressing the memory consumption.