
Add preloading support to CMake and other build commands

Open rui314 opened this issue 4 years ago • 14 comments

mold supports an object file preloading feature. To use it, someone has to run the linker command with the --preload flag a few seconds before the actual linker invocation. I think that "someone" should be the build system, such as make or ninja executing CMake-generated makefiles or ninja files.

As a start, I think we should add the following features to CMake:

  • Detect whether the linker is mold
  • If it is mold, run the linker command with --preload before invoking the compiler instances that generate the object files

rui314 avatar Jul 03 '21 16:07 rui314
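
A minimal sketch of the detection step, assuming the linker's --version output starts with "mold " as in the version string quoted later in this thread. The helper name is_mold is hypothetical, and this is not how CMake itself detects linkers:

```shell
# Hypothetical helper: succeed if a linker version line identifies mold.
# Assumption: the line starts with "mold ", e.g.
# "mold 1.0.0 (...; compatible with GNU ld and GNU gold)".
is_mold() {
    case "$1" in
        "mold "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch: check whichever linker $LD (default: ld) resolves to.
if is_mold "$("${LD:-ld}" --version 2>/dev/null | head -n 1)"; then
    echo "linker is mold"
else
    echo "linker is not mold"
fi
```

Matching on the prefix rather than on any occurrence of "mold" avoids false positives from linkers that merely mention it elsewhere in their output.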

I'm now using CMake to generate a ninja build and running mold -run ninja to build the whole project. mold's performance was impressive in my tests. I'd like to know whether mold could implement

mold --preload -run ninja

It wouldn't touch CMake, and mold is known to be running first.

comicfans avatar Jul 12 '21 04:07 comicfans

@comicfans I don't think such a feature can be implemented. To preload, mold has to know what ninja will do next, and that is generally not predictable.

rui314 avatar Jul 12 '21 04:07 rui314

So if I understand correctly, this feature expects the mold daemon to be started (with exactly the same link arguments) before all input objects have finished compiling, so that by the time of the real link it has already preloaded most of the inputs? Previously I misunderstood preload as "a single daemon that caches all subsequent invocations".

comicfans avatar Jul 12 '21 07:07 comicfans

Yes, the daemon is single-use. To use the preloading feature, you invoke mold twice instead of once for each linker output. The first invocation is with --preload to preload object files, and the second is without --preload to tell the daemon to finish its job.

rui314 avatar Jul 12 '21 07:07 rui314
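
The two-invocation sequence can be sketched as a dry run. This assumes the link goes through a compiler driver, so the flag is passed as -Wl,--preload (the form used later in this thread). The helper names preload_variant and plan_link are hypothetical, and the script only prints commands instead of running them:

```shell
# Hypothetical helper: derive the preload variant of a link command
# by appending -Wl,--preload to the unchanged argument list.
preload_variant() {
    printf '%s -Wl,--preload\n' "$*"
}

# Dry run of the sequence a build system would follow for one linker output.
plan_link() {
    echo "1: $(preload_variant "$@") &"   # start preloading in the background
    echo "2: (compile object files)"      # the compilers run meanwhile
    echo "3: $*"                          # same command without --preload
}

plan_link cc -o app a.o b.o
```

Both invocations share the same arguments apart from the preload flag, which is why the build system, and not the linker, has to know the link command in advance.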

I opened an issue in CMake's GitLab, see https://gitlab.kitware.com/cmake/cmake/-/issues/23063

gruenich avatar Dec 30 '21 23:12 gruenich

It doesn't look like a hard task. Maybe I can implement it in CMake (if I have time). I noticed that the preload code has already been removed from the latest mold, so I can first try with an older version of mold. If it succeeds, will you reintroduce the preload feature into master? @rui314

I have another question after some quick experiments: I got the preload command by manually copying the link command printed by ninja -v and appending -Wl,--preload to it. But I didn't observe any speedup when I then ran the command without preload (the generated library is 2.1 GB). Is there something I'm missing? I have tried mold 1.0 and 1.2.

daquexian avatar Jun 23 '22 05:06 daquexian

I removed --preload because nothing was using it. If a build system supports it and it proves to be useful, we can discuss resurrecting it.

If you use --preload, you invoke the linker twice, with and without --preload. You first invoke the linker with --preload and, after a few seconds, run the same command without --preload. If you always run mold with --preload, it does nothing but preload files (so no output is created).

rui314 avatar Jun 23 '22 06:06 rui314

> If you use --preload, you invoke the linker twice, with and without --preload. You first invoke the linker with --preload and, after a few seconds, run the same command without --preload. If you always run mold with --preload, it does nothing but preload files (so no output is created).

That is actually what I did.

Here are my commands (the library I want to generate is named liboneflow.so, and the time output is in zsh format):

$ rm liboneflow.so
$ time ./link.sh
./link.sh  0.06s user 0.03s system 2% cpu 3.124 total
$ rm liboneflow.so
$ ./link_preload.sh
# after cpu becomes idle:
$ time ./link.sh
./link.sh  0.05s user 0.05s system 2% cpu 3.294 total

daquexian avatar Jun 23 '22 07:06 daquexian

0.06s is already too fast, so I don't think you can observe any improvements over it. I think it needs to take at least a few seconds to see an improvement.

rui314 avatar Jun 23 '22 07:06 rui314

> 0.06s is already too fast, so I don't think you can observe any improvements over it. I think it needs to take at least a few seconds to see an improvement.

It is in zsh format (which differs from bash's), so I think the "total" value (3.124 and 3.294) corresponds to the "real" value in bash's time output and is the wall-clock time.

daquexian avatar Jun 23 '22 07:06 daquexian
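
To compare the two runs on wall-clock time, the elapsed value can be pulled out of each zsh time line. This is a small sketch, assuming zsh's default TIMEFMT, in which the field just before "total" is the elapsed time. The function name zsh_elapsed is hypothetical:

```shell
# Hypothetical helper: print the elapsed (wall-clock) field from one line
# of zsh `time` output, i.e. the field immediately before "total".
zsh_elapsed() {
    printf '%s\n' "$1" | awk '$NF == "total" { print $(NF-1) }'
}

zsh_elapsed "./link.sh  0.06s user 0.03s system 2% cpu 3.124 total"   # prints 3.124
zsh_elapsed "./link.sh  0.05s user 0.05s system 2% cpu 3.294 total"   # prints 3.294
```

On these two lines, the run after preloading took about as long as the run without it (3.294 s versus 3.124 s), while the user CPU time in both cases was only a few hundredths of a second.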

@daquexian You can add --perf to see the breakdown of internal passes. Can you try that option?

rui314 avatar Jun 23 '22 07:06 rui314

What I tried:

$ ./link.sh > no_preload
$ ./link_preload.sh
$ ./link.sh > preload
$ ld --version
mold 1.0.0 (ed9924895d9b9584106791247596677db8113528; compatible with GNU ld and GNU gold)

I found that "read_input_files" still takes about half a second in the preload run.

I uploaded the script I used and also the --perf output. Download link: link.sh link_preload.sh preload no_preload

daquexian avatar Jun 23 '22 08:06 daquexian

I don't know why, but it looks like preloading didn't work at all for your test case. It might be a bug.

rui314 avatar Jun 23 '22 09:06 rui314

> I don't know why, but it looks like preloading didn't work at all for your test case. It might be a bug.

I see. I'll try to support preload in CMake (when I have time) anyway.

daquexian avatar Jun 24 '22 06:06 daquexian