boulder-d-legacy
Inefficiencies in the build process
Spent a couple of hours mapping out the build process and the little tweaks that can help the overall experience and performance a lot.
### Create root
Not much value in making this parallel; just ensure all the directories are available for the next steps.
- [ ] Allow mounting directories as `tmpfs`. Many writes saved, and blitting/caching is a lot faster! An easy 1-2s on builds with many deps.
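A minimal sketch of what this could look like, assuming we simply shell out to mount(8) during root setup; the size and mode options are illustrative, not boulder's actual configuration:

```d
// Sketch only: mount a build directory as tmpfs via mount(8).
import std.exception : enforce;
import std.process : execute;

void mountTmpfs(string dir)
{
    auto res = execute(["mount", "-t", "tmpfs",
                        "-o", "size=8G,mode=0755", "tmpfs", dir]);
    enforce(res.status == 0, res.output);
}
```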
### Fetching upstreams and packages
The ideal would be to fetch the index and start fetching upstreams while calculating the deps and packages to fetch as well. That's a lot of hassle really (and we go back to that moss-jobs style overhead); if we simply make the index fetch and dep calculation fast, we can get most of the gains even with the current approach plus the following modifications.
This mixes up the moss and boulder work queues, so it would need a refactor if it's worth implementing at all.
- [ ] Fetch upstreams along with the packages! We can add them to the same work queue without any problem really. The benefit of the current approach is an early exit if an upstream doesn't validate, but presumably you would fix it and try again, so you still want the packages fetched.
- [ ] While adding upstreams to the queue (there's no benefit to this otherwise), we can also create a function to extract (or set up) the upstream in the build root. Fetchable is pretty awesome really! There's also the PGO build to deal with, where the upstreams have to be recreated during the build (it seems easy enough to reuse the same functions to set up the builddir extractions).
This is only within moss
- [ ] Cache stones that have already been downloaded in the work queue. Currently it waits for all packages to finish fetching and only then caches the downloaded packages. See https://github.com/serpent-os/moss/issues/23#issuecomment-1340703634 for an easy strategy for adding them to the same queue (a rough sketch follows this list). This will save seconds when you only have a few packages to fetch.
- [ ] Potential to add blitting to the work queue (i.e. cache, then blit in the same function). Currently it's done later, one package at a time. It shouldn't get too parallel as we're only caching one package at a time.
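A rough sketch of the idea, with a plain task pool standing in for moss's real work queue; `fetchStone` and `cacheStone` are placeholders, not moss's actual fetcher/cache API:

```d
// Sketch only: cache each stone as soon as its own fetch finishes, instead of
// waiting for every download to complete first.
import std.parallelism : task, taskPool;
import std.stdio : writefln;

void fetchStone(string uri) { writefln("fetching %s", uri); }
void cacheStone(string uri) { writefln("caching %s", uri); }

void fetchAndCache(string uri)
{
    fetchStone(uri);
    cacheStone(uri); // no barrier between the fetch and cache phases
}

void main()
{
    auto stones = ["nano.stone", "zlib.stone", "curl.stone"];
    foreach (stone; stones)
        taskPool.put(task!fetchAndCache(stone));
    taskPool.finish(true); // wait for all fetch+cache jobs
}
```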
### Build stages
Most of the areas to speed up here are via clang and its options, so they won't be explored here. Still, there are a couple of options.
- [ ] Make `fakeroot` optional. Two options: make `fakeroot` apply only to the `install` stage, which will save 95% of the overhead (files can then only be created in the install directory, during the `install` stage). Alternatively, make it opt-in and overhaul file permissions in `moss-format` to handle it.
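A sketch of the first option, assuming stages are run via a shell and only `install` gets wrapped in fakeroot; the runner shown is illustrative, not boulder's actual executor:

```d
// Sketch only: wrap just the install stage in fakeroot.
import std.exception : enforce;
import std.process : spawnShell, wait;

void runStage(string stage, string script)
{
    // only the install stage needs faked root ownership/permissions
    const cmd = (stage == "install")
        ? "fakeroot -- sh -e -c '" ~ script ~ "'"
        : "sh -e -c '" ~ script ~ "'";
    enforce(wait(spawnShell(cmd)) == 0, "stage " ~ stage ~ " failed");
}
```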
### Analyse
This needs quite an overhaul to handle our upcoming requirements (and to stop reading files twice). Not reading files twice doesn't save that much time in itself, but it opens new opportunities with a refactor.
- [ ] When reading files to generate the xxhash, size, path etc., we can also collect extra information from each file: isELF? hasHashBang? isHardLink? FileType (PNG, JPG, ELF, text etc.). Then we no longer have to reread each file to see if it's ELF (which we currently have to do again to test for #!). A minimal sketch follows this list.
- [ ] Use isELF for ELF sieve
- [ ] Add a providers key to package_definition. Basically `rundeps`, but to manually add a provider. Can cut a lot of code that handles the ld additions in `glibc` and then easily support all future compat-symlink providers (and there will be quite a few).
- [ ] Deal with hardlink detection so we can add back stripping. Not entirely sure, but was seeing bad performance when rebuilding `clang` without it (it has `-Wl,-q` relocation info in it).
- [ ] Combine the debuginfo and strip functions. If producing debug files, we can strip in `llvm-objcopy` to save a call to `llvm-strip` (a rough sketch appears at the end of this section).
- [ ] Also strip comments with `-R .comment`
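A minimal sketch of gathering that extra information during the first read; the `FileInfo` struct and `classifyFile` helper are hypothetical names, and the real implementation would also compute the xxhash in the same pass:

```d
// Sketch only: classify a file while it is read for hashing, so later passes
// (ELF sieve, shebang checks) never need to reopen it.
import std.stdio : File;

enum FileKind { other, elf, script }

struct FileInfo
{
    string path;
    ulong size;
    bool isELF;
    bool hasHashBang;
    FileKind kind;
    // xxhash, hardlink detection etc. would be filled in during the same pass
}

FileInfo classifyFile(string path)
{
    FileInfo info;
    info.path = path;

    auto f = File(path, "rb");
    info.size = f.size;

    ubyte[4] magic;
    const head = f.rawRead(magic[]);
    if (head.length >= 4 && magic[0] == 0x7F && magic[1] == 'E'
        && magic[2] == 'L' && magic[3] == 'F')
    {
        info.isELF = true;
        info.kind = FileKind.elf;
    }
    else if (head.length >= 2 && magic[0] == '#' && magic[1] == '!')
    {
        info.hasHashBang = true;
        info.kind = FileKind.script;
    }
    return info;
}
```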
I've also considered switching from the sieve-based approach to operating directly on the files that the sieves target (they're pretty specific) rather than iterating over each file. But provided we get isELF and hasHashBang on the first read, iterating a sieve becomes basically costless, and the alternative is hard to make parallel.
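For the combined debuginfo/strip step above, a rough sketch of what the single-tool approach could look like; the paths and exact flag set are assumptions, though all flags shown exist in llvm-objcopy:

```d
// Sketch only: split out debug info and strip with llvm-objcopy alone,
// dropping .comment in the same invocation and skipping llvm-strip entirely.
import std.exception : enforce;
import std.process : execute;

void splitDebugAndStrip(string binary, string debugFile)
{
    // 1. copy the debug sections out to the side file
    auto keep = execute(["llvm-objcopy", "--only-keep-debug", binary, debugFile]);
    enforce(keep.status == 0, keep.output);

    // 2. strip the original, remove .comment and link back to the debug file
    auto strip = execute(["llvm-objcopy", "--strip-debug", "-R", ".comment",
                          "--add-gnu-debuglink=" ~ debugFile, binary]);
    enforce(strip.status == 0, strip.output);
}
```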
### Emit packages
- [ ] New `zstd` bindings that are fast. Use `--long` and tweak the level (is -16 best?); see the sketch after this list.
- [ ] Make emitting packages parallel; compressing is not well threaded.
- [ ] Compress debug files in either the analyse step (parallel) or during the build. Not sure which would be faster, and then there's no need to compress the debug packages (and smaller debug files for everyone). Ideally want `zstd` compression, but support for that is still early.
- [ ] Try sorting the content payload by file type à la https://serpentos.com/blog/2021/10/04/optimal-file-locality. Not expecting a huge gain here, but it would be nice to try, and it works best at the compression levels we use.
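A sketch of how the zstd invocation and per-package parallelism could combine; the flags are real zstd options, while the surrounding function is illustrative:

```d
// Sketch only: compress payloads in parallel, using zstd long-distance
// matching and a high level; one process per payload since a single zstd
// invocation does not thread well at these sizes.
import std.exception : enforce;
import std.parallelism : parallel;
import std.process : execute;

void compressPayloads(string[] payloads)
{
    foreach (payload; parallel(payloads))
    {
        auto res = execute(["zstd", "--long=27", "-16", "--force", payload]);
        enforce(res.status == 0, res.output);
    }
}
```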
### Other ideas
- [ ] Plumb in ABI output
- [ ] Cache packages to a global `boulder` DB. The main concern is running multiple builds, so we'd lock these so only one build handles the pre-build phase at a time (once it passes off to running the build stages, the next build can start); a rough locking sketch follows this list. Only the content, layout and metadata should be shared; create a DB for anything else that's needed per build.
- [ ] Copy packages from the host DBs. I think I'd rather keep the host DBs and `boulder` separate for safety, but there's no reason you couldn't take the information from the host and reuse it (and reflink the files if supported). That could save some time and reduce the load on the servers; we could also copy from `boulder` to the host, as `boulder` will have newer versions of the packages.
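A minimal locking sketch for the shared DB idea, assuming an advisory file lock guards the pre-build phase; the lock path is made up:

```d
// Sketch only: serialise the pre-build phase across concurrent boulder runs
// with an advisory file lock; build stages run after the lock is released.
import std.stdio : File;

void withPreBuildLock(void delegate() preBuild)
{
    auto lockFile = File("/var/cache/boulder/prebuild.lock", "w");
    lockFile.lock();            // blocks until no other boulder holds it
    scope (exit) lockFile.unlock();
    preBuild();                 // cache/blit the shared content, layout, metadata
}
```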
> Compress debug files in either the analyse step (parallel) or during the build. Not sure which would be faster, and then there's no need to compress the debug packages (and smaller debug files for everyone). Ideally want zstd compression, but support is early for that.
- Compressed debug sections (zlib): 272K `nano-dbginfo-7.1-4-1-x86_64.stone`
- Uncompressed debug sections: 256K `nano-dbginfo-7.1-4-1-x86_64.stone`
Compression of pre-compressed assets increases the .stone size
Any compression of the files will increase the stone size.
The plan was in relation to the eventual use of debuginfod. There, individual files are requested by users rather than fetching whole packages, so the files being smaller matters most...otherwise you spend hours downloading the debug files à la Fedora before you can even look at the valgrind output.
With the files compressed, there would be no need to compress the -debug packages either, which can save a lot of time given how huge they can be.
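As a sketch, the debug sections could be compressed at the point the debug file is split out, so nothing needs recompressing later; zlib is shown since that's what the numbers above used, and the helper name is illustrative:

```d
// Sketch only: emit the split debug file with zlib-compressed debug sections,
// ready to be served individually via debuginfod.
import std.exception : enforce;
import std.process : execute;

void extractCompressedDebug(string binary, string debugFile)
{
    auto res = execute(["llvm-objcopy", "--only-keep-debug",
                        "--compress-debug-sections=zlib", binary, debugFile]);
    enforce(res.status == 0, res.output);
}
```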