bsf icon indicating copy to clipboard operation
bsf copied to clipboard

Broken PCH's on Mac and Broken Unity Builds on Windows and Mac

Open cwfitzgerald opened this issue 7 years ago • 21 comments

EDITED SUMMARY PCH implemented on Windows, Linux. Still needs to be fixed for Mac. Unity implemented on Linux. Still needs to be fixed for Windows and Mac.

ORIGINAL ISSUE As I'm starting to work to integrate with bsf, I think having precompiled headers (and unity builds) will help with compile times significantly. The idiomatic way to do this with CMake is to use cotire. I've been doing some poking around and it seems perfectly possible to do. If I were to make those changes in a more proper way would you be interested in having them? I'm thinking a setup where if it detects cotire it uses it, but if it doesn't, it just pretends like nothing happened.

cwfitzgerald avatar May 31 '18 05:05 cwfitzgerald

What is cotire? I've never herd of it.

jonesmz avatar May 31 '18 05:05 jonesmz

https://github.com/sakra/cotire

Mature, CMake COmpile TIme REducer. Does both unity build and pch with minimal configuration.

cwfitzgerald avatar May 31 '18 05:05 cwfitzgerald

There's no built in precompiled header support in cmake already?

jonesmz avatar May 31 '18 05:05 jonesmz

Nope. Cotire is the way everyone does it. I mean, it is honestly the best CMake library I've ever had to use. You have to remember CMake just got LTO support, so it will be a bit until something as fancy as unity builds of precompiled headers come along.

cwfitzgerald avatar May 31 '18 05:05 cwfitzgerald

Weird. I'm surprised CMake doesn't have native support.

Oh well, it's not like I've got any say in the matter. I think it sounds like a reasonable thing to support, though.

jonesmz avatar May 31 '18 05:05 jonesmz

It should only take a day to implement everything as the only real problem is X11/x.h which defines everything in the world, but you can basically tell it to not include that in the precompiled header. I just don't want to go to the trouble of doing it all for it to not be accepted :)

cwfitzgerald avatar May 31 '18 05:05 cwfitzgerald

I'd be very interested in speeding up compile times. Reading up on cotire it sounds good, although its documentation says it automatically sets up a unity build for the specified target. Won't that hurt incremental builds?

Also it seems to be configurable per-target, I'm wondering how well will it work with main bsf target and then all the plugins dependent on that target (does only the main target get precompiled headers, or does every plugin gets their own?).

BearishSun avatar May 31 '18 07:05 BearishSun

Unity builds are separate from normal builds. When you run ninja all you get a normal build with precompiled headers (good for incremental builds), when you run ninja all_unity you get a unity build. These are good for just getting the thing built as fast as possible. I was also generally supprised by how much faster unity builds are through cotire. As they aren't actually true unity builds (they are N different builds, N being the number of cores you have) they take advantage of parallelism and can be done very quickly. The only issue is that you have to insure that everything will work when included, but that's my job :)

All targets get precompiled headers if they are over 3 source files, any less and precompiled headers won't really help (takes longer generating it then it saves in compilation). This default can be changed however to something that works better. The pch's are made up of all the headers that target includes that are not part of the local tree (so they don't precompile themselves)

If you're willing to use it, I will start working on a PR.

cwfitzgerald avatar May 31 '18 14:05 cwfitzgerald

Yep sounds good :) Very interested to see the speed up.

BearishSun avatar May 31 '18 14:05 BearishSun

So preliminary testing is done, converted everything to use precompiled headers, building the examples (on my machine):

Build Type Regular PCH Difference
GCC-Debug 363sec 250sec 1.4x
GCC-Release 455sec 386sec 1.17x
Clang-Debug 316sec 182sec 1.72x
Clang-Release 422sec 325sec 1.30x

There are some problems around clang's internal forwarding "inttypes.h" header really not wanting to be included in a pch, which I need to find a clean cross distro solution for... Other than that one error everything should work fine. I still will have to tackle unity builds though, I haven't even tried one yet.

cwfitzgerald avatar May 31 '18 20:05 cwfitzgerald

bsf fork here: https://github.com/cwfitzgerald/bsf/tree/cotire bsfExamples fork here: https://github.com/cwfitzgerald/bsfExamples/tree/cotire

cwfitzgerald avatar May 31 '18 20:05 cwfitzgerald

Thanks for the benchmarks, nice to see a noticeable speedup.

I've quickly checked out the changes, and it looks minimally intrusive which is really nice. The only thing that worries me are the explicit paths to clang/llvm folders, especially the versioned llvm ones. Does that mean things will break if user has Clang installed in a non-default path, or when some newer version is introduced?

BearishSun avatar May 31 '18 21:05 BearishSun

As of right now it does, and that is the issue with "inttypes.h" I mentioned. That solution was just to get a full proof of concept working. I need to find a clean solution for that. I think I have a couple ideas which I'll look at tomorrow. Probably something to do with getting clang's search path's then searching for "inttypes.h" and finding the directory the offending one is in.

cwfitzgerald avatar Jun 01 '18 01:06 cwfitzgerald

My latest update seems to provide a nice generic interface for that. It basically asks the compilers for their include paths by parsing the verbose logs of compiling nothing, then uses those paths to find the "inttypes.h" file in both c and c++ and then banning that file from the precompiled header. Seems to work well.

https://github.com/cwfitzgerald/bsf/commit/eb197186317d88cbdb74274f63808a9a5689a134

Not sure if I need this on mac too as I don't have access to mac, but should work the same.

cwfitzgerald avatar Jun 01 '18 06:06 cwfitzgerald

Looks like a good solution. Would be nice to be just able to ignore inttypes.h by name, but oh well.

I'd imagine its also needed on Mac. When you make a PR I'll test it out. Will it break in an obvious way if its not working properly? One thing that will probably need to change is the check for Clang since on Mac it's identified as AppleClang.

BearishSun avatar Jun 01 '18 09:06 BearishSun

It will definitely break in a very obvious way if it doesn't work. #include-ing "inttypes.h" will just completely stop working, things left completely undefined. I'm going to enable it on mac as well, as even if it doesn't suffer from the bug, having to reparse a tiny header like inttypes.h will do no harm.

Related comment about build systems, if the ninja generator is in use you really should add -fdiaganostics-color=always as ninja runs everything against an internal buffer so that it can buffer the output and clang and gcc think they are being redirected into a file and stop producing color. I will add this change into this PR as well as it's so minor that it would be silly to do another one.

Adding on to that, I think you probably should run travis using ninja. The reason I say this is it takes advantage of the parallelism they give you (they give you two cores) while still keeping errors in order due to the aforementioned buffering. I also don't really think there are any errors that show up in the ninja makefile that wouldn't also show up in the makefile generator, as the makefile generator is generally more mature and supports more things, so if it supports ninja, it will support makefiles.

Not so ninja edit: Pushed changes fixing the remaining issues with pch. Windows will have to be done at my house tonight, as I know there are some problems on there, but for now, linux/mac should be good.

cwfitzgerald avatar Jun 01 '18 14:06 cwfitzgerald

Alright, finished unity builds as well. This is where things get interesting. Times did improve, the significance of the improvement changes drastically depending on which linker you use as there is a large wait on the compilation and linking of libbsf.so. I have only included the clang times right now as I don't want to spend too much time benchmarking. Ld seems to... really shit the bed.

Build Type Regular PCH (ld) PCH (lld.ld) Unity (ld) Unity (gold.ld) Unity (lld.ld)
Clang-Debug 316sec 182sec 176sec 166 sec 103 sec 100 sec

cwfitzgerald avatar Jun 01 '18 18:06 cwfitzgerald

Somewhat off topic, but I have to point out that I consider 300 second build times to be a lightning fast :-P. My day job involves 20+ hour builds.

That being said, wow 33%+ decreases on the low end, 66%+ on the high end? That's fantastic.

jonesmz avatar Jun 01 '18 19:06 jonesmz

Really nice stuff indeed, great work.

Thanks for the suggestion for Travis, I'll test it out when I get a chance. I know I ran into out of memory errors when I tried running parallel make.

BearishSun avatar Jun 01 '18 20:06 BearishSun

Alright, hopped onto windows this weekend at home, and have fixed most of the problems on windows with PCH's (a couple warnings to go). TL;DR: Windows.h defines far and near and that messes EVERYTHING up. I had to make my own header which is included in the PCH that undefs far and near after the other headers that needed them went. I have pushed the changes I've made thus far. Timing: debug only because timing on windows has to be done manually:

Before PCH Ratio
175sec 123sec 1.42x

I have also ran into memory errors but as long as you keep the job count low (3 on travis) you should be good. I can also submit a pr with the changes to travis if you wish.

cwfitzgerald avatar Jun 02 '18 23:06 cwfitzgerald

Alright, with all the windows and linux issues sorted out, I enabled travis on the examples repository to see if mac would work as smoothly... and unfortunatly... this happened. I don't have a mac at all, so I don't have any way to solve this. I am going to make a PR with the current version so that @BearishSun can push stuff to the branch.

cwfitzgerald avatar Jun 03 '18 23:06 cwfitzgerald