Compilation time, CI time and executable size improvements

Open mosra opened this issue 7 years ago • 2 comments

For the 2018.1d release (and onwards) I'd like to focus on reducing the header and executable size, together with improving compile time (and, as a side effect, runtime performance). This was last done in 2013 (see the blog article) and while the current workflow enforces enough rules to keep the problem from getting worse, it's not actively improving it either.

The ultimate goal is being able to ship useful utilities as "single-header" libraries ~~without being laughed at for compile times~~ and having compile times competitive with C header-only libs, yet staying in C++. Which, of course, means much better compile times than other C++ projects (json.hpp and Eigen, I'm looking at you).

Compile time improvements

  • [x] ~~Add (and start using) #pragma once in all headers, as it leads to measurable compile time improvement~~
    • below statistical error, extra noise, doesn't actually add any value if guards need to be present as well
  • [x] Provide a CORRADE_TARGET_LIBCXX / CORRADE_TARGET_LIBSTDCXX macro that tells me whether libc++ or libstdc++ is used (needed by the things below) -- detection using <ciso646> and _LIBCPP_VERSION, https://stackoverflow.com/questions/31657499/how-to-detect-stdlib-libc-in-the-preprocessor, there needs to be an exception for GCC < 6.1: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65473 (and a TODO to use <version> in C++20) -- mosra/corrade@b6b37fbd2150879d47e18fb9ba115e52ea6551a4
  • [x] Provide (and use) a header wrapping platform-specific forward-declaration headers for std::string -- mosra/corrade@2345195d5daf704d628a2a7db9ff4ad1dcc977da
    • [x] libc++,
    • [x] libstdc++
    • [x] MSVC STL has a full definition + typedef in <xstring>, ruling out any forward declaration
  • [x] Provide (and use) a header wrapping platform-specific forward-declaration headers for std::vector -- mosra/corrade@2345195d5daf704d628a2a7db9ff4ad1dcc977da
    • [x] libc++
    • [x] ~~anything in libstdc++?~~ nope
    • [x] MSVC STL has a definition with a default template argument directly in <vector>, which rules out any forward declarations
    • Replace with our own types in all internals and APIs, keep only in STL compatibility headers
      • [ ] Utility::Arguments, Utility::Configuration
      • [ ] Text::Renderer and Text::AbstractFontConverter APIs
      • [ ] Audio library
      • [x] ~~SceneGraph~~ going to be put to maintenance mode, not needed
  • [x] ~~Provide (and use) a header wrapping platform-specific forward-declaration headers for std::reference_wrapper because <functional> alone is 22k lines (and grows to 44k in C++17, WTAF!!)~~ using our own Containers::Reference instead
    • ~~<type_traits> on libstdc++, libc++ and MSVC STL as well (there it has a full definition, yay)~~
  • [x] Provide (and use) a header wrapping platform-specific forward-declaration headers for std::tuple (<tuple> is >20k lines on libstdc++) -- mosra/corrade@2345195d5daf704d628a2a7db9ff4ad1dcc977da
    • [x] <type_traits> on libstdc++
    • [x] libc++
    • [x] <utility> on MSVC STL, defined next to std::pair
    • Replace with Containers::Pair / Containers::Triple / custom structs in all internals and APIs, keep only in STL compatibility headers
      • [ ] Move from Debug.h to DebugStl.h
      • [ ] Text::AbstractFontConverter
      • [ ] Animation APIs
      • [ ] Image dataProperties() (ditch all that, use strided array views)
      • [x] ~~SceneGraph~~ going to be put to maintenance mode, not needed
  • [x] Investigate possibility (and viability) of forward declarations for other types
    • ~~does std::pair have a forward declaration somewhere? though <utility> is just 4k lines~~ nope, but we have Containers::Pair now
    • <array> is also big (~20k) but we're not really using it in public APIs -- yes, mosra/corrade@fd8030d93d7abe595e50dba6251d9cd4e9779c03
  • [x] The Utility/TypeTraits.h header (used by Debug and thus basically everything else) includes a quite heavy <iterator> that's needed only for std::begin() in libc++, it's quite big so include it conditionally only for libc++ -- mosra/corrade@9b258d768f5c17f5dd79a71d61d86187b18f223d
    • [x] it also includes <utility>, is it needed? decay() is in <type_traits> already -- mosra/corrade@9b258d768f5c17f5dd79a71d61d86187b18f223d
  • [x] Many headers conditionally include <algorithm> just to have std::min() for scalars on MSVC -- mosra/corrade@1719c57d6f7e0d9174e7f0c46c906ee2c40f084a, 563dee0436166556cb26af79a49cc380af7a9cd1
  • [x] Hide usage of <map>, <unordered_map> and other huge containers from headers, PIMPL these (no, I'm not going to implement them myself just yet) -- it's just Interconnect and Text libraries left, which need a significant cleanup anyway
  • [x] split away sorted container comparison from TestSuite/Compare/Container.h to a new header so we don't include <algorithm> there -- mosra/corrade@1719c57d6f7e0d9174e7f0c46c906ee2c40f084a
  • [x] Implement lightweight alternatives to std::unique_ptr and std::reference_wrapper -- mosra/corrade@4e7195739acd43eb8b0bc12c229ff573f47cf355 and mosra/corrade@a874478917854cdfa7a4bcc996ba829c2c39ada2
  • [x] Put out the dumpster fire that's happening with <cmath> in C++17: https://twitter.com/czmosra/status/1085993965529255936 -- mosra/corrade@115b56eb74665a0570d68fc31d9bc28e9015f319
  • [x] ~~Look into https://github.com/RPGillespie6/fastcov to have faster coverage builds~~ no longer needed, lcov 2.0 is fast enough
  • [x] ~~Investigate opting into __make_integer_seq compiler builtins instead of our own GenerateSequence, might have a significant effect on build times especially with long vectors, large matrices or constexpr constructors of the new MaterialData (#459) -- https://reviews.llvm.org/rL252036~~ converted to a O(log n) implementation in mosra/corrade@0b828147d692a472846dad16f730bc0d844886b7, which should be good enough

Executable size reduction / perf improvements (mainly WebAssembly-focused)

  • [x] Remove symbol visibility for static libraries -- mosra/corrade@7ae90319e42871efaa5c59395ee5d3e18e388e47
  • [ ] Get rid of stuff reported by -Wglobal-constructors -Wexit-time-destructors on Clang
    • [ ] GL::defaultFramebuffer,
    • [x] stuff in GL::Context -- b05c88737510cb9afd01c5c65ece2b1251abb7c2
    • [x] driver-specific workaround list -- 5f1fd752faab0dc35077bc2936b7f5eea4e3a579
      • [ ] sort it by hand to have fast lookup
  • [x] Put all separate GL function symbols into a big struct to reduce amount of exported symbols (basically the same way as Vulkan does it now) -- b580458104d5fbeaebb98ce8eff3323758c99957
  • [x] Investigate gains of -Oz with Emscripten, switch the toolchains to that, fix Emscripten closure compiler (https://github.com/mosra/magnum/issues/211) and investigate how much it can shave off the JS code (around 200 kB?) -- mentioned in the docs as of df6b414185c649d0adf58c81e4859197cd81fae2, not enabled by default since it has a compile time impact
  • [ ] ~~Some preprocessor hook for Utility::Resource that's able to~~ strip license headers off shader sources
    • needs some MinifierShaderConverter
  • [ ] Reduce repetitive strings in debug output literals for enums (print the prefix just once) -- in progress, 98232f383adc1ca1fdce877bc426951c23a89521
  • [ ] Try out Emscripten with minified imports/exports
  • [x] Usage of std::sort() and std::unordered_map for extensions during context creation is almost 20% of code size
    • [x] std::sort() replaced with a counting sort, saving ~8 kB -- e96996ea0169c6df860ff967ebb242da746825c4
    • [x] std::unordered_map replaced with hand-sorted compile-time array, saving ~8 kB -- 54c42dfb4dc37763d8f8c1dc7772268d6bcb5bd4, e2621fac3cf2c9b74c822b3a62d6306fcd7ae77d
  • [ ] Make it possible to compile for Emscripten with -s FILESYSTEM=0 (compile away parts of Utility::Directory, GL::Shader::addFile(), non-callback-based Trade::AbstractImporter::openFile() etc.)
  • [ ] There's one entry for every global wasm constructor (affects resources and static plugin imports), reduce that somehow. Related: https://github.com/kripken/emscripten/issues/6904#issuecomment-407910016
    • [x] combine static plugin import with resource import -- mosra/corrade@ea581d2cf68c80653ec7de1ae8c81e2ac027f359
    • [ ] maybe make resource loading / plugin import explicit on static builds?
  • [ ] Investigate using brotli instead of gzip for wasm compression (Qt says all wasm-enabled browsers support it, what about Apache?)
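
To illustrate the "hand-sorted compile-time array" idea from the list above, here is a minimal sketch of a lookup table that needs neither std::unordered_map nor `<algorithm>`. The extension names, the `ExampleExtension` struct and `exampleFindExtension()` are hypothetical and not taken from the Magnum sources; the only assumption carried over is that the table is kept sorted by hand so a binary search works.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical entry type: an extension name and its index
struct ExampleExtension {
    const char* name;
    int index;
};

// Kept sorted by hand in strcmp() order. Being constexpr, the table needs
// no global constructor and no heap allocation at startup.
constexpr ExampleExtension ExampleExtensions[] = {
    {"GL_EXAMPLE_alpha", 0},
    {"GL_EXAMPLE_beta", 1},
    {"GL_EXAMPLE_delta", 2},
    {"GL_EXAMPLE_gamma", 3}
};

// Binary search over the sorted table, returns nullptr if not found
const ExampleExtension* exampleFindExtension(const char* name) {
    std::size_t begin = 0;
    std::size_t end = sizeof(ExampleExtensions)/sizeof(ExampleExtensions[0]);
    while(begin < end) {
        const std::size_t middle = begin + (end - begin)/2;
        const int order = std::strcmp(name, ExampleExtensions[middle].name);
        if(order == 0) return &ExampleExtensions[middle];
        if(order < 0) end = middle;
        else begin = middle + 1;
    }
    return nullptr;
}
```

The trade-off is that the ordering becomes a manual invariant, so a test that walks the table and checks that neighbors are in order is probably worth having.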

Bigger tasks

  • [ ] Remove all usage of C++ iostreams (just using std::cout adds 250 kB to JS + WASM size) -- depends on string/stringview classes
    • [ ] Convert iostreams to C I/O + Utility::format() -- app using std::printf() has only 40 kB compared to that
      • [ ] the Utility::Debug class
      • [ ] Arguments help / usage formatting
      • [ ] Resource binary-to-hex conversion
      • [ ] std::stringstream debug redirection in many tests -- how else? append to a string? doesn't need to be performant
      • [x] file I/O in Utility::Directory -- mosra/corrade@c1a5eedc039a2d7479a8a947ecd2d6985524cb55
    • [ ] Reimplement printf-based Utility::format() without printf (float conversion with Ryū, integer conversion using "the fastest ever integer conversion" as claimed by the author of fmt) -- float32 tables in Ryū are 624 B and even the float64 tables are just 10 kB; with Utility::format(), the float64 tables won't even get compiled in if we don't print doubles
    • [ ] Remove all uses of printf() -- a naive copypasted implementation using grisu3 was just 25 kB, shaving > 10 kB compared to printf (and being much faster)
      • [ ] Ensure dependencies (plugins) that matter for WASM don't use it (would be hard to ensure for tinygltf, ugh)
      • [ ] It gets used by libc++'s abort_handler(), patch emscripten to not do that
    • [ ] This all needs a blog post (compare to competing implementations)
  • [x] Create a direct EmscriptenApplication instead of using Sdl2Application -- should trim down at least the generated *.js file size (library_sdl.js alone is 137 kB, though unminified) -- #300
  • [x] Port away from tiny_gltf and json.hpp (json.hpp alone is a 400 kB header and the recent versions are almost 700 kB) -- there's a dependency-less GltfImporter since mosra/magnum-plugins@b7c4c58405e6628a09d18f29837c92622db8ad33
    • ~~tiny_gltf might be going away from json.hpp on its own (https://github.com/syoyo/tinygltf/issues/141)~~
    • ~~just the TinyGltfImporter plugin compilation alone takes around 15 seconds -- for a single file -- which is more than all other plugins combined~~
    • ~~I bet it has some effect on WASM output size as well, just don't know how much~~
  • [x] Remove hard dependency on GL from the Text library by creating an abstract API-independent base for glyph cache -- 834c5fe40d01499755b8281c667a7402ca94583e
    • [x] That'll allow the plugins to be built and tested without needing to take care of GL/GLES/WebGL differences -- mosra/magnum-plugins@e6f879206e24ded0e5d5bdce6f122ad4f81ca546
  • [x] Make it possible to fully disable the debug output (and then define CORRADE_ASSERT to the C assert) -- needed for the single-header libs, done in mosra/corrade@64c56aa1196f8f49a1d967a7689720e0b594197a and cee530733ea43e480dbe782f2fb1358257710750
    • [x] Similarly for configuration value and tweakable literal parsers -- 64bc7f9c8e91414d7c917c1f70345bfe9ad07740 and 77a8c0c99b1ce003767159c26517ff99c3000101
  • [x] Provide STL-interfacing APIs only as an opt-in
    • [x] STL compat for pointers, references, optional (mosra/corrade@b9f52d413eec2ddbea6c7b0dd2bf34c5d1ca27b0, mosra/corrade@80ef819bb96a752eaa05030cbb6b6541c882dcf7 and mosra/corrade@69e6593f830c37d0247c0cd57ca5bfbcf0b9987b)
    • [x] STL compat for array views -- mosra/corrade@0a13f8dded7a0e3d502e02525c8de9575dd88eed
    • [x] #include <DebugStl.h> to be able to print STL types
    • [x] STL compat for string / string view -- mosra/corrade@72f652d22584534492a95743266e26a48bdb684d, mosra/corrade@cf0bd1f89f83aac142cdc5e1eaf265e72f7c5ed6

CI speedup

  • [x] A separate repository for all dependencies we currently build manually in every CI job (ANGLE, SDL for WinRT, Bullet, GLFW...) -- https://github.com/mosra/magnum-ci
    • [x] download the binaries from ci.magnum.graphics
    • [x] ~~some token authentication so it's not publicly accessible (just restricting to Travis/AppVeyor IPs is not enough, as we want to prevent mainly CI users from abusing the server)~~ turned out to not be a problem in practice
  • [x] Build the ES2/ES3 variants without code that's not API-dependent -- a lot of it still is though including e.g. SceneGraph doc snippets, so it doesn't make that much of a difference
    • [x] and remove ES2/ES3 jobs from plugins once all plugin interfaces are API-independent -- mosra/magnum-plugins@e6f879206e24ded0e5d5bdce6f122ad4f81ca546

Long-term

  • [x] Opt-in resizing APIs for Containers::Array so we can ditch std::vector as a growable storage, eliminating it from headers completely -- mosra/corrade@3cf41e3897d3558416d52e22dcd6c84f0d34e73c and following
    • [ ] compare with other 3rd party implementations (stb stretchy buffer, folly fbvector), benchmarking with various amounts of appended data
    • [ ] virtual-memory-backed impl. https://twitter.com/molecularmusing/status/1229868598140776453
    • [x] arrayRemoveUnordered() as well -- mosra/corrade@c9089f71b7c927add9cefe76d036d51acb4f534a
  • [x] A string view class so we can get rid of all the const char* / const char(&)[n] / const std::string& overloads everywhere, again eliminating std::string from headers completely -- mosra/corrade@72f652d22584534492a95743266e26a48bdb684d
    • [x] And a String class as well, with small string optimization (https://wg21.link/P1330, chapter 4) and easily convertible from/to the string view -- mosra/corrade@64f583602d1d18d8f5182e65fc565a6d964f4607
    • [ ] Convert Utility::String to use it -- in progress
    • [x] Make Utility::format() work with it -- mosra/corrade@ea9f21790ff88e47a4e0d830165315ecf15644e9
    • [x] move formatString() to FormatStl.h -- mosra/corrade@e941c843b3e2255f59e974d4c0b583c48b9238cb
    • Convert all other APIs to use it (introduces a lot of backwards-incompatible changes, so do it gradually)
      • [ ] Utility::Arguments, Utility::Configuration
      • [ ] Audio APIs
      • [ ] Text::Renderer, Text::AbstractFontConverter
    • [ ] Drop all compatibility StringStl.h includes once enough time passes
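
To show why a string view lets the overload sets mentioned above collapse into one, here is a deliberately tiny sketch. `ExampleStringView` and `exampleCountSpaces()` are hypothetical names for illustration only; the real Containers::StringView has a much larger interface and this is not its implementation.

```cpp
#include <cstddef>
#include <cstring>

// Minimal non-owning view on a character range. Needs only <cstddef> and
// <cstring>, no <string>.
class ExampleStringView {
    public:
        constexpr ExampleStringView(): _data{nullptr}, _size{0} {}
        constexpr ExampleStringView(const char* data, std::size_t size):
            _data{data}, _size{size} {}
        // Accepts both const char* and string literals (which decay to it)
        ExampleStringView(const char* data):
            _data{data}, _size{data ? std::strlen(data) : 0} {}

        constexpr const char* data() const { return _data; }
        constexpr std::size_t size() const { return _size; }

    private:
        const char* _data;
        std::size_t _size;
};

// A single function taking the view replaces separate const char*,
// const char(&)[n] and const std::string& overloads
std::size_t exampleCountSpaces(ExampleStringView text) {
    std::size_t count = 0;
    for(std::size_t i = 0; i != text.size(); ++i)
        if(text.data()[i] == ' ') ++count;
    return count;
}
```

A conversion from std::string would then live in a separate opt-in header, so only code that actually uses std::string pays for including `<string>`.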

Further work

  • [x] ~~Investigate compiling with a lighter-weight STL implementation (e.g. nanostl, EASTL?) -- most of them have no type_traits, we need type_traits~~ not happening, at this point it's easier to just ditch the remaining use of STL altogether and rely on compiler-specific builtins
  • [x] ~~Can C++20(?) modules help in any way with compile times? So far I didn't see any experiment that would prove a breakthrough in compile times~~ -- probably not
  • [x] ~~The Utility/Debug.h headers will be still quite heavy after forward-declaring strings and removing <iterator> and since these get used almost everywhere, what to do?~~ no it's not, it's fine since mosra/corrade@89da382496fb84c12fa4e2ece7bbea8f9721a191 (just <utility> and <type_traits>)

Further reading / references:

  • https://floooh.github.io/2016/08/27/asmjs-diet.html

mosra, Nov 02 '18 12:11

Progress!

  • Corrade/Utility/TypeTraits.h got simplified in mosra/corrade@9b258d768f5c17f5dd79a71d61d86187b18f223d
  • Utility::Directory is iostreams-free since mosra/corrade@c1a5eedc039a2d7479a8a947ecd2d6985524cb55
  • Containers::Pointer and Containers::Reference implemented in mosra/corrade@4e7195739acd43eb8b0bc12c229ff573f47cf355 and mosra/corrade@a874478917854cdfa7a4bcc996ba829c2c39ada2, STL compatibility for them and Optional in mosra/corrade@b9f52d413eec2ddbea6c7b0dd2bf34c5d1ca27b0, mosra/corrade@80ef819bb96a752eaa05030cbb6b6541c882dcf7 and mosra/corrade@69e6593f830c37d0247c0cd57ca5bfbcf0b9987b
  • STL compatibility for array(view) classes in mosra/corrade@0a13f8dded7a0e3d502e02525c8de9575dd88eed
  • https://github.com/mosra/magnum/issues/211 fixed on Emscripten side, waiting on it to become widespread enough before re-adding the flag to linker flags
  • Platform::EmscriptenApplication is in progress in #300
  • tiny_gltf might be moving away from json.hpp soon -- https://github.com/syoyo/tinygltf/issues/141
  • Text doesn't have a hard dependency on GL since 834c5fe40d01499755b8281c667a7402ca94583e, all plugins now build on CI completely GL-less since mosra/magnum-plugins@e6f879206e24ded0e5d5bdce6f122ad4f81ca546
  • Containers can opt-out of including Debug since mosra/corrade@64c56aa1196f8f49a1d967a7689720e0b594197a (used mainly for the magnum-singles repo)

mosra, Mar 07 '19 19:03

More progress:

  • forward declaration headers for STL containers
  • splitting away STL stuff from Utility::Debug and Utility::Format
  • PIMPLing away things
  • ...

See above for the actual commit references.

mosra, Apr 09 '19 20:04