CLI11 icon indicating copy to clipboard operation
CLI11 copied to clipboard

Plan for precompiled mode

Open henryiii opened this issue 6 years ago • 11 comments

Current idea:

  • All non-templated function/method bodies move to new files, such as option_fwd.inl.
  • inline gets replaced by CLI11_INLINE_DEF
  • If CLI11_COMPILE is not defined, CLI11_INLINE_DEF is inline and the .inl files get included at the end of option.hpp
  • If CLI11_COMPILE is defined, CLI11_INLINE_DEF is nothing. 0 or 1 selects whether this is the compiled file or a user.
  • The generation scripts need to be adjusted to handle this correctly; the .icc part will be wrapped in #ifs.

The CMake files will automatically compile each header with an icc file, and link to that. A user not using CMake can add a file like this:

#def CLI11_COMPILE 1
#include "CLI11.hpp"

Then use CLI11 like this:

#def CLI11_COMPILE 0
#include "CLI11.hpp"

And get the benefits of compilation. If nothing is done, nothing changes and everything is header only.


// option.hpp

#include "option_fwd.hpp"
#ifndef CLI11_COMPILE || CLI11_COMPILE==1
#include "option.inl"
#endif
// option_fwd.hpp

// Minimal required includes 
// [cli11:preinc:includes]
#include <fwd>
// [cli11:preinc:end]

// All files need this (has inline, etc)
#include "macros.hpp"

// Includes for other files in CLI11 (fwd only)
#include "app_fwd.hpp"

namespace CLI {
// [cli11:app_fwd:code]

class Option {
   // Definitions and templated functions only
};
CLI11_INLINE void f();

// [cli11:app_fwd:end]
}  // namespace CLI
// option.inl

// Currently not planning to include fwd here, since it is an inline file (depending on IDEs, possibly)

// Compile only includes
// [cli11:inc:includes]
#include <algorithm>
// [cli11:inc:end]

namespace CLI {
// [cli11:option_inl:code]
Option::Option() {} // Definitions
CLI11_INLINE void f() {} 
// [cli11:option_inl:end]
}  // namespace CLI
// option.cpp
#define CLI11_COMPILE 1 // Might as well force it here)
#include <option.hpp>

henryiii avatar Oct 27 '19 23:10 henryiii

We would need to make sure that the library worked well as a subproject in cmake as part of a bigger project, that requires a bit more care and flexibility when it has the option of a non-header only build.

Has anyone tried profiling the build to know which parts take the most time to compile?

phlptp avatar Oct 28 '19 03:10 phlptp

In CMake, it should be fairly easy; if you use the target we provide. And if you don't use CMake, it remains header only. We could also provide a flag or other method to keep the header-only CMake build.

When you add a completely trivial CLI11 program:

#include <TString.h>
#include "CLI11.hpp"

void makehist(TString input, TString folder);

int main(int argc, char** argv) {

    CLI::App app("make_histogram", "Make kernel from a hits file");
    app.option_defaults()->always_capture_default();

    TString prefix = "10pvs";
    app.add_option("prefix,--prefix", prefix);

    TString folder = "../../dat";
    app.add_option("folder,--folder", folder);

    CLI11_PARSE(app, argc, argv);

    makehist(prefix, folder);
}

It changes a <1 second build time to 10 seconds, for each file like this. If CLI11 had a compiled mode, you could add the 10 second addition just once. If there's a way to improve/fix this, that would also be great.

Distributing with compiled components (in Conan, etc.) would become harder, though they could just do the same build on demand CMake system.

henryiii avatar Oct 28 '19 14:10 henryiii

I am wondering if even in the single header mode if it would make sense to have a majority of the Validators split into a separate header. Based on some usage and issues, there is potential for a fairly useful library of Validators, but also a majority of uses/users probably don't use them, so I am wondering it if makes sense to not include all of them in the single header version but have a separate header available like Timer that includes a library of Validators.

phlptp avatar Nov 11 '19 13:11 phlptp

About ripping the lib into multiple ones ...

I feel like it may make sense to study the existing solutions, then sketch a set of unified interfaces meeting the needs of all of them. Then each lib can have a wrapper inside of it implementing the interface. Then, if one needs only basic functionality in an own program, then he can use only the header with the interface (the same for the different libs) and just link the needed library (or even do it in runtime), so swapping one impl with another would be easy.

KOLANICH avatar Jul 21 '21 05:07 KOLANICH

Single header-only mode is useful for quick prototyping but I'd really like an option to have a "regular" library - the one where the implementation can be compiled into a dynamic or static object and gets installed with a bunch of header files, and a CMake config file for easy integration. This is how I prefer to use libraries in my more mature projects for various reasons (compilation times being one of them).

jzakrzewski avatar Jul 21 '21 06:07 jzakrzewski

The plan looks sensible, except that CLI11_COMPILE should not be set by a user, instead it should be set by CMake (target_compile_definitions(CLI11 PUBLIC "-DCLI11_COMPILE=1")) and pkg-config.

KOLANICH avatar Jul 21 '21 07:07 KOLANICH

Yes, "user" in this case is usually the build system - our supported ones (CMake, png-config, Bazel, and Meson, IIRC?) would do it for you.

henryiii avatar Jul 21 '21 13:07 henryiii

Would this allow for building & installing the pre-compiled code as a shared/static library?

Our general use case is to install common third_party libs like this in a base Docker image so that we can take the compile time hit once, and simply link against the libraries for each of our services. While being able to install the headers is nice, we don't get any of the compilation time savings.

Also - this might be something I could help implement if I can find some time.

brian-finisher avatar May 10 '22 19:05 brian-finisher

This sounds like a solid and simple plan. I'm happy to contribute to make this happen.

dherrera-fb avatar Aug 08 '22 11:08 dherrera-fb

BTW, I have created a very limited CLI parsing lib called HydrArgs about a year ago. In fact it is an abstraction layer around other CLI args parsing libs, including CLI11. Currently it is not finished, because I have mostly implemented it to the extent needed in some of my projects. It defines some interface and some model of interaction between the libs actually implementing functionality and the user's app and allows to stack different parsers and/or change the parser used in one app by just choosing the right backend in runtime. I.e. if the app is linked to the runtime dispatcher, one can change using the right env args, if no args are supplied, it j8st chooses the hest available. And anyway it can be changed using patchelf and/or LD_PRELOAD.

KOLANICH avatar Aug 08 '22 13:08 KOLANICH

I've submitted a proof-of-concept PR to have an initial discussion of the implementation. I've completed the split in a separate branch (https://github.com/dherrera-fb/CLI11/tree/all-contributors/precompiled-all) and I can submit that as a PR once the POC looks good. The numbers below are based on the complete branch.

The precompiled version improves compile-time considerably but so far binary size suffers. Here are a few numbers from my local machine. I've tried with and without Link-Time-Optimization in hopes that it reduces binary size but it didn't make the new change better. Experiments ran with Apple Clang 12.0

Compile time

Compile time improves drastically in all cases.

Headers-only ninja 1075.74s user 27.64s system 1360% cpu 1:21.09 total ninja (lto) 1329.50s user 31.73s system 1365% cpu 1:39.72 total

Precompiled ninja 510.94s user 17.06s system 1055% cpu 50.029 total ninja (lto) 917.00s user 22.45s system 1310% cpu 1:11.67 total

Binary size

Binary size for all tests gets worse. I suspect it is because there are still many inline templates and they get included at least twice (in precompiled code and user code). One could still try to improve this by removing templates. image

dherrera-fb avatar Aug 09 '22 13:08 dherrera-fb

I've made all changes in https://github.com/CLIUtils/CLI11/pull/762.

dherrera-fb avatar Aug 15 '22 09:08 dherrera-fb

The diff would probably be better as a percentage - the binaries are not massively larger - 10% or so. If it's optional, it can be a user decided tradeoff.

henryiii avatar Aug 15 '22 13:08 henryiii

True. Here's the percentage version. image

I think it has merit as an optional feature, yeah. There should probably be a separate effort to reduce binary size (e.g. by reducing template usage). This still improves compilation time as intended.

dherrera-fb avatar Aug 15 '22 14:08 dherrera-fb