valkey icon indicating copy to clipboard operation
valkey copied to clipboard

[NEW] Add CMake support

Open yangbodong22011 opened this issue 1 year ago • 21 comments

We can support both Makefile and CMake compilation methods. This requirement is also mentioned in https://github.com/valkey-io/valkey/issues/17.

  • https://cmake.org/
  • https://code.visualstudio.com/docs/cpp/cmake-linux
  • https://www.jetbrains.com/help/clion/quick-cmake-tutorial.html

yangbodong22011 avatar Mar 29 '24 07:03 yangbodong22011

Full disclosure, I've never used CMAKE and I know make files really well. Can you summarize the main benefit for supporting both at this point?

madolson avatar Mar 29 '24 15:03 madolson

major benefits of CMake I usually rely on:

  • enforces out-of-source builds. the pattern is cd project; mkdir build; cd build; cmake ..; make so your object files and binaries always get built outside the source directory. For a fully clean "reset the world" build, you just rm -rf build and start over. No build artifacts or auto-generated output exists in your source directories anymore.
  • proper automated change detection of all sources to compile targets (many Makefiles get this wrong, miss updates, require hand written manual source to target overrides, etc)
  • CMake has variables scoped to both an entire top-level project and also each sub-component nested CMake project too. All variables inherit from the top level down, so you can set top-level variables (like CFLAGS or CC) and have it used by all other imported sub-projects automatically (but sub-directories can maintain their own additional flags too which don't get overridden).
  • Supports multiple "build types" with different settings, so you can easily flip between DEBUG builds having -O0 and sanitizers enabled versus RELEASE builds having -O2 -mcpu=native. The one build type setting also cascades down to all other imported components automatically for a consistent build experience. Nobody likes thinking they are running a debug build only to find half the sub-components didn't pick up the debug flags and never recompiled properly.
  • Better status/compile output
  • Guaranteed parallel compile support regardless of how many sub-components you import because every component shares the same global build graph and is attached to the root project (sub-components can't overwrite higher level components, but sub-components can see all parent settings/variables of the higher level components)
  • More logical output specifications. you never write compiler commands or name-pattern based build targets directly, instead you use declarative add_executable(mybinary main.c libname) with add_library(libname STATIC/SHARED/MODULE <sources>) and it attaches everything together properly for change detection and building object files into libraries into binaries and proper parallel build support because it knows how the internal source to target graph was intentionally requested.
  • Better tools support like automatically generating compile_commands.json which can be passed to refactoring/linter tools for more automated validation.

mattsta avatar Mar 29 '24 18:03 mattsta

+1 on introducing CMake. Time to catch up with the outside world a bit.

PingXie avatar Mar 29 '24 20:03 PingXie

Recent events addendum: today's xz intentional backdoor exploit was injected via obfuscated configure/Makefile directives which would be much more difficult to hide in a CMake-based system: https://www.openwall.com/lists/oss-security/2024/03/29/4

I'd make an argument the xz state-actor exploit wouldn't work in a modern CMake-based project. The legacy autoconf system is just thousands of lines of weird macro and shell script and makefile hacks from 1991 we can't shake free of for some reason.

autoconf: build your projects at the speed of a 300 baud modem!

For some reason ./configure is another sacred animal developers are afraid to evict from all projects.

(cmake also replaces the entire configure system with built-in feature detection too, which is a huge plus for removing configure everywhere)

mattsta avatar Mar 29 '24 20:03 mattsta

maybe not all people is strong to just use notepad to explore the project. cmake has much better IDE support. some people (yes, it is me) depend on IDE to explore /learn/contribute to the project

wuhanck avatar Mar 30 '24 11:03 wuhanck

I don't like CMake. It adds some layers of indirection and it's hiding a lot of what's happening behind the scenes. I find it very hard to control and debug compilation flags and such things. (And we're not even talking about ctest, right? it hides completely everything of what's happening.)

I don't know if Meson is better, but it seems to be suggested as another modern alternative.

Agree that we don't want autoconf.

Personally I love Make, because it doesn't hide what's happening and I find the syntax nice and simple. Our builds are very simple so I see no reason right now to add an extra layer of abstraction to it.

~~Changing this would also break things for users, i.e. require them to change their tooling when building and installing stuff.~~

[Edit] Supporting make and CMake side by side is fine could be fine, but requires an effort to keep both of them updated and behaving the same way, i.e. not do any implicit compile flags like changing -std=c99 to -std=gnu99 (which IIRC, CMake does implicitly if you're not careful).

zuiderkwast avatar Mar 30 '24 15:03 zuiderkwast

I agree with @zuiderkwast - simple makefiles have worked fine for years. Also, autoconf/configure isn't used on this project, so it's unhelpful to compare cmake to it.

natoscott avatar Mar 31 '24 05:03 natoscott

While the xz backdoor was quite serious, incidents like that is rather rare.

On the other hand, TEABSB (Time Exhaustion Attack via Build System Bikeshedding) is a much more common and frequent threat faced by many open-source C projects in my experience. A firm and confident "No" has been proven effective in multiple studies against such attacks.

N-R-K avatar Mar 31 '24 08:03 N-R-K

make works if we are building for a specific platform(s) like Linux. the moment we start talking about different platforms, I see a few options only: 1) either accept the intersection of all platform feature sets, 2) or manually roll the platform specific detection logic; 3) or use some automation tools and cmake is a very fine solution.

Between improvised cross-platform support (which already exists in our makefile) and well-established systems, the answer is pretty clear to me.

PingXie avatar Apr 11 '24 15:04 PingXie

One disadvantage of cmake not listed, is that it adds a build dependency and another hurdle for people getting involved - which was something avoided in the project historically and part of its success I'm sure.

"manually roll platform specific logic" is a trivial thing, certainly not requiring cmake levels of complexity to solve, and there are so few dependencies for this project that switching from make to cmake will be a step backwards.

natoscott avatar Apr 11 '24 23:04 natoscott

Can you elaborate the dependency part? Is your concern about CMake not being pre-packaged as commonly as the compiler tool chain or some platforms missing the CMake support completely? Or this is something else?

I don't have a good understanding of your argument here without concrete user journey. The way I imagine a user would need to use CMake is

  1. install CMake (if it is not already bundled)
  2. run CMake to build the code

Do you consider "sudo apt-get install" or "sudo brew" an adoption barrier? Or the build process itself?

The reason why I think we are repeating the work already solved by CMake is illustrated by #295. There will likely be more cases like this going forward when different capabilities are introduced into different OS distributions. This would make the makefile quite messy IMO.

PingXie avatar Apr 12 '24 01:04 PingXie

IIUC, another pushback against CMake on this thread is the "indirection" or "abstraction" it introduces. I can relate it well to the discussion around C vs C++ and I agree that introducing C++ into the server core codebase is a non-starter at this point. I can see that we all like to imagine how the assembly code would look like in our head before we even build our source code :). That said, I am not sure if we should hold the build system at this level. As far as the build tool chain is concerned, what I am looking for is "ease-of-use" and "good enough performance". I don't know if we are saying CMake doesn't have "predictability" or not?

I have no idea about the security aspect around CMake though. But if there is inherently higher security risk with CMake than Make, sure, I will go with "makefile" without a blink of eyes.

PingXie avatar Apr 12 '24 01:04 PingXie

| Can you elaborate the dependency part?

For me, a dependency is another tool or library that is required in order to build or run. Every time another one is added, the level of complexity increases - its just an unavoidable downside. New dependencies should bring significant value (like TLS) for that added complexity. It's not an issue specifically relating to packaging of the dependency, I'm well aware cmake is well entrenched and readily available.

Its just another thing people must learn and another thing that can go wrong. cmake isn't a panacea, and of course it generates makefiles on most platforms, so you can look at it in a flippant way too: "now you have two problems". ;-)

The problem you referenced can be tackled by adding a make macro like HAVE_EPOLL2 and conditional code. We (collectively) have a tendency to over-engineer solutions and traditionally Salvatore was quite strong on this (keeping dependencies to a minimum) - much to Redis's benefit.

natoscott avatar Apr 12 '24 04:04 natoscott

adding a make macro like HAVE_EPOLL2 and conditional code.

ah, but this is what the meta-build system solves. To work automatically, it checks system features at build time, you must basically run a program to see if the symbol is available.

cmake has a built-in thing to do exactly this: try to compile a tiny program to see if symbols are available, and if it works, you set a define you can pass through to your compiler for code to see: https://cmake.org/cmake/help/latest/module/CheckCSourceCompiles.html

Every time another one is added, the level of complexity increases - its just an unavoidable downside.

true, but taken too far you end up in djb land where you don't trust your compiler because you don't trust its register allocator graph traversal implementation so you ship your own assembler with everything you write just to remove dependencies.

Its just another thing people must learn and another thing that can go wrong.

at some point we have to assume a minimum skillset on industry standard tools or else we can't advance. it took 25 years for people to advance past C89 but we can do better.

mattsta avatar Apr 12 '24 04:04 mattsta

@mattsta as stated, I am well aware what cmake can and cannot do.

My vote is to respect the original stated ideology of simplicity from Salvatore in this case, especially since few if any benefits to this (very mature) project have been raised here. There was not a strong enough case for autoconf previously, and there is not a strong case for cmake/meson/scons/bazel now.

natoscott avatar Apr 12 '24 05:04 natoscott

FWLIW, I see the meson folk think cmake scripting is "cumbersome" and "more complicated than necessary". https://mesonbuild.com/Comparisons.html

natoscott avatar Apr 12 '24 06:04 natoscott

i think it falls into the old quote of:

“There are only two kinds of languages: the ones people complain about and the ones nobody uses.”

hand written makefiles are more like hand written assembly. we use compilers for a reason (C has traditionally been referred to as a high level language because low level means hand writting assembly!), and cmake is like a makefile compiler and Make is like a build assembler.

fun trick: edit some hand written makefile and 99% of the time, there's not a rule detecting rule changes for if the Makefile itself changes (just tested this by changing -O3 to -O2 in the current Makefile and it doesn't pick up the change), so you don't realize you're not running your latest build command updates anyway without excessive .deps and .flagsdeps and .reallysecretparameterflagsdeps meta-checks everywhere (which cmake handles automatically due to it being a meta-build-system-like-thing they have been improving for 25 years for this single purpose).

plus: the benefit of much cleaner automatic out-of-source builds can't be understated. Mixing in built artifacts directly next to source files in the same directory is weird and wrong.

mattsta avatar Apr 12 '24 15:04 mattsta

My vote is to respect the original stated ideology of simplicity from Salvatore in this case, especially since few if any benefits to this (very mature) project have been raised here. There was not a strong enough case for autoconf previously, and there is not a strong case for cmake/meson/scons/bazel now.

I disagree. "respecting the original stated ideology" is neither meaningful nor actionable. Taken too far, we are essentially saying the current system is perfect, and in this sense, I consider it a dangerous guidance. Also no one is saying CMake (or C, C++, Rust, etc) is a silver bullet either. There is no point in a blind statement of "A is better than B", unless we agree on the problem to be solved first. That said, I fully agree there is a need for ROI analysis for any decision we debate about.

In this particular case, there is clearly a problem at hand, which is us hand-rolling makefile and re-inventing the wheels and this will continue. So the question for all of us is whether we continue to hand roll cross-platform support in makefile or use CMake (or whatever other meta-build systems). I know chain-saws are dangerous tools but it doesn't mean we will ignore all the safety guidance when using them.

PS, along the same line, I would like to introduce clang-format and have it automatically re-format the source code so we can all stop wasting time on nitpicking trailing spaces, etc.

PingXie avatar Apr 13 '24 20:04 PingXie

FWLIW, I see the meson folk think cmake scripting is "cumbersome" and "more complicated than necessary". https://mesonbuild.com/Comparisons.html

I don't know exactly how to interpret the claim on this page. All we need is the cross platform build support. I am not suggesting we use all the features provided by CMake, same as we don't use all the language features. There is a best practice for practically everything.

image

PingXie avatar Apr 13 '24 20:04 PingXie

PS, along the same line, I would like to introduce clang-format and have it automatically re-format the source code so we can all stop wasting time on nitpicking trailing spaces, etc.

+100 on that one. You can configure github actions to enforce the format for new PRs too (it can either automatically add a new commit with format fixes or reject a PR if the formatter has doesn't pass cleanly).

If it helps, the format I've used for years is:

clang-format -style="{BasedOnStyle: llvm, IndentWidth: 4, AllowShortFunctionsOnASingleLine: None, KeepEmptyLinesAtTheStartOfBlocks: false, InsertBraces: true; InsertNewlineAtEOF: true}"

Some of those settings used to only be in clang-tidy so you'd need two tools for good results, but clang maintainers eventually added the most common features to clang-format directly for much easier usage.

The other nice thing about clang-format is it will auto-wrap comments and auto-format macros with full width \ end of line continuation markers so everything is much easier to read.

mattsta avatar Apr 13 '24 21:04 mattsta

| "respecting the original stated ideology" is neither meaningful nor actionable.

I disagree, but that's OK. It's actionable right here and now - don't introduce unnecessary dependencies.

FWIW the ideology is laid out in valkey/MANIFESTO:

6 - We're against complexity. We believe designing systems is a fight against complexity.

| All we need is the cross platform build support.

There's no need to add new toolchain dependencies to solve this problem and it adds significant, unnecessary complexity.

Many projects take the far simpler approach of asserting 'gmake' is the only make program supported. Every platform has gmake available, and it has a feature set that gives everything needed in a project of the scale of valkey.

natoscott avatar Apr 14 '24 23:04 natoscott

For now we decided to add CMake + Make support for the time being. This means that folks that prefer CMake have it available, but it's not required to be used: https://github.com/valkey-io/valkey/pull/1196.

madolson avatar Jan 10 '25 23:01 madolson