valkey icon indicating copy to clipboard operation
valkey copied to clipboard

Add CMake build system for valkey

Open eifrah-aws opened this issue 1 year ago • 22 comments

NOTE:

This PR was tested on Amazon Linux, Ubuntu 22.04 & macOS Sonoma 14.7 (arm64)


With this PR, users are able to build valkey using CMake.

Example usage:

Build valkey-server in Release mode with TLS enabled and using jemalloc as the allocator:

mkdir build-release
cd $_
cmake .. -DCMAKE_BUILD_TYPE=Release \
         -DCMAKE_INSTALL_PREFIX=/tmp/valkey-install \
         -DWITH_MALLOC=jemalloc -DWITH_TLS=1
make -j$(nproc) install

# start valkey
/tmp/valkey-install/bin/valkey-server

Build valkey-unit-tests:

mkdir build-release-ut
cd $_
cmake .. -DCMAKE_BUILD_TYPE=Release \
         -DWITH_MALLOC=jemalloc -DBUILD_TESTS=1
make -j$(nproc)

# Run the tests
./bin/valkey-unit-tests 

Current features supported by this PR:

  • Building against different allocators: (jemalloc, tcmalloc, tcmalloc_minimal and libc), e.g. to enable jemalloc pass -DWITH_MALLOC=jemalloc to cmake
  • OpenSSL builds (to enable TLS, pass -DWITH_TLS=1 to cmake)
  • Sanitizier: pass -DWITH_SANITIZER=<address|thread|undefined> to cmake
  • Install target + redis symbolic links
  • Build valkey-unit-tests executable
  • Standard CMake variables are supported. e.g. to install valkey under /home/you/root pass -DCMAKE_INSTALL_PREFIX=/home/you/root

Why using CMake? To list some of the advantages of using CMake:

  • Superior IDE integrations: cmake generates the file compile_commands.json which is required by clangd to get a compiler accuracy code completion (in other words: your VScode will thank you)
  • Out of the source build tree: with the current build system, object files are created all over the place polluting the build source tree, the best practice is to build the project on a separate folder
  • Multiple build types co-existing: with the current build system, it is often hard to have multiple build configurations. With cmake you can do it easily:
  • It is the de-facto standard for C/C++ project these days

More build examples:

ASAN build:

mkdir build-asan
cd $_
cmake .. -DWITH_SANITIZER=address -DWITH_MALLOC=libc
make -j$(nproc)

ASAN with jemalloc:

mkdir build-asan-jemalloc
cd $_
cmake .. -DWITH_SANITIZER=address -DWITH_MALLOC=jemalloc 
make -j$(nproc)

As seen by the previous examples, any combination is allowed and co-exist on the same source tree.

Current CMake PR status:

  • [ ] Targets:
    • [x] valkey-server
    • [x] valkey-cli
    • [x] valkey-benchmark
    • [x] valkey-sentinel
    • [x] rdma module
    • [x] valkey-unit-tests
    • [x] valkey-check-aof
    • [x] valkey-check-rdb
  • [ ] Features
    • [x] SSL
    • [x] Sanitizer
    • [x] Different allocators
    • [ ] .deb and .rpm packages
  • [ ] Python code generation
    • [x] generate release.h
    • [x] generate commands.def
    • [ ] generate generate-fmtargs.py
    • [ ] generate test_files.h

eifrah-aws avatar Sep 28 '24 06:09 eifrah-aws

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 70.69%. Comparing base (2743b7e) to head (f2be60d).

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1082      +/-   ##
============================================
- Coverage     70.69%   70.69%   -0.01%     
============================================
  Files           114      114              
  Lines         63076    63076              
============================================
- Hits          44594    44589       -5     
- Misses        18482    18487       +5     
Files with missing lines Coverage Δ
src/debug.c 53.05% <ø> (ø)
src/server.c 88.74% <ø> (ø)

... and 16 files with indirect coverage changes

codecov[bot] avatar Sep 28 '24 06:09 codecov[bot]

Thanks @eifrah-aws! I like this change but I think we need a @valkey-io/core-team discussion first.

PingXie avatar Sep 29 '24 01:09 PingXie

@PingXie, thanks for looking into this. In the meanwhile, I will continue adding the missing functionalities that I want to add to CMake (e.g. packaging + RDMA).

Another approach is to let both build system co-exist, until a point in time where you (and the core team) feels that its matured enough and a switch can be made (i.e. deprecate the old Makefile build system)

eifrah-aws avatar Sep 29 '24 05:09 eifrah-aws

This was previously discussed here: https://github.com/valkey-io/valkey/issues/79. The original decision was very mixed which was why I don't think we originally moved forward with it. Reading it through again, I didn't find the arguments very compelling given that our system is working well but don't feel strongly.

madolson avatar Sep 29 '24 09:09 madolson

Another approach is to let both build system co-exist, until a point in time where you (and the core team) feels that its matured enough and a switch can be made (i.e. deprecate the old Makefile build system)

I don't like this approach, it just means we have to pay all the cost of both system without getting the benefits of CMAKE. We should either switch or stick with MAKE. If we find something that doesn't work well, we can always just revert the migration to CMAKE. We do have the compatibility story to deal with, since it will be adding a new dependency that users will need to have available. I can't imagine in this day it will be much of a problem.

madolson avatar Sep 29 '24 09:09 madolson

I don't like this approach, it just means we have to pay all the cost of both system without getting the benefits of CMAKE. We should either switch or stick with MAKE. If we find something that doesn't work well, we can always just revert the migration to CMAKE. We do have the compatibility story to deal with, since it will be adding a new dependency that users will need to have available. I can't imagine in this day it will be much of a problem.

The way I see it: the end goal is to switch to CMake. I suggested this approach as an intermediate step. Ultimately, I would like to see the removal of the Makefile from the code base and exclusively using CMake.

This was previously discussed here: https://github.com/valkey-io/valkey/issues/79. The original decision was very mixed which was why I don't think we originally moved forward with it. Reading it through again, I didn't find the arguments very compelling given that our system is working well but don't feel strongly.

I was not aware of this discussion - thanks for bringing it up. Reading the OP arguments - they are basically the same as mine: parallel builds & out of source tree builds

In addition, tooling is important any developer that worked with VScode know the frustration of not having the IDE code completing when typing (or find references etc)

CMake has a built-in support for clangd in the form of compile_commands.json (the CMakeLists.txt I created generates this file automatically). With this file, clangd is able to work flawlessly

eifrah-aws avatar Sep 29 '24 10:09 eifrah-aws

The current build system works but doesn't work well. We added band-aids like https://github.com/valkey-io/valkey/blob/bb57dfe6303ae53259cf7806c696cf56c637b4e7/src/Makefile#L55 to deal with cross-platform use cases, which is error-prone and hurts readability IMO.

The one argument against CMake that is worth some discussion IMO is "dependency". I believe @zuiderkwast brought it up in the past that some organizations might not have CMake in their tool chains so this switch could be considered disruptive, even temporarily? I haven't seen a concrete case personally though.

I also don't think the value of supporting two build systems outweighs the (complexity) overhead. If we decide to switch, we should fully commit to it.

PingXie avatar Sep 29 '24 21:09 PingXie

I haven't seen a concrete case personally though.

This is probably a situation where we will have to just make the change and see what breaks. We didn't know that folks didn't have support for C11 until we switched over and people brought up that it was broken. I added the run extra test flag so we'll at least run it on all the different containers we think about.

madolson avatar Sep 30 '24 05:09 madolson

to deal with cross-platform use cases, which is error-prone and hurts readability IMO.

The current system does work though, so ultimately it's a style preference, which I don't find compelling to make a change that requires work from packagers and will break existing developers.

I also don't think the value of supporting two build systems outweighs the (complexity) overhead. If we decide to switch, we should fully commit to it.

I really feel unsure on this. We would have to make this for 9.0, and do we really want to force packagers to do work for it? I like the out of tree builds and better integration (this seems like the biggest win imo). If the goal is better integration, I would live with supporting both. So, I guess let's finish creating a fully featured CMAKE file and see how much effort it is to maintain?

madolson avatar Sep 30 '24 05:09 madolson

I guess let's finish creating a fully featured CMAKE file and see how much effort it is to maintain?

If this PR is accepted, I am willing to maintain it

eifrah-aws avatar Sep 30 '24 08:09 eifrah-aws

I guess let's finish creating a fully featured CMAKE file and see how much effort it is to maintain?

If this PR is accepted, I am willing to maintain it

That's not how it works. Whenever someone changes something like a compiler flag or some variable in a PR, they have to update the Makefile and CMakeLists.txt in that same PR.

Are you suggesting we drop the Makefile or should we support two build systems in parallel?

Just like I don't like to switch to C++ for the source code, I don't want to switch to CMake for the builds. It adds complexity and hides what's actually happening, which is something that Make doesn't, besides multiple of us maintainers have expressed that we know Make very well and are not familiar with CMake. I believe it is important that the maintainers are comfortable with the tools we're using. I was very happy that the discussion in #79 ended in April, so we could focus on developing instead of debating build systems. Now we have to spend time on that again.

Multiple build types co-existing: with the current build system, it is often hard to have multiple build configurations. With cmake you can do it easily

This is the only one of the "pros" I think is compelling. (Auto-completion is an anti-feature IMHO. If you think you need it, it means that the source code is too complex.)

Some more cons already mentioned in #79 are complexity and implicit behaviour. Risk that building with make and cmake don't give the same results (due to implicit changes of compilation flags like -std=c11 to -std=gnu11 that cmake does). The way we optionally regenerate some files like commands.def that are checked into the repo is easy to achieve with make. I don't think we would have bothered to implement this if we'd be on CMake.

zuiderkwast avatar Sep 30 '24 10:09 zuiderkwast

Are you suggesting we drop the Makefile or should we support two build systems in parallel?

Eventually, I would like replace Make with CMake

It adds complexity and hides what's actually happening, which is something that Make doesn't

This statement it subjective. I argue that CMake simplify things

Now we have to spend time on that again

Sorry about that

due to implicit changes of compilation flags like -std=c11 to -std=gnu11 that cmake does

Here is the output from valkey macOS build using CMake (notice the -std=c11):

[ 75%] Building C object src/CMakeFiles/valkey-server.dir/server.c.o
cd /Users/eifrah/devl/valkey/build-release/src && /usr/bin/gcc  -I/Users/eifrah/devl/valkey/deps/hiredis -I/Users/eifrah/devl/valkey/deps/linenoise -I/Users/eifrah/devl/valkey/deps/lua/src -I/Users/eifrah/devl/valkey/deps/hdr_histogram -I/Users/eifrah/devl/valkey/deps/fpconv -pedantic -O3 -DNDEBUG -std=c11 -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk -mmacosx-version-min=14.7 -MD -MT src/CMakeFiles/valkey-server.dir/server.c.o -MF CMakeFiles/valkey-server.dir/server.c.o.d -o CMakeFiles/valkey-server.dir/server.c.o -c /Users/eifrah/devl/valkey/src/server.c

Auto-completion is an anti-feature IMHO. If you think you need it, it means that the source code is too complex.

Auto-Completion is not only "display possible completions", it is a complete tool set that increases developer productivity and allows him/her to focus on what is important: coding.

A good example is: refactor symbol: sure, I can do that by doing grep and then manually replace what is correct (based on the context), or I can "right click -> rename" and click OK...

eifrah-aws avatar Sep 30 '24 11:09 eifrah-aws

It adds complexity and hides what's actually happening, which is something that Make doesn't

This statement it subjective. I argue that CMake simplify things

Complexity and simplicity are not subjective terms. Complexity can be analyzed. So can explicit vs implicit. Easy and difficult are subject terms. Easy is what you are familiar with and already know, while simple is the lack of complexities such as logical dependencies between things. (There's a really nice talk about this by Rich Hickey, the inventor of Clojure: https://www.youtube.com/watch?v=SxdOUGdseq4)

Make has very few pre-defined rules (like .o depending on .c) while CMake has a lot of such pre-defined rules. The reason things often just work in CMake is that these things are pre-defined (which is good when it does what you want and not good when it doesn't), whereas with make you have to write all the rules explicitly.

CMake generates makefiles and other kinds of files. Then you run these makefiles using make, or ninja files using ninja, etc. There's always a risk that there are differences in these different targets, since the tools are different in various ways. (The generated makefiles are perhaps not for human consumption, but to find out what cmake does, I have tried reading them in the past without much luck. :D).

due to implicit changes of compilation flags like -std=c11 to -std=gnu11 that cmake does

Here is the output from valkey macOS build using CMake (notice the -std=c11):

Yes, I see that. I believe it does change this by default though and fortunately you knew that we have to add set(CMAKE_C_EXTENSIONS OFF) to disable it.

I do think CMake is very capable of a lot of things. I just don't see the need for it in this project. Changing things imply a friction both to users and developers. It would be more justified to add CMake if we would have either of these "problems":

  1. Need to build on windows
  2. Many external dependencies

zuiderkwast avatar Sep 30 '24 13:09 zuiderkwast

One thing to keep in mind is that CMake is actively developed and new features are coming every now and then. In general this is good, but its a bit annoying that when you update the CMakeLists.txt that you need to make sure that the change wont break minimum required version. I don't think CMake has warnings about usages of newer APIs that don't exists in the min.req.version. You either need to double check the manual; keep building using the old CMake version; have a CI for this; or just making sure the minimum required version is pretty recent.

There are feature in CMake that I have liked in other projects though, like out-of-source builds and being able to generate a ninja build which are often a bit faster, but its has a maintenance cost.

bjosv avatar Sep 30 '24 13:09 bjosv

you need to make sure that the change wont break minimum required version

@bjosv I believe that this is addressable by using cmake_minimum_required(VERSION <min>..<max>)

eifrah-aws avatar Sep 30 '24 13:09 eifrah-aws

Let us consider the following criteria, if CMAKE over MAKE more advantages, we can commit it:

  1. CMAKE maintain easily?
  2. CMAKE compile faster?
  3. the bin executed file by CMAKE is smaller?
  4. CMAKE is cross-platform?
  5. anything else?

hwware avatar Oct 02 '24 10:10 hwware

Let us consider the following criteria, if CMAKE over MAKE more advantages, we can commit it:

  1. CMAKE maintain easily?
  2. CMAKE compile faster?
  3. the bin executed file by CMAKE is smaller?
  4. CMAKE is cross-platform?
  5. anything else?

According to my experience:

  1. NO
  2. I think NO (because it will produce an intermediate file)
  3. I do not know
  4. YES

hwware avatar Oct 02 '24 10:10 hwware

Let us consider the following criteria, if CMAKE over MAKE more advantages, we can commit it:

    CMAKE maintain easily?
    CMAKE compile faster?
    the bin executed file by CMAKE is smaller?
    CMAKE is cross-platform?
    anything else?

According to my experience:

NO
I think NO (because it will produce an intermediate file)
I do not know
YES
  1. Depends on your CMake knowledge vs Makefile

For example, if I had a to write a simple Makefile for a simple "Hello World" application, it would look like this:

hello_world: hello.o 
    g++ -o hello_world hello.o
    
hello.o: hello.cpp
    g++ -c hello.cpp -o hello.o
    
clean:
    rm *.o hello_world

vs CMakeLists.txt file:

add_executable(hello_world "hello.cpp")

It only gets complicated. Imagine that you want to add openssl, then you need to know other tools like pkg-config (assuming *nix only makefile), or having both debug and release build configurations. Some of the above are already provided by that one-liner of CMakeLists.txt file that I pasted above. Adding OpenSSL (which supports CMake..), I only need to add this:

find_package(OpenSSL REQUIRED)
include_directories(${OPENSSL_INCLUDE_DIR})
target_link_libraries(hello_world OpenSSL::SSL)

And this will work for Windows (MSVC), Linux, SunOS, FreeBSD, MinGW and other platforms

so, IMHO, CMake is easier to maintain and understand

  1. cmake does not compile, so it can't really be compared to make. cmake is a "Makefile generator", so it makes more sense to compare it to configure. But to answer your question, I can use cmake to generate ninja files - which is known to be faster than make, so the answer here is "YES" cmake is faster
  2. This depends on the flags you are using, if I compile the Debug target, than the executable will be large, if I use Release with -s and other optimizations, it will generate smaller executable. The answer here: cmake will generate makefile that generates executables similar to what you can achieve using a hand written makefile
  3. Yes, CMake is cross platform. i.e. it can generate Makefiles, Visual Studio solutions, Xcode project files, Ninja files and other tools, NMake makefiles and other goodies

anything else?

What I mentioned earlier:

  • Out of the source build - this one, IMO, is big plus on why people should use CMake
  • De-facto standard for C++/C build systems, even in this repo, someone made a choice to use CMake for external users who are using hiredis
  • Multiple build configurations can co-exists. Super useful when profiling code or comparing changes
  • Better tooling integration (YouCompleteMe for VIM lovers, VScode and other tools)

eifrah-aws avatar Oct 02 '24 14:10 eifrah-aws

add_executable(hello_world "hello.cpp")

With this CMake line, I have to go read the documentation to understand add_executable call. When I have to modify something, I need to figure out how to modify "add_executable". And if "add_executable" does not support it then? Is this really better?

parthpatel avatar Oct 03 '24 00:10 parthpatel

With this CMake line, I have to go read the documentation to understand add_executable call.

CMake has an excellent documentation + great search result in Stackoverflow

And if "add_executable" does not support it then?

Anything that can be done with Makefile can be done with CMake (you can always fallback to custom rules that basically can execute a shell line like Makefile does)

But don't trust my word on that, many complex projects are using CMake without issues for years now.

To list some from the top of my head:

Obviously, the list is too extensive to cover. But my point is: if all of these projects are using CMake without an issue, we can do it too.

Is this really better?

IMO, yes.

BTW: you can always run the generated Makefile with make VERBOSE=1 to see the actual build line (similar to V=1 in valkey's Makefiles)

eifrah-aws avatar Oct 03 '24 06:10 eifrah-aws

add_executable(hello_world "hello.cpp")

With this CMake line, I have to go read the documentation to understand add_executable call. When I have to modify something, I need to figure out how to modify "add_executable". And if "add_executable" does not support it then? Is this really better?

A/ The last person that actually added an executable was probably Salvatore adding redis-benchmark. B/ Compared to some of Makefile more obscure nuances add_executable seems pretty self explanatory even without reading the documentation C/ In the unlikely future event that you will want to add an executable, you will have to look on examples and read documentation (and now days just ask chatGPT) in order to do it right, just like you would for Makefile.

tldr: of course there is some learning curve, and having experienced both I think CMake learning curve is non existent to a person using tools like VS Code, very small to a command line user following a "how to", and still pretty small to anyone looking to maintain and improve it. i.e. I trust everyone in this forum to handle it in 5 min even if they never seen a CMake in their life...

But on the other side you get all the advantages listed.

Here is an interesting graph by Google trends image

CMake must be doing something right ...

eliblight avatar Oct 03 '24 07:10 eliblight

The one argument against CMake that is worth some discussion IMO is "dependency". I believe @zuiderkwast brought it up in the past that some organizations might not have CMake in their tool chains so this switch could be considered disruptive, even temporarily? I haven't seen a concrete case personally though.

We had a conversation around the same aspect internally and it's not just about organizations IMHO, it's also about a new developer trying to hack on valkey. We don't want to disrupt that experience and facing challenges around setting up Cmake (which Cmake) in the first place. This is where the project shined for me personally where you can just download the source and build it. No dependencies altogether. On that line, @madolson / @ranshid mentioned we could check in the generated Makefile to the source and still provide the same experience to devs unfamiliar with the build tooling.

hpatro avatar Oct 09 '24 21:10 hpatro

On that line, @madolson / @ranshid mentioned we could check in the generated Makefile to the source and still provide the same experience to devs unfamiliar with the build tooling.

@hpatro not possible.

  1. The generated Makefiles are generated for the host target (for example, you might see flags like -march=armv8.1-a or full path). In this regard, cmake is very much similar to configure - you are not expected to save these makefiles
  2. The generated Makefiles are "complex" and include other dependency files. For example, if you run the build on macOS, you will have somewhere a file name compiler_depends.make with list of files that are probably only exist on your machine

eifrah-aws avatar Oct 13 '24 15:10 eifrah-aws

@eifrah-aws Thanks for clarifying. 👍

hpatro avatar Oct 13 '24 19:10 hpatro

Decision is that Eran is going to continue finishing the the porting. For some period of time we will maintain both Make and CMake. At some point in the future we will pick one of the two to maintain long term.

madolson avatar Oct 14 '24 15:10 madolson

@eifrah-aws Can you create an issue for 8.1 to make a decision if we need to mark CMake support as experimental for 8.1. We will make it experimental if we aren't convinced we will support it long term. If we intend to support it long term, we don't need to mark it as experimental.

madolson avatar Oct 14 '24 15:10 madolson

@madolson @PingXie @zuiderkwast the PR is ready for review. The content of the PR is specified in the first PR comment (I kept it up-to-date with the progress)

One thing that is still worth discussing is having a CMake dedicated CI. Ultimately, I would like to see a build at least on Ubuntu with multiple configurations, for example:

  • release + jemalloc
  • release + libc
  • release + libc + SSL
  • release + jemalloc + SSL

eifrah-aws avatar Oct 18 '24 15:10 eifrah-aws

@PingXie Can you take a look first since your more familiar with CMake and might be able to cover stuff more quickly than I can.

madolson avatar Oct 18 '24 22:10 madolson

This PR LGTM overall. @eifrah-aws, please feel free to tag me when you reopen this PR and I will provide the detailed feedback.

PingXie avatar Oct 20 '24 20:10 PingXie

This PR LGTM overall. @eifrah-aws, please feel free to tag me when you reopen this PR and I will provide the detailed feedback.

@PingXie I made the unfortunate mistake of clicking the "Sync Fork" button on my repo which caused a mess in the commit hisotry, so I had to re-create the branch. The unfortunate side effect was that the PR got closed... 😄

Can you re-open it? or should I recreate it (and reference this one to retain the discussion history?)

eifrah-aws avatar Oct 20 '24 20:10 eifrah-aws