rules_foreign_cc

Parallel build support

Open mattgodbolt opened this issue 6 years ago • 27 comments

This might be a big ask; and may be a bazel core project issue; but when consuming larger projects that need to be configure/make'd (e.g. OpenSSL, postgres, hdf5...) it would be great to be able to use parallel builds.

Of course, pragmatically one might override make_commands to run make -j12... but that has disastrous consequences alongside other parallel builds (not least on machines with fewer than 12 cores...)

In the land of gumdrops and unicorns, one could imagine bazel being able to trick out make by passing it the same magic that $(MAKE) or +make does in a real makefile, along with a -j set from the --jobs setting. That way, the make process will play nicely with the bazel build.

I realise this is a big ask: but if one doesn't ask, then maybe it won't happen! Any ideas on improving this in general also welcomed!

Thanks all!

mattgodbolt avatar Oct 18 '19 20:10 mattgodbolt

In case I was being incoherent, the POSIX jobserver for make is described in more detail here: https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html#POSIX-Jobserver
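
As a rough illustration of the idea behind that page (not make's actual implementation), the jobserver is essentially a pipe preloaded with tokens: a client reads one byte before starting a job and writes it back when the job finishes. A minimal sketch in shell follows. The FIFO path, the slot count, and the run_job helper are all hypothetical, opening a FIFO read-write is assumed to behave as it does on Linux, and the sketch ignores the implicit slot that real make gives every process.

```shell
# Token-pool sketch: at most $slots jobs run at once, however many are started.
slots=2                                  # hypothetical concurrency limit
workdir=$(mktemp -d)
mkfifo "$workdir/tokens"
exec 9<>"$workdir/tokens"                # read-write open, so it does not block (Linux behaviour)

# Preload one token (a single byte) per slot.
i=0
while [ "$i" -lt "$slots" ]; do printf '+' >&9; i=$((i + 1)); done

run_job() {
    # Acquire: block until one byte can be read from the pool.
    token=$(dd bs=1 count=1 <&9 2>/dev/null)
    echo start >>"$workdir/log"
    sleep 1                              # stand-in for real work, e.g. one compile step
    echo end >>"$workdir/log"
    # Release: hand the token back so another job can start.
    printf '%s' "$token" >&9
}

# Launch more jobs than there are slots; the pool throttles them.
for n in 1 2 3 4; do run_job & done
wait

# Replay the log to find the highest number of jobs that overlapped.
peak=$(awk '$1 == "start" { n++; if (n > p) p = n } $1 == "end" { n-- } END { print p + 0 }' "$workdir/log")
finished=$(grep -c '^end$' "$workdir/log")
echo "peak concurrency: $peak of $slots slots, $finished jobs finished"
exec 9>&-
rm -rf "$workdir"
```

The point of the design is that the limit lives in the pipe rather than in any one process, so unrelated processes that share the pipe also share the limit.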

mattgodbolt avatar Oct 18 '19 20:10 mattgodbolt

Thank you for the feedback, I think it is worth doing. The other problem is how to balance the number of Bazel's jobs with make's jobs (also, there may be several parallel rules_foreign_cc targets being processed.)

irengrig avatar Dec 03 '19 12:12 irengrig

I’d argue that this is vital to make this project usable. For example, running the example on my machine and building OpenBLAS takes 6 minutes, but when I manually build it with make it takes 13 seconds!

HackAttack avatar Feb 02 '20 17:02 HackAttack

I will second this. We have a cmake project dependency that is significantly larger than any other component.

lawsonAGMT avatar Apr 05 '20 15:04 lawsonAGMT

I'm finding myself with a cmake project whose build time goes up by almost 4x when compared with how it would go with a manual build. As a stop gap, what would be needed to simply add a fixed parameter in the target that lets you specify how many processes you want to create for the job server?

rdelfin avatar Sep 25 '20 11:09 rdelfin

Related: bazelbuild/bazel#6477 bazelbuild/bazel#10443

HackAttack avatar Sep 26 '20 21:09 HackAttack

Not a solution yet: But if I could find a way to let cmake_external inherit file descriptors, then a wrapper Makefile would hypothetically get me to a usable workaround, where CPU over-commit would have a 1x upper bound:

build: ; +bazel build @all//... --action_env="MAKEFLAGS=${MAKEFLAGS}" (where likely MAKEFLAGS=" -j --jobserver-fds=3,4" and so fds 3 and 4 need to be inherited by cmake_external)

Better yet, as mentioned above, would be to have Bazel itself implement the Make jobserver. Maybe this could be inspiration: https://github.com/olsner/jobclient "GNU make jobserver and client for e.g. shell scripts"
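
To make the fd-inheritance requirement concrete: a process started under make can only join the jobserver if it finds the pipe's descriptors in MAKEFLAGS and those descriptors are actually open in it. A small sketch of the lookup half, where jobserver_fds is a hypothetical helper. It also accepts the --jobserver-auth spelling that, to my knowledge, newer GNU make releases use, and it deliberately skips the named-pipe (fifo:) form.

```shell
# Print "R W" (read fd, write fd) if a MAKEFLAGS-style string advertises a
# pipe-based jobserver; return non-zero otherwise.
jobserver_fds() {
    for word in $1; do
        case "$word" in
            --jobserver-fds=*|--jobserver-auth=*)
                value=${word#*=}
                case "$value" in
                    fifo:*) return 1 ;;          # named-pipe form, not handled in this sketch
                    *,*) echo "${value%%,*} ${value##*,}"; return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

fds=$(jobserver_fds ' -j --jobserver-fds=3,4') && echo "jobserver pipe on fds: $fds"
jobserver_fds '-j8' || echo "no jobserver in MAKEFLAGS"
```

Finding the numbers is the easy part. The hard part, as noted above, is that Bazel would have to leave those descriptors open in the action's process.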

dhbaird avatar Oct 07 '20 01:10 dhbaird

Following up on my previous comment: I came up with this really sketchy workaround. Can't say I'm proud of it, but it "works" and puts a reasonable bound on CPU over-subscription without blocking parallelism.

  1. Create a home for jobserver fifos to live:
mkdir $HOME/.jobserver

# Create jmake (to use later instead of make):
cat > ~/.jobserver/jmake << EOF
#!/bin/bash
# Note: must expand $HOME here because Bazel removes $HOME from environment:
RD=$HOME/.jobserver/rd
WR=$HOME/.jobserver/wr 
exec 3<\$RD
exec 4>\$WR
MAKEFLAGS=" -j --jobserver-fds=3,4" make "\$@"
EOF
chmod a+rx ~/.jobserver/jmake
  2. Update make_commands in your BUILD file:
cmake_external(
    ...
    make_commands = ["/path/to/home/.jobserver/jmake", "/path/to/home/.jobserver/jmake install"]
)
  3. Start a jobserver and run Bazel (adjust NPROCS to whatever bound you want):
( NPROCS=32 ;
  RD=$HOME/.jobserver/rd ;
  WR=$HOME/.jobserver/wr ;
  rm -f $RD $WR ;
  mkfifo $RD $WR ;
  echo "jobserver: ; +( cat >$RD <&3 & ( while true; do cat <$WR; sleep 0.1; done ) >&4 )" | \
      make -f - -j ${NPROCS} ) &

bazel build @all//... --sandbox_writable_path=$HOME/.jobserver

dhbaird avatar Oct 07 '20 17:10 dhbaird

You could skip step 2 by creating a custom make toolchain that uses your make script.

jsharpe avatar Oct 07 '20 18:10 jsharpe

Implemented @jsharpe's feedback (thanks!) and cleaned things up a bit, and then I turned it into a whole project that I swear I wasn't planning to do just one day ago. There is a questionable dependency on the "master" (or "main") branch of rules_foreign_cc that I'm not sure how to reconcile. Nevertheless here it is,

https://github.com/dhbaird/rules_foreign_cc_jobserver

dhbaird avatar Oct 08 '20 02:10 dhbaird

Calling make with the --load-average option can be a good workaround, e.g.:

NUMBER_OF_CPUS="$(grep -c ^processor /proc/cpuinfo)"
make -j "$NUMBER_OF_CPUS" -l "$NUMBER_OF_CPUS"

slsyy avatar Dec 09 '20 11:12 slsyy

That sounds like a good, concise solution. How should I integrate it with the rules_foreign_cc build script? Could you please give an example? And what about macOS?

fuhailin avatar Mar 04 '21 13:03 fuhailin

@fuhailin I don't know; /proc/cpuinfo was just a quick thought. You can specify flags on the bazel command line to limit CPU usage, like --local_cpu_resources, or --jobs to reduce the number of concurrent tasks. I guess it would be best to extract this information from Bazel itself.

In my current build I just use hardcoded values like:

make -j 12 -l 12
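
On the macOS question, a hedged sketch: instead of reading /proc/cpuinfo, a wrapper could probe a few common tools in turn. The cpu_count helper is hypothetical. nproc comes from GNU coreutils, getconf _NPROCESSORS_ONLN is widely available on Linux and macOS, and sysctl -n hw.ncpu is the macOS/BSD fallback. All of these report the machine's CPUs, not whatever limit was passed to Bazel.

```shell
# Print the number of online CPUs using whichever tool is available.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    elif sysctl -n hw.ncpu >/dev/null 2>&1; then
        sysctl -n hw.ncpu
    else
        echo 1                            # conservative fallback: build serially
    fi
}

ncpu=$(cpu_count)
make_cmd="make -j $ncpu -l $ncpu"         # same -j/-l pairing as suggested above
echo "$make_cmd"
```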

slsyy avatar Mar 04 '21 15:03 slsyy

You could also use something like

make(
    name = "make_lib",
    env = {
        "CLANG_WRAPPER": "$(execpath //make_simple/code:clang_wrapper.sh)",
    },
    lib_source = "//make_simple/code:srcs",
    make_commands = [
        "make -j `nproc`",
        "make install",
    ],
    static_libraries = ["liba.a"],
    tools_deps = ["//make_simple/code:clang_wrapper.sh"],
)

Which is a slight alteration to: https://github.com/bazelbuild/rules_foreign_cc/blob/175b29c6f78cf3c78516836587c268f3d0690526/examples/make_simple/BUILD.bazel#L4-L12

UebelAndre avatar Mar 04 '21 15:03 UebelAndre

Overriding make_commands for the "make" rule with a jobs argument works fine. How should I do that for the "cmake" rule? Could I pass the jobs argument through something like the "build_args" or "generate_args" attributes?

fuhailin avatar Mar 18 '21 07:03 fuhailin

I've added https://github.com/bazelbuild/rules_foreign_cc/blob/main/examples/cmake_with_target/BUILD.bazel to demonstrate how the build_args attribute might be used. Though if you're looking for stronger support for parallelization, you should check out CMAKE_BUILD_PARALLEL_LEVEL as an env argument.

UebelAndre avatar Mar 18 '21 16:03 UebelAndre

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_foreign_cc!

github-actions[bot] avatar Sep 14 '21 22:09 github-actions[bot]

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

github-actions[bot] avatar Oct 14 '21 22:10 github-actions[bot]

I think this is a relatively important feature for building CMake dependencies of any notable size. Can we keep the issue open as a feature request, please?

jwnimmer-tri avatar Oct 14 '21 23:10 jwnimmer-tri

I've added #848 as an example of how parallel make could be encapsulated in a toolchain. However, this can't currently be merged due to #433: without a fix for that, the runfiles for the shell wrapper script aren't present. I've added #849 as an outline of the fix that is needed, but that PR is not ready to be merged. Any help in fixing that issue would mean that we could provide a hermetic make toolchain wrapper that adds -j and -l automatically.

jsharpe avatar Jan 01 '22 23:01 jsharpe

Regarding @UebelAndre's make(...) example above, which sets make_commands = ["make -j `nproc`", "make install"]: the "make_commands" option used there was removed in this version. Alternatively, the following two options can be combined: args = ["-j `nproc`"], targets = ["debug"].

docs: https://bazelbuild.github.io/rules_foreign_cc/main/make.html#make-out_static_libs

piratf avatar Apr 13 '22 05:04 piratf

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_foreign_cc!

github-actions[bot] avatar Oct 10 '22 22:10 github-actions[bot]

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

github-actions[bot] avatar Nov 10 '22 22:11 github-actions[bot]

I think this is a relatively important feature for building CMake dependencies of any notable size. Can we keep the issue open as a feature request, please?

jwnimmer-tri avatar Nov 10 '22 22:11 jwnimmer-tri

The env "CMAKE_BUILD_PARALLEL_LEVEL" for cmake can solve this problem.

cmake(
    name = "xgboost",
    cache_entries = {
        "BUILD_STATIC_LIB": "ON",
    },
    lib_source = "@xgboost//:all_srcs",
    env = {
        "CMAKE_BUILD_PARALLEL_LEVEL": "32",
    },
    out_lib_dir = "lib64",
    out_static_libs = ["libxgboost.a", "libdmlc.a"],
)

zhmc avatar Dec 12 '23 10:12 zhmc

The issue with this approach is that the CPU count the target was built with becomes part of the cache key for the artifacts. Using make -j `nproc` or Ninja is far better, as it doesn't affect the cache key.

This also doesn't address the overall parallelism of the build: it only sets parallelism per target. If you need to build multiple such targets, Bazel can schedule them at the same time and over-provision the worker. It isn't RBE-friendly either: if your RBE worker is in a limited cgroup, this could actually end up slower, or OOM, due to the added resource usage.
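
A quick back-of-the-envelope check of that over-provisioning concern, using assumed numbers: 8 concurrent Bazel actions on an 8-core worker are hypothetical, and 32 is the CMAKE_BUILD_PARALLEL_LEVEL from the cmake() example above. It shows the worst case, where every running action happens to be such a target.

```shell
bazel_jobs=8          # assumed: actions Bazel may run at once
per_target=32         # CMAKE_BUILD_PARALLEL_LEVEL set on each cmake() target
cores=8               # assumed: cores available on the worker

# Each action fans out independently, so the limits multiply.
worst_case=$((bazel_jobs * per_target))
oversub=$((worst_case / cores))
echo "worst case: $worst_case compile processes on $cores cores (${oversub}x oversubscribed)"
```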

jsharpe avatar Dec 13 '23 15:12 jsharpe

Another approach could be to generate Bazel BUILD files in a repository rule and perform the build with Bazel directly. This would require writing a Bazel generator into CMake (alongside Make/Ninja). I outlined this approach in #1178.

mering avatar Mar 02 '24 08:03 mering