
Mixed gcc9 + gcc7 support

Open mihaigalos opened this issue 5 years ago • 14 comments

Seems that BuildFarm requires gcc7 to be installed on the client side:

➜  test_buildfarm cat WORKSPACE
➜  test_buildfarm cat BUILD
cc_binary(
    name = "main",
    srcs = ["main.cpp"],
)
➜  test_buildfarm cat main.cpp
#include <iostream>

int
main( int argc, char *argv[] )
{
  std::cout << "Hello, World!" << std::endl;
}
➜  test_buildfarm bazel clean && bazel build //...
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:main (15 packages loaded, 48 targets configured).
INFO: Found 1 target...
Target //:main up-to-date:
  bazel-bin/main
INFO: Elapsed time: 0.714s, Critical Path: 0.36s
INFO: 6 processes: 4 internal, 2 linux-sandbox.
INFO: Build completed successfully, 6 total actions

➜  test_buildfarm bazel clean && bazel run --remote_executor=grpc://localhost:8980 //...
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Invocation ID: 7c4d4ba7-17a2-4bd3-85f1-ad9ad4535a3d
INFO: Analyzed target //:main (15 packages loaded, 48 targets configured).
INFO: Found 1 target...
ERROR: /home/mihai/git/test_buildfarm/BUILD:1:10: undeclared inclusion(s) in rule '//:main':
this rule is missing dependency declarations for the following files included by 'main.cpp':
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stdarg.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stdint.h'
Target //:main failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.360s, Critical Path: 0.03s
INFO: 5 processes: 5 internal.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

However, I'm on gcc9:

g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Any way to work around this error?

mihaigalos avatar Oct 12 '20 06:10 mihaigalos

This is the result of your bazel client and worker systems disagreeing about what constitutes a system-provided dependency, while bazel expects to be able to validate the conditions of the exec platform against its host environment. I recall that there was a non-strict dependency check option, but that may not improve this situation, since bazel is flummoxed by a system include path, emitted via -MD -MF, that it doesn't have.

Your options are:

  • find an option which prevents this validation (worst)
  • find a way to declare to bazel something about the exec environment that satisfies this validation
  • bring your environments into alignment
  • hermeticize your toolchain into the action graph (best)
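One way to see whether the client and the worker disagree is to compare the system include directories that each compiler searches. This is a minimal sketch, not a buildfarm feature: it assumes gcc is installed on both sides, and the `include_dirs` helper and the `buildfarm-worker` container name are hypothetical.

```shell
# Print the system include directories a compiler searches, one per line.
# Reads the verbose preprocessor output (gcc -E -v) on stdin, so the same
# helper works for a local run and for a run inside a worker container.
include_dirs() {
  sed -n '/^#include <...> search starts here:/,/^End of search list\./p' \
    | sed '1d;$d' \
    | sed 's/^[[:space:]]*//'
}

# Usage (gcc writes the search list to stderr, hence 2>&1):
#   gcc -E -x c++ - -v </dev/null 2>&1 | include_dirs > client.txt
#   docker exec buildfarm-worker sh -c 'gcc -E -x c++ - -v </dev/null 2>&1' \
#     | include_dirs > worker.txt   # container name is an assumption
#   diff client.txt worker.txt
```

An empty diff suggests the two environments agree on system headers; any difference points at the kind of mismatch described above.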

werkt avatar Oct 12 '20 13:10 werkt

@werkt : but I am running the redis, server and worker on the same machine. How could there be different configs then?

mihaigalos avatar Oct 14 '20 13:10 mihaigalos

Are you running the worker on the bare localhost machine or a container?

Assuming it is bare: judging by its content, the version printout you posted is from g++. Use --subcommands to verify which program is actually invoked for your compiles, and that it is consistent with the gcc9 toolchain that bazel detects locally.
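As a rough sketch of that check, the helper below pulls the invoked program out of saved `--subcommands` output. It assumes each command line starts with an indented absolute path, which is how Bazel printed it in my experience but is not guaranteed; the `subcommand_compilers` name and the `build.log` file are hypothetical.

```shell
# List the distinct executables invoked in `bazel build --subcommands` output.
# Reads the log on stdin. Only indented lines that begin with an absolute path
# are treated as command lines, so env assignments such as PATH=... are skipped.
subcommand_compilers() {
  grep -oE '^[[:space:]]+/[^[:space:]]+' \
    | sed 's/^[[:space:]]*//' \
    | sort -u
}

# Usage (bazel writes subcommands to stderr):
#   bazel build --subcommands //... 2> build.log
#   subcommand_compilers < build.log
```

Each path it prints can then be run with `--version` to see which compiler release sits behind it.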

werkt avatar Oct 14 '20 13:10 werkt

I am running all 3 services in docker from bazelbuild/* latest.

I tried building abseil; here's what I get:

~/git/abseil-cpp(master) » bazel clean && bazel build //...
..
INFO: Elapsed time: 79.432s, Critical Path: 56.61s
INFO: 1504 processes: 806 internal, 698 linux-sandbox.
INFO: Build completed successfully, 1504 total actions

~/git/abseil-cpp(master) » bazel clean && bazel build --subcommands --remote_executor=grpc://localhost:8980 //...

SUBCOMMAND: # //absl/time/internal/cctz:time_zone [action 'Compiling absl/time/internal/cctz/src/time_zone_fixed.cc', configuration: 5b01d3155b8e7f474a6a1ff4b95e715fe156fa9a3b7b173994605e24e50c18bb, execution platform: @local_config_platform//:host]
(cd /home/mihai/.cache/bazel/_bazel_mihai/7ee2b12edaa6ca807e11da6dd4d1017f/execroot/com_google_absl && \
  exec env - \
    LD_LIBRARY_PATH=/opt/ros/noetic/lib \
    PATH=/opt/ros/noetic/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/bin:/usr/bin:/usr/bin:/usr/bin:/usr/bin:/usr/bin \
    PWD=/proc/self/cwd \
  /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF bazel-out/k8-fastbuild/bin/absl/time/internal/cctz/_objs/time_zone/time_zone_fixed.pic.d '-frandom-seed=bazel-out/k8-fastbuild/bin/absl/time/internal/cctz/_objs/time_zone/time_zone_fixed.pic.o' -fPIC -iquote . -iquote bazel-out/k8-fastbuild/bin -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c absl/time/internal/cctz/src/time_zone_fixed.cc -o bazel-out/k8-fastbuild/bin/absl/time/internal/cctz/_objs/time_zone/time_zone_fixed.pic.o)
ERROR: /home/mihai/.cache/bazel/_bazel_mihai/7ee2b12edaa6ca807e11da6dd4d1017f/external/com_google_googletest/BUILD.bazel:57:11: undeclared inclusion(s) in rule '@com_google_googletest//:gtest':
this rule is missing dependency declarations for the following files included by 'external/com_google_googletest/googlemock/src/gmock-internal-utils.cc':
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stdarg.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/stdint.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include/float.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include-fixed/limits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/7/include-fixed/syslimits.h'
INFO: Elapsed time: 0.652s, Critical Path: 0.12s
INFO: 28 processes: 28 internal.
FAILED: Build did NOT complete successfully

I'm using gcc9:

~/git/abseil-cpp(master) » l $(which g++)
lrwxrwxrwx 5 root 20 Mar 14:52 /usr/bin/g++ -> g++-9

mihaigalos avatar Oct 14 '20 14:10 mihaigalos

Note the /usr/bin/gcc in that compilation. Run it with -v to confirm it is 7 and, as I'm sure it is, switch it (e.g. with update-alternatives) to 9 to correct the inconsistency.
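A quick way to make that comparison, sketched under the assumption that the include paths in the error follow the /usr/lib/gcc/<triplet>/<version>/ layout seen in the logs. All three helper names are hypothetical.

```shell
# Extract the GCC major version from an include path reported by Bazel,
# e.g. /usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h -> 7
gcc_major_from_path() {
  printf '%s\n' "$1" \
    | sed -n 's|.*/usr/lib/gcc/[^/]*/\([0-9][0-9]*\)[./].*|\1|p'
}

# Major version of a compiler on this machine (defaults to gcc on PATH).
local_gcc_major() {
  "${1:-gcc}" -dumpversion | cut -d. -f1
}

# Compare the version seen in the error with the local one.
compare_gcc_majors() {
  if [ "$1" = "$2" ]; then
    echo "match: gcc $1"
  else
    echo "mismatch: remote gcc $1, local gcc $2"
  fi
}

# Usage:
#   compare_gcc_majors \
#     "$(gcc_major_from_path '/usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h')" \
#     "$(local_gcc_major /usr/bin/gcc)"
```

A mismatch means the headers named in the error come from a different compiler installation than the one checked locally.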

werkt avatar Oct 14 '20 14:10 werkt

~ » /usr/bin/gcc --version
gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0

mihaigalos avatar Oct 14 '20 14:10 mihaigalos

Ah, so you are running a worker container; I missed that. To support gcc9, you'll need to build a new worker container on top of it. I'll prepare a wiki page on how to do this, as it requires specifying a new docker container base for a target image of the bazel-buildfarm source.

The number of distro+compiler permutations makes it difficult to prepare a suite of container images for this. You could also simply run the worker outside of a container, on the bare metal machine, to avoid all of this complication.

werkt avatar Oct 14 '20 15:10 werkt

Oh, glad we are making progress. :sweat_smile:

I'm very much looking forward to your wiki. Let me know if I can help in any way! Unfortunately I can't run on bare metal, since I deploy this on a cluster that has, e.g., no Go env.

mihaigalos avatar Oct 14 '20 15:10 mihaigalos

.. can you please reopen this issue until the wiki is created? That way it doesn't get lost.

mihaigalos avatar Oct 14 '20 15:10 mihaigalos

Hi @werkt, did you get the chance to write the wiki for gcc9 support?

mihaigalos avatar Nov 03 '20 14:11 mihaigalos

https://github.com/bazelbuild/bazel-buildfarm/wiki/Worker-Execution-Environment is preliminary and not necessarily complete, but it might get you going.

werkt avatar Nov 03 '20 14:11 werkt

Thanks for the update - seems there is still a config failure in the wiki. I've pulled, tagged and pushed the ubuntu20-java14 to the local registry, created the BUILD and WORKSPACE as instructed, changed the sha256 to the one output from the last docker command. No luck. I'm pretty sure this is related to rules_docker.

~/git/buildfarm-focal/bazel » ./bazelisk run :buildfarm-shard-worker-ubuntu20-java14
2020/11/03 21:19:37 Downloading https://releases.bazel.build/3.3.1/release/bazel-3.3.1-linux-x86_64...
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
WARNING: ignoring LD_PRELOAD in environment.
ERROR: /home/mihai/git/buildfarm-focal/bazel/BUILD:3:1: name 'java_image' is not defined (did you mean 'java_import'?)
ERROR: Skipping ':buildfarm-shard-worker-ubuntu20-java14': no such target '//:buildfarm-shard-worker-ubuntu20-java14': target 'buildfarm-shard-worker-ubuntu20-java14' not declared in package '' defined by /home/mihai/git/buildfarm-focal/bazel/BUILD
...

mihaigalos avatar Nov 03 '20 20:11 mihaigalos

Got a bit further with the following diffs:

BUILD

$ diff BUILD_ORIG BUILD
1c1
< load("@io_bazel_rules_docker//container:container.bzl", "container_image")
---
> load("@io_bazel_rules_docker//java:image.bzl", "java_image")

WORKSPACE:

$ diff WORKSPACE_ORIG WORKSPACE
29c29
<     digest = "sha256:<sha256sum>",
---
>     digest = "sha256:11c52c57f092ef22de4042b49ed5e796520807e88298e973df8f871b05b619d3",
32a33,47
> 
> load(
>     "@io_bazel_rules_docker//repositories:repositories.bzl",
>     container_repositories = "repositories",
> )
> 
> container_repositories()
> 
> load(
>     "@io_bazel_rules_docker//java:image.bzl",
>     _java_image_repos = "repositories",
> )
> 
> 
> _java_image_repos()

Now I get:

$ ./bazelisk run //:buildfarm-shard-worker-ubuntu20-java14
...
ERROR: /home/mihai/git/buildfarm-focal/bazel/BUILD:3:11: //:buildfarm-shard-worker-ubuntu20-java14.binary: no such attribute 'files' in 'java_binary' rule
$ cat .bazelversion
3.3.1

mihaigalos avatar Nov 03 '20 20:11 mihaigalos

Apologies for the lag and the broken docs; I put that together rather quickly, mid-thought, and didn't test it properly.

I've updated the java_image call in the wiki to hopefully describe the container correctly; let me know if that gets you further.

werkt avatar Feb 02 '21 14:02 werkt