
BIGTOP-3756: Add Fedora 36 option

Open yoda-mon opened this issue 2 years ago • 10 comments

Description of PR

This PR adds a Fedora 36 option to Bigtop.

  • Use Java 8 during the build process, because Fedora 36 also installs Java 17 and sets it as the default.
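
A minimal sketch of how a build script could pin Java 8 when a newer JDK is the system default. The helper name `pick_java8_home`, the `/usr/lib/jvm` location, and the `java-1.8.0*` directory naming are assumptions for illustration; the actual change in this PR may select Java 8 differently.

```shell
# Sketch only: pick_java8_home is a hypothetical helper, and the
# /usr/lib/jvm layout with "java-1.8.0*" directories is an assumption.
pick_java8_home() {
  local jvm_dir="${1:-/usr/lib/jvm}"
  local candidate
  for candidate in "$jvm_dir"/java-1.8.0*; do
    # Accept only directories that actually contain a java launcher.
    if [ -d "$candidate" ] && [ -x "$candidate/bin/java" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Export JAVA_HOME only when a Java 8 installation was found.
if java8_home="$(pick_java8_home)"; then
  export JAVA_HOME="$java8_home"
  export PATH="$JAVA_HOME/bin:$PATH"
fi
```

Setting `JAVA_HOME` per build keeps the system default untouched, which seems safer inside a shared build image than switching the default JDK globally.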

How was this patch tested?

I tested on CentOS7/arm and on Ubuntu 22.04/x86.

# Build images
cd docker/bigtop-puppet/
./build.sh trunk-fedora-36
cd ../bigtop-slaves/
./build.sh trunk-fedora-36

# Build a project
cd ../..
./gradlew -POS=fedora-36 -Pdocker-run-option="--privileged" bigtop-utils-pkg-ind bigtop-jsvc-pkg-ind bigtop-groovy-pkg-ind zookeeper-pkg-ind repo-ind 

# Smoke test
cd provisioner/docker/
./docker-hadoop.sh -d  -C config_fedora-36.yaml -F docker-compose-cgroupv2.yml  \
  --create 1  --memory 8g  --enable-local-repo  --repo file:///bigtop-home/output  \
  --disable-gpg-check  --stack zookeeper --smoke-tests zookeeper

For code changes:

  • [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'BIGTOP-3638. Your PR title ...')?
  • [x] Make sure that newly added files do not have any licensing issues. When in doubt refer to https://www.apache.org/licenses/

yoda-mon avatar Jul 22 '22 09:07 yoda-mon

Failed to build bigtop/puppet: fedora-36 docker image:

Errors during downloading metadata for repository 'fedora':
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'fedora': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
Fedora 36 - x86_64                              0.0  B/s |   0  B     00:00
Errors during downloading metadata for repository 'fedora':
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'fedora': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
Fedora 36 - x86_64                              0.0  B/s |   0  B     00:00
Errors during downloading metadata for repository 'fedora':
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'fedora': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-36&arch=x86_64 [getaddrinfo() thread failed to start]
/tmp/puppetize.sh: line 31: puppet: command not found

guyuqi avatar Jul 25 '22 02:07 guyuqi

@guyuqi Thank you for your feedback. I tried building the bigtop/puppet image again on an Ubuntu 22.04/x86 machine, and it succeeded (the log follows this comment).

I looked into the error message and found this issue: Fedora 35 does not work on Docker 20.10.9 (https://medium.com/nttlabs/ubuntu-21-10-and-fedora-35-do-not-work-on-docker-20-10-9-1cd439d9921).

I would like to pin down the cause. Could you tell me the Docker version and any other details of the environment you tested? (In my environment, the Docker engine version is 20.10.17.)

$ ./build.sh trunk-fedora-36

...

Step 1/4 : FROM fedora:36
36: Pulling from library/fedora
e1deda52ffad: Already exists
Digest: sha256:cbf627299e327f564233aac6b97030f9023ca41d3453c497be2f5e8f7762d185
Status: Downloaded newer image for fedora:36
 ---> 98ffdbffd207
Step 2/4 : MAINTAINER [email protected]
 ---> Running in 85baeccc6d36
Removing intermediate container 85baeccc6d36
 ---> 8cf561be9216
Step 3/4 : COPY puppetize.sh /tmp/puppetize.sh
 ---> ef56090d802c
Step 4/4 : RUN bash /tmp/puppetize.sh
 ---> Running in a294a1c1b899
Fedora 36 - x86_64                               32 MB/s |  81 MB     00:02

...

Successfully built 54945775cf3d
Successfully tagged bigtop/puppet:trunk-fedora-36
++ rm -f Dockerfile puppetize.sh

yoda-mon avatar Jul 25 '22 04:07 yoda-mon

> I looked into the error message and found this issue: Fedora 35 does not work on Docker 20.10.9 (https://medium.com/nttlabs/ubuntu-21-10-and-fedora-35-do-not-work-on-docker-20-10-9-1cd439d9921).

Thanks, @yoda-mon. The default Docker version on my local Ubuntu 20.04 host (x86_64) is 20.10.2. The build works after upgrading Docker from 20.10.2 to 20.10.17.

I'd like to merge it after testing on aarch64, thanks.
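
A hedged sketch of a pre-flight check for the Docker version issue discussed above. The helper `version_ge` is hypothetical, and treating 20.10.17 as the minimum is an assumption taken from the version reported as working in this thread, not a documented Docker requirement.

```shell
# version_ge A B: succeed when version A >= version B.
# Relies on `sort -V` (version sort from GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 20.10.17 is simply the version that worked in this thread.
known_good="20.10.17"

# Only query Docker when the CLI is present and the daemon answers.
if command -v docker >/dev/null 2>&1; then
  current="$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)"
  if [ -n "$current" ] && ! version_ge "$current" "$known_good"; then
    echo "Docker $current is older than $known_good; the Fedora 36 image build may fail" >&2
  fi
fi
```

The check only warns rather than aborting, since the exact version in which the problem was fixed is not established here.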

guyuqi avatar Jul 25 '22 05:07 guyuqi

The related Docker images were built successfully.

However, building Hadoop on the Docker image (bigtop/slaves:fedora-36) failed.

Module: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager

x86:

[WARNING] /usr/bin/ld: CMakeFiles/oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIE
[WARNING] /usr/bin/ld: CMakeFiles/oom-listener.dir/main/native/oom-listener/impl/oom_listener_main.c.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIE
[WARNING] collect2: error: ld returned 1 exit status
[WARNING] make[2]: *** [CMakeFiles/oom-listener.dir/build.make:113: target/usr/local/bin/oom-listener] Error 1
[WARNING] make[1]: *** [CMakeFiles/Makefile2:226: CMakeFiles/oom-listener.dir/all] Error 2
[WARNING] make[1]: *** Waiting for unfinished jobs....
....
..
.

Arm64:

[WARNING] /usr/bin/ld: CMakeFiles/oom-listener.dir/main/native/oom-listener/impl/oom_listener_main.c.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `stderr@@GLIBC_2.17' which may bind externally can not be used when making a shared object; recompile with -fPIC
[WARNING] /usr/bin/ld: CMakeFiles/oom-listener.dir/main/native/oom-listener/impl/oom_listener_main.c.o(.text+0x70): unresolvable R_AARCH64_ADR_PREL_PG_HI21 relocation against symbol `stderr@@GLIBC_2.17'
[WARNING] /usr/bin/ld: final link failed: bad value
[WARNING] collect2: error: ld returned 1 exit status

It seems that CFLAGS="-fPIC" should be added when building the shared object.

guyuqi avatar Jul 26 '22 10:07 guyuqi

@guyuqi Thank you, I reproduced this error on both CPU architectures. Building Hadoop on the Fedora 35 Docker image succeeded, so the error is probably caused by the glibc version.

$ docker run --rm -i bigtop/slaves:trunk-fedora-36 ldd --version
ldd (GNU libc) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

$ docker run --rm -i bigtop/slaves:trunk-fedora-35 ldd --version
ldd (GNU libc) 2.34
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

As you mentioned, adding CFLAGS="-fPIC" seems to make the build work, so I will investigate a little more and update this PR later.

yoda-mon avatar Jul 28 '22 06:07 yoda-mon

After further investigation, I found that since Fedora 36, some build flags are set by default when code is built through rpmbuild: https://fedoraproject.org/wiki/Changes/SetBuildFlagsBuildCheck

For now, I added an opt-out option to the SPEC file rather than adding a build flag, and confirmed that Hadoop builds successfully on both Fedora 36 and 35.
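
A small sketch for checking whether a spec file carries such an opt-out. It assumes the opt-out takes the form `%undefine _auto_set_build_flags`, which is my reading of the linked Fedora change page; the thread does not show the exact line added to hadoop.spec, so verify against the actual diff. The helper `spec_opts_out` is hypothetical.

```shell
# spec_opts_out SPEC: succeed when the spec file undefines the macro that
# turns on automatic build flags.
# Assumption: the opt-out is written as "%undefine _auto_set_build_flags".
spec_opts_out() {
  grep -Eq '^[[:space:]]*%undefine[[:space:]]+_auto_set_build_flags' "$1"
}

# Optional usage: pass a spec file path as the first argument.
spec="${1:-}"
if [ -n "$spec" ] && [ -f "$spec" ]; then
  if spec_opts_out "$spec"; then
    echo "$spec opts out of automatic build flags"
  else
    echo "$spec does not opt out of automatic build flags"
  fi
fi
```

Opting out in the spec file leaves the upstream Hadoop build untouched, whereas adding `-fPIC` would change how the native code is compiled on every distribution.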

yoda-mon avatar Aug 02 '22 09:08 yoda-mon

@guyuqi Sorry, I forgot to mention you. Gentle ping.

yoda-mon avatar Aug 10 '22 06:08 yoda-mon

Thanks for the kind reminder, @yoda-mon.

Since hadoop.spec was modified, let me build Hadoop not only on Fedora-36 but also on Rockylinux-8 and Centos-7 (Arm64/x86).

guyuqi avatar Aug 10 '22 07:08 guyuqi

It still failed to build Hadoop on Fedora-36:

[INFO] --- exec-maven-plugin:1.3.1:exec (shelldocs) @ hadoop-common ---
ERROR: yetus-dl: gpg unable to import /home/workspace/bigtop-fedora36/build/hadoop/rpm/BUILD/hadoop-3.3.4-src/patchprocess/KEYS_YETU
....
...
[INFO] Apache Hadoop Common ............................... FAILURE [  4.001 s]
...
...
.
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.1:exec (shelldocs) on project hadoop-common: Command execution failed.: Process exited with an error: 1 

guyuqi avatar Aug 12 '22 10:08 guyuqi

Sorry for the late reply, I was on vacation. I'm checking on it.

yoda-mon avatar Aug 23 '22 02:08 yoda-mon

@guyuqi I tried on CentOS7/arm64 and Ubuntu 22.04/x86, but could not reproduce the error. (It might be a network connectivity issue?)

While checking, I found that the smoke test for Hadoop also failed. The reason is that the SysV init scripts were removed from the Fedora 36 Docker image. I added the installation of the init scripts, and the smoke test passed. I will add the test command to this PR's comment.
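
A rough sketch of how one could detect the missing SysV support inside an image before running the smoke tests. The helper `has_sysv_functions` is hypothetical, and the two file paths are assumptions based on the conventional Red Hat layout; the thread does not state which files or package the PR actually adds.

```shell
# has_sysv_functions [ROOT]: succeed when the SysV init function library is
# present under ROOT (defaults to the running system).
# Assumption: the library lives at one of these conventional paths.
has_sysv_functions() {
  local root="${1:-}"
  [ -f "$root/etc/rc.d/init.d/functions" ] || [ -f "$root/etc/init.d/functions" ]
}

if ! has_sysv_functions; then
  echo "SysV init functions not found; init-script based services may not start" >&2
fi
```

On a Fedora system, `dnf provides /etc/rc.d/init.d/functions` should show which package ships the file, which avoids guessing the package name.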

yoda-mon avatar Aug 30 '22 13:08 yoda-mon

+1, tested with an ARM64 machine on Azure, as follows:

$ facter architecture
aarch64
$ facter os
{"name"=>"CentOS", "family"=>"RedHat", "release"=>{"major"=>"7", "minor"=>"9", "full"=>"7.9.2009"}, "lsb"=>{"distcodename"=>"AltArch", "distid"=>"CentOS", "distdescription"=>"CentOS Linux release 7.9.2009 (AltArch)", "release"=>":core-4.1-aarch64:core-4.1-noarch", "distrelease"=>"7.9.2009", "majdistrelease"=>"7", "minordistrelease"=>"9"}}
$ curl -sL https://github.com/apache/bigtop/pull/961.patch | git apply
$ ./gradlew bigtop-puppet -POS=fedora-36

...

Successfully built d6fe8e41da0e
Successfully tagged bigtop/puppet:trunk-fedora-36-aarch64
+ rm -f Dockerfile puppetize.sh

BUILD SUCCESSFUL in 1m 56s
1 actionable task: 1 executed
$ ./gradlew bigtop-slaves -POS=fedora-36

...

Successfully built 28481fe37bd8
Successfully tagged bigtop/slaves:trunk-fedora-36-aarch64
+ rm -f Dockerfile

BUILD SUCCESSFUL in 30m 17s
1 actionable task: 1 executed
$ ./gradlew allclean hadoop-pkg-ind -POS=fedora-36 -Pdocker-run-option="--privileged" -Pmvn-cache-volume=true

...

> Task :hadoop-pkg
Caching disabled for task ':hadoop-pkg' because:
  Build cache is disabled
Task ':hadoop-pkg' is not up-to-date because:
  Task has not declared any outputs despite executing actions.
:hadoop-pkg (Thread[Execution worker for ':',5,main]) completed. Took 0.0 secs.

BUILD SUCCESSFUL in 37m 6s
6 actionable tasks: 6 executed
+ RESULT=0
+ mkdir -p output
+ docker cp da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169:/bigtop/build .
+ docker cp da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169:/bigtop/output .
+ docker rm -f da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169
da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169
+ '[' 0 -ne 0 ']'
+ docker rm -f da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169
Error: No such container: da4972c76fa4b2b04cb7a7e84d27efbee88920a7004dc8e392dbbbb3ea234169

BUILD SUCCESSFUL in 37m 34s
3 actionable tasks: 3 executed

sekikn avatar Sep 09 '22 14:09 sekikn

Merged into master. Thank you for the contribution @yoda-mon!

sekikn avatar Sep 09 '22 14:09 sekikn