GH-40735: [Packaging][CentOS] Drop support for CentOS 7

Open raulcd opened this issue 1 year ago • 12 comments

Rationale for this change

CentOS 7 will reach EOL on 2024-06-30, according to https://www.centos.org/download/:

End-of-life 2024-06-30 (end of RHEL 7 Maintenance Support 2 Phase)

We can drop support for CentOS 7 after we release 16.0.0 because 17.0.0 will be released after 2024-06-30.

What changes are included in this PR?

Removing CentOS 7

Are these changes tested?

Yes on CI

Are there any user-facing changes?

Yes. CentOS 7 will no longer be supported, but there are no breaking code changes.

  • GitHub Issue: #40735

raulcd avatar Apr 26 '24 15:04 raulcd

:warning: GitHub issue #40735 has been automatically assigned in GitHub to PR creator.

github-actions[bot] avatar Apr 26 '24 15:04 github-actions[bot]

@assignUser there's a bunch of R jobs that seem to require CentOS 7, is this something we can drop? It also seems we are not using either ubuntu-cpp-static or centos-cpp-static on any job. @kou shouldn't we be exercising those? I am happy to migrate the centos-cpp-static from CentOS 7 to a newer CentOS on a different PR but they seem to be unused at the moment.

raulcd avatar Apr 26 '24 15:04 raulcd

@github-actions crossbow submit -g linux

raulcd avatar Apr 26 '24 15:04 raulcd

Revision: 1d750dc31198140055ad2221a8a4710010e530d1

Submitted crossbow builds: ursacomputing/crossbow @ actions-3cd5e06635

Task Status
almalinux-8-amd64 GitHub Actions
almalinux-8-arm64 GitHub Actions
almalinux-9-amd64 GitHub Actions
almalinux-9-arm64 GitHub Actions
amazon-linux-2023-amd64 GitHub Actions
amazon-linux-2023-arm64 GitHub Actions
centos-8-stream-amd64 GitHub Actions
centos-8-stream-arm64 GitHub Actions
centos-9-stream-amd64 GitHub Actions
centos-9-stream-arm64 GitHub Actions
debian-bookworm-amd64 GitHub Actions
debian-bookworm-arm64 GitHub Actions
debian-bullseye-amd64 GitHub Actions
debian-bullseye-arm64 GitHub Actions
debian-trixie-amd64 GitHub Actions
debian-trixie-arm64 GitHub Actions
ubuntu-focal-amd64 GitHub Actions
ubuntu-focal-arm64 GitHub Actions
ubuntu-jammy-amd64 GitHub Actions
ubuntu-jammy-arm64 GitHub Actions
ubuntu-noble-amd64 GitHub Actions
ubuntu-noble-arm64 GitHub Actions

github-actions[bot] avatar Apr 26 '24 16:04 github-actions[bot]

@raulcd I'll have to take a look, I would guess we can drop it but you know how it is with cran etc.

assignUser avatar Apr 26 '24 16:04 assignUser

diff --git a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
index 3ede1814b8..258759a1ec 100644
--- a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
+++ b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
@@ -25,19 +25,8 @@
 %define _rhel %{?rhel:%{rhel}}%{!?rhel:0}
 %define is_rhel (%{_rhel} != 0)
 
-%define is_centos_7 (%{_rhel} == 7 && !%{is_amazon_linux})
-
 %define major_version %(echo @VERSION@ | grep -o '^[0-9]*')
 
-%define boost_version %( \
-  if [ %{_rhel} -eq 7 ]; then \
-    echo 169; \
-  fi)
-%define cmake_version %( \
-  if [ %{_rhel} -eq 7 ]; then \
-    echo 3; \
-  fi)
-
 %define lz4_requirement %( \
   if [ %{_amzn} -eq 0 ]; then \
     echo ">= 1.8.0"; \
@@ -55,31 +44,17 @@
 %define arrow_cmake_install DESTDIR="%{buildroot}" make -C %{arrow_cmake_builddir} install
 %endif
 
-%if %{is_centos_7}
-%define gcc_package devtoolset-11-gcc
-%else
-%define gcc_package gcc
-%endif
-
-%define use_flight (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define use_gandiva (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define use_gcs (%{_rhel} >= 8)
 %define use_gflags (!%{is_amazon_linux})
 ## TODO: Enable this when glog stopped depending on gflags-devel.
 # %%define use_glog (%%{_rhel} <= 8)
 %define use_glog 0
-%define use_mimalloc (%{_rhel} >= 8)
 # TODO: Enable this. This works on local but is fragile on GitHub Actions and
 # Travis CI.
 # %%define use_s3 (%%{_rhel} >= 8)
 %define use_s3 0
-%define use_vala (%{_rhel} >= 8 || %{is_amazon_linux})
 
 %define have_grpc (%{_amzn} >= 2023)
-%define have_lz4_libs (%{_rhel} >= 8 || %{_amzn} >= 2023)
 %define have_rapidjson (%{_rhel} != 8)
-%define have_re2 (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define have_thrift (%{_rhel} >= 8)
 %define have_utf8proc (%{_rhel} >= 9 || %{_amzn} >= 2023)
 
 %define enable_glib_doc (%{_rhel} >= 9 || %{is_amazon_linux})
@@ -94,16 +69,14 @@ URL:		https://arrow.apache.org/
 Source0:	https://dist.apache.org/repos/dist/release/@PACKAGE@/@PACKAGE@-%{version}/apache-@PACKAGE@-%{version}.tar.gz
 
 BuildRequires:	bison
-BuildRequires:	boost%{boost_version}-devel
+BuildRequires:	boost-devel
 BuildRequires:	brotli-devel
 BuildRequires:	bzip2-devel
-%if %{use_flight}
 BuildRequires:	c-ares-devel
-%endif
-BuildRequires:	cmake%{cmake_version}
+BuildRequires:	cmake
 BuildRequires:	curl-devel
 BuildRequires:	flex
-BuildRequires:	%{gcc_package}-c++
+BuildRequires:	gcc-c++
 %if %{use_gflags}
 BuildRequires:	gflags-devel
 %endif
@@ -115,38 +88,27 @@ BuildRequires:	glog-devel
 BuildRequires:	grpc-devel
 BuildRequires:	grpc-plugins
 %endif
-%if %{use_gcs}
 BuildRequires:	json-devel
-%endif
 BuildRequires:	libzstd-devel
+BuildRequires:	llvm-devel
 BuildRequires:	lz4-devel %{lz4_requirement}
+BuildRequires:	ncurses-devel
 BuildRequires:	ninja-build
 BuildRequires:	openssl-devel
 BuildRequires:	pkgconfig
 %if %{have_rapidjson}
 BuildRequires:	rapidjson-devel
 %endif
-%if %{have_re2}
 BuildRequires:	re2-devel
-%endif
 BuildRequires:	snappy-devel
-%if %{have_thrift}
 BuildRequires:	thrift-devel
-%endif
 %if %{have_utf8proc}
 BuildRequires:	utf8proc-devel
 %endif
 BuildRequires:	zlib-devel
 
-%if %{use_gandiva}
-BuildRequires:	llvm-devel
-BuildRequires:	ncurses-devel
-%endif
-
 BuildRequires:	gobject-introspection-devel
-%if %{use_vala}
 BuildRequires:	vala
-%endif
 
 %description
 Apache Arrow is a data processing library for analysis.
@@ -161,21 +123,13 @@ cd cpp
   -DARROW_BUILD_UTILITIES=ON \
   -DARROW_CSV=ON \
   -DARROW_DATASET=ON \
-%if %{use_flight}
   -DARROW_FLIGHT=ON \
   -DARROW_FLIGHT_SQL=ON \
-%endif
-%if %{use_gandiva}
   -DARROW_GANDIVA=ON \
-%endif
-%if %{use_gcs}
   -DARROW_GCS=ON \
-%endif
   -DARROW_HDFS=ON \
   -DARROW_JSON=ON \
-%if %{use_mimalloc}
   -DARROW_MIMALLOC=ON \
-%endif
   -DARROW_ORC=ON \
   -DARROW_PACKAGE_KIND=rpm \
   -DARROW_PARQUET=ON \
@@ -200,12 +154,7 @@ cd c_glib
 %if %{_amzn} >= 2023
   # Do nothing
 %else
-  %if %{is_centos_7}
-    # Meson 0.62.0 or later requires Python 3.7 or later.
-    pip3 install 'meson<0.62.0'
-  %else
-    pip3 install meson
-  %endif
+  pip3 install meson
 %endif
 %if %{enable_glib_doc}
   pip3 install gi-docgen
@@ -219,9 +168,7 @@ meson setup build \
 %if %{enable_glib_doc}
   -Ddoc=true \
 %endif
-%if %{use_vala}
   -Dvapi=true
-%endif
 
 LD_LIBRARY_PATH=$PWD/../cpp/%{arrow_cmake_builddir}/$cpp_build_type \
   meson compile -C build %{?_smp_mflags}
@@ -242,11 +189,7 @@ cd -
 %package -n %{name}%{major_version}-libs
 Summary:	Runtime libraries for Apache Arrow C++
 License:	Apache-2.0
-%if %{have_lz4_libs}
 Requires:	lz4-libs %{lz4_requirement}
-%else
-Requires:	lz4 %{lz4_requirement}
-%endif
 
 %description -n %{name}%{major_version}-libs
 This package contains the libraries for Apache Arrow C++.
@@ -278,18 +221,14 @@ Requires:	%{name}%{major_version}-libs = %{version}-%{release}
 Requires:	brotli-devel
 Requires:	bzip2-devel
 Requires:	curl-devel
-%if %{use_gcs}
 Requires:	json-devel
-%endif
 Requires:	libzstd-devel
 Requires:	lz4-devel %{lz4_requirement}
 Requires:	openssl-devel
 %if %{have_rapidjson}
 Requires:	rapidjson-devel
 %endif
-%if %{have_re2}
 Requires:	re2-devel
-%endif
 Requires:	snappy-devel
 %if %{have_utf8proc}
 Requires:	utf8proc-devel
@@ -308,9 +247,7 @@ Libraries and header files for Apache Arrow C++.
 %{_includedir}/arrow/
 %exclude %{_includedir}/arrow/acero/
 %exclude %{_includedir}/arrow/dataset/
-%if %{use_flight}
 %exclude %{_includedir}/arrow/flight/
-%endif
 %{_libdir}/cmake/Arrow/
 %{_libdir}/libarrow.a
 %{_libdir}/libarrow.so
@@ -390,7 +327,6 @@ Libraries and header files for Apache Arrow dataset.
 %{_libdir}/libarrow_dataset.so
 %{_libdir}/pkgconfig/arrow-dataset.pc
 
-%if %{use_flight}
 %package -n %{name}%{major_version}-flight-libs
 Summary:	C++ library for fast data transport.
 License:	Apache-2.0
@@ -462,9 +398,7 @@ Libraries and header files for Apache Arrow Flight SQL.
 %{_libdir}/libarrow_flight_sql.a
 %{_libdir}/libarrow_flight_sql.so
 %{_libdir}/pkgconfig/arrow-flight-sql.pc
-%endif
 
-%if %{use_gandiva}
 %package -n gandiva%{major_version}-libs
 Summary:	C++ library for compiling and evaluating expressions on Apache Arrow data.
 License:	Apache-2.0
@@ -498,7 +432,6 @@ Libraries and header files for Gandiva.
 %{_libdir}/libgandiva.a
 %{_libdir}/libgandiva.so
 %{_libdir}/pkgconfig/gandiva.pc
-%endif
 
 %package -n parquet%{major_version}-libs
 Summary:	Runtime libraries for Apache Parquet C++
@@ -580,9 +513,7 @@ Libraries and header files for Apache Arrow GLib.
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/arrow-glib/example/
 %{_datadir}/gir-1.0/Arrow-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/arrow-glib.*
-%endif
 %{_includedir}/arrow-glib/
 %{_libdir}/libarrow-glib.a
 %{_libdir}/libarrow-glib.so
@@ -637,9 +568,7 @@ Libraries and header files for Apache Arrow Dataset GLib.
 %doc README.md
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/gir-1.0/ArrowDataset-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/arrow-dataset-glib.*
-%endif
 %{_includedir}/arrow-dataset-glib/
 %{_libdir}/libarrow-dataset-glib.a
 %{_libdir}/libarrow-dataset-glib.so
@@ -660,7 +589,6 @@ Documentation for Apache Arrow dataset GLib.
 %{_docdir}/arrow-dataset-glib/
 %endif
 
-%if %{use_flight}
 %package -n %{name}%{major_version}-flight-glib-libs
 Summary:	Runtime libraries for Apache Arrow Flight GLib
 License:	Apache-2.0
@@ -692,9 +620,7 @@ Libraries and header files for Apache Arrow Flight GLib.
 %doc README.md
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/gir-1.0/ArrowFlight-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/arrow-flight-glib.*
-%endif
 %{_includedir}/arrow-flight-glib/
 %{_libdir}/libarrow-flight-glib.a
 %{_libdir}/libarrow-flight-glib.so
@@ -746,9 +672,7 @@ Libraries and header files for Apache Arrow Flight SQL GLib.
 %doc README.md
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/gir-1.0/ArrowFlightSQL-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/arrow-flight-sql-glib.*
-%endif
 %{_includedir}/arrow-flight-sql-glib/
 %{_libdir}/libarrow-flight-sql-glib.a
 %{_libdir}/libarrow-flight-sql-glib.so
@@ -765,12 +689,10 @@ Documentation for Apache Arrow Flight SQL GLib.
 %defattr(-,root,root,-)
 %doc README.md
 %license LICENSE.txt NOTICE.txt
-  %if %{enable_glib_doc}
+%if %{enable_glib_doc}
 %{_docdir}/arrow-flight-sql-glib/
-  %endif
 %endif
 
-%if %{use_gandiva}
 %package -n gandiva%{major_version}-glib-libs
 Summary:	Runtime libraries for Gandiva GLib
 License:	Apache-2.0
@@ -802,9 +724,7 @@ Libraries and header files for Gandiva GLib.
 %doc README.md
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/gir-1.0/Gandiva-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/gandiva-glib.*
-%endif
 %{_includedir}/gandiva-glib/
 %{_libdir}/libgandiva-glib.a
 %{_libdir}/libgandiva-glib.so
@@ -821,9 +741,8 @@ Documentation for Gandiva GLib.
 %defattr(-,root,root,-)
 %doc README.md
 %license LICENSE.txt NOTICE.txt
-  %if %{enable_glib_doc}
+%if %{enable_glib_doc}
 %{_docdir}/gandiva-glib/
-  %endif
 %endif
 
 %package -n parquet%{major_version}-glib-libs
@@ -857,9 +776,7 @@ Libraries and header files for Apache Parquet GLib.
 %doc README.md
 %license LICENSE.txt NOTICE.txt
 %{_datadir}/gir-1.0/Parquet-*.gir
-%if %{use_vala}
 %{_datadir}/vala/vapi/parquet-glib.*
-%endif
 %{_includedir}/parquet-glib/
 %{_libdir}/libparquet-glib.a
 %{_libdir}/libparquet-glib.so

kou avatar Apr 26 '24 20:04 kou

> I am happy to migrate the centos-cpp-static from CentOS 7 to a newer CentOS on a different PR

Let's work on this on a different PR. We may use AlmaLinux 8 for it.

> but seem to be unused at the moment.

They are used here: https://github.com/apache/arrow/blob/15986ae5ffef2f274c04cf0d5eec2155fe6523a6/dev/tasks/r/github.packages.yml#L120-L147

kou avatar Apr 26 '24 21:04 kou

FYI: manylinux uses AlmaLinux 8 for manylinux_2_28 (CentOS 7 for manylinux2014): https://github.com/pypa/manylinux/blob/main/README.rst#docker-images

kou avatar Apr 26 '24 21:04 kou

As mentioned in #41403 I think we can remove the centos based jobs or migrate to something newer for the libarrow binaries (9? or rocky/alma not sure what our stand/choice on that mess is?) But it's probably best to do it in a follow up. IIRC we use centos 7 with dts specifically for those jobs (ossl < 3) for use on older systems but if we drop support for centos 7&8 ... cc @nealrichardson (again it seems that we need a support policy ^^)

assignUser avatar Apr 29 '24 01:04 assignUser

> As mentioned in #41403 I think we can remove the centos based jobs or migrate to something newer for the libarrow binaries (9? or rocky/alma not sure what our stand/choice on that mess is?) But it's probably best to do it in a follow up. IIRC we use centos 7 with dts specifically for those jobs (ossl < 3) for use on older systems but if we drop support for centos 7&8 ... cc @nealrichardson (again it seems that we need a support policy ^^)

We build those binaries on centos 7 for the same reason that manylinux2014 did: to build against as old of a glibc as we could to ensure the broadest compatibility. I don't know the implication of upgrading to almalinux 8, but it looks like we still build manylinux2014 wheels? In which case we should keep building R for the same.

Maybe we should switch the R binary builds to use the manylinux images, like we do when building wheels? Then we can still build on the platform but not have to maintain our own centos 7 tooling.

nealrichardson avatar Apr 29 '24 13:04 nealrichardson

To minimize disruption to some of our customers, Voltron Data is interested in contributing resources to maintain CentOS 7 in the CI and release verification matrix for a few months longer. Does this present any practical problems?

ianmcook avatar May 16 '24 21:05 ianmcook

CentOS 7's Yum repositories (including mirrors) will be removed after CentOS 7 reaches EOL. (I don't know when that will happen.) We need to use https://vault.centos.org/ instead after that. It may be a bit annoying.
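
For illustration only, here is a minimal sketch of what repointing Yum at the vault could look like. It is not taken from this PR or from the Arrow repository. The function name `use_centos_vault`, the sample repo file, and the assumption that the vault mirrors the `centos/$releasever/...` path layout of mirror.centos.org are all hypothetical and should be checked against the real `/etc/yum.repos.d/CentOS-*.repo` files and https://vault.centos.org/ before use.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: rewrite the CentOS repo definitions found in a
# directory so that they use vault.centos.org instead of the mirror network.
use_centos_vault() {
  repo_dir="$1"
  for repo in "$repo_dir"/CentOS-*.repo; do
    # Skip the unexpanded glob when no repo file matches.
    [ -e "$repo" ] || continue
    # Comment out the mirrorlist and enable a baseurl that points at the
    # vault. The vault path layout is an assumption, not a verified fact.
    sed \
      -e 's|^mirrorlist=|#mirrorlist=|' \
      -e 's|^#[[:space:]]*baseurl=http://mirror\.centos\.org|baseurl=https://vault.centos.org|' \
      "$repo" > "$repo.tmp"
    mv "$repo.tmp" "$repo"
  done
}

# Demonstration on a throwaway directory rather than /etc/yum.repos.d.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/CentOS-Base.repo" <<'EOF'
[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
EOF

use_centos_vault "$demo_dir"
cat "$demo_dir/CentOS-Base.repo"
```

In a CentOS 7 build image the same function would presumably be called with `/etc/yum.repos.d` before any `yum install` step.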

kou avatar May 17 '24 00:05 kou

I'm looking into doing the above over in https://github.com/apache/arrow/issues/42128.

amoeba avatar Jun 12 '24 23:06 amoeba

Can we close this PR now that https://github.com/apache/arrow/pull/42129 is merged and the project is maintaining CentOS 7 support?

amoeba avatar Jun 18 '24 20:06 amoeba

Yes. But could you mention this PR in #40735 explicitly? This PR will be helpful to restart this when Voltron Data stops contributing CentOS 7 support.

kou avatar Jun 18 '24 21:06 kou

Great point, I'll do that now @kou. Thanks.

amoeba avatar Jun 18 '24 21:06 amoeba

I'm going to close this PR now that dropping support for CentOS 7 has been put on hold (see https://github.com/apache/arrow/pull/41395#issuecomment-2116223200) and the relevant packaging task has been migrated to use alternative package sources (see https://github.com/apache/arrow/issues/42128). This will probably be a very useful starting point for when work resumes on https://github.com/apache/arrow/issues/40735.

amoeba avatar Jun 18 '24 21:06 amoeba

Thanks @amoeba

raulcd avatar Jun 19 '24 07:06 raulcd