GH-40735: [Packaging][CentOS] Drop support for CentOS 7
Rationale for this change
CentOS 7 will reach EOL on 2024-06-30 (see https://www.centos.org/download/):
End-of-life 2024-06-30 (end of RHEL 7 Maintenance Support 2 Phase)
We can drop support for CentOS 7 after we release 16.0.0 because 17.0.0 will be released after 2024-06-30.
What changes are included in this PR?
Removing CentOS 7
Are these changes tested?
Yes on CI
Are there any user-facing changes?
Yes. CentOS 7 will no longer be supported, but there are no breaking code changes.
- GitHub Issue: #40735
:warning: GitHub issue #40735 has been automatically assigned in GitHub to PR creator.
@assignUser there's a bunch of R jobs that seem to require CentOS 7, is this something we can drop?
It also seems we are using neither ubuntu-cpp-static nor centos-cpp-static on any job. @kou shouldn't we be exercising those? I am happy to migrate centos-cpp-static from CentOS 7 to a newer CentOS in a different PR, but they seem to be unused at the moment.
@github-actions crossbow submit -g linux
Revision: 1d750dc31198140055ad2221a8a4710010e530d1
Submitted crossbow builds: ursacomputing/crossbow @ actions-3cd5e06635
@raulcd I'll have to take a look, I would guess we can drop it but you know how it is with cran etc.
diff --git a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
index 3ede1814b8..258759a1ec 100644
--- a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
+++ b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
@@ -25,19 +25,8 @@
%define _rhel %{?rhel:%{rhel}}%{!?rhel:0}
%define is_rhel (%{_rhel} != 0)
-%define is_centos_7 (%{_rhel} == 7 && !%{is_amazon_linux})
-
%define major_version %(echo @VERSION@ | grep -o '^[0-9]*')
-%define boost_version %( \
- if [ %{_rhel} -eq 7 ]; then \
- echo 169; \
- fi)
-%define cmake_version %( \
- if [ %{_rhel} -eq 7 ]; then \
- echo 3; \
- fi)
-
%define lz4_requirement %( \
if [ %{_amzn} -eq 0 ]; then \
echo ">= 1.8.0"; \
@@ -55,31 +44,17 @@
%define arrow_cmake_install DESTDIR="%{buildroot}" make -C %{arrow_cmake_builddir} install
%endif
-%if %{is_centos_7}
-%define gcc_package devtoolset-11-gcc
-%else
-%define gcc_package gcc
-%endif
-
-%define use_flight (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define use_gandiva (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define use_gcs (%{_rhel} >= 8)
%define use_gflags (!%{is_amazon_linux})
## TODO: Enable this when glog stopped depending on gflags-devel.
# %%define use_glog (%%{_rhel} <= 8)
%define use_glog 0
-%define use_mimalloc (%{_rhel} >= 8)
# TODO: Enable this. This works on local but is fragile on GitHub Actions and
# Travis CI.
# %%define use_s3 (%%{_rhel} >= 8)
%define use_s3 0
-%define use_vala (%{_rhel} >= 8 || %{is_amazon_linux})
%define have_grpc (%{_amzn} >= 2023)
-%define have_lz4_libs (%{_rhel} >= 8 || %{_amzn} >= 2023)
%define have_rapidjson (%{_rhel} != 8)
-%define have_re2 (%{_rhel} >= 8 || %{_amzn} >= 2023)
-%define have_thrift (%{_rhel} >= 8)
%define have_utf8proc (%{_rhel} >= 9 || %{_amzn} >= 2023)
%define enable_glib_doc (%{_rhel} >= 9 || %{is_amazon_linux})
@@ -94,16 +69,14 @@ URL: https://arrow.apache.org/
Source0: https://dist.apache.org/repos/dist/release/@PACKAGE@/@PACKAGE@-%{version}/apache-@PACKAGE@-%{version}.tar.gz
BuildRequires: bison
-BuildRequires: boost%{boost_version}-devel
+BuildRequires: boost-devel
BuildRequires: brotli-devel
BuildRequires: bzip2-devel
-%if %{use_flight}
BuildRequires: c-ares-devel
-%endif
-BuildRequires: cmake%{cmake_version}
+BuildRequires: cmake
BuildRequires: curl-devel
BuildRequires: flex
-BuildRequires: %{gcc_package}-c++
+BuildRequires: gcc-c++
%if %{use_gflags}
BuildRequires: gflags-devel
%endif
@@ -115,38 +88,27 @@ BuildRequires: glog-devel
BuildRequires: grpc-devel
BuildRequires: grpc-plugins
%endif
-%if %{use_gcs}
BuildRequires: json-devel
-%endif
BuildRequires: libzstd-devel
+BuildRequires: llvm-devel
BuildRequires: lz4-devel %{lz4_requirement}
+BuildRequires: ncurses-devel
BuildRequires: ninja-build
BuildRequires: openssl-devel
BuildRequires: pkgconfig
%if %{have_rapidjson}
BuildRequires: rapidjson-devel
%endif
-%if %{have_re2}
BuildRequires: re2-devel
-%endif
BuildRequires: snappy-devel
-%if %{have_thrift}
BuildRequires: thrift-devel
-%endif
%if %{have_utf8proc}
BuildRequires: utf8proc-devel
%endif
BuildRequires: zlib-devel
-%if %{use_gandiva}
-BuildRequires: llvm-devel
-BuildRequires: ncurses-devel
-%endif
-
BuildRequires: gobject-introspection-devel
-%if %{use_vala}
BuildRequires: vala
-%endif
%description
Apache Arrow is a data processing library for analysis.
@@ -161,21 +123,13 @@ cd cpp
-DARROW_BUILD_UTILITIES=ON \
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-%if %{use_flight}
-DARROW_FLIGHT=ON \
-DARROW_FLIGHT_SQL=ON \
-%endif
-%if %{use_gandiva}
-DARROW_GANDIVA=ON \
-%endif
-%if %{use_gcs}
-DARROW_GCS=ON \
-%endif
-DARROW_HDFS=ON \
-DARROW_JSON=ON \
-%if %{use_mimalloc}
-DARROW_MIMALLOC=ON \
-%endif
-DARROW_ORC=ON \
-DARROW_PACKAGE_KIND=rpm \
-DARROW_PARQUET=ON \
@@ -200,12 +154,7 @@ cd c_glib
%if %{_amzn} >= 2023
# Do nothing
%else
- %if %{is_centos_7}
- # Meson 0.62.0 or later requires Python 3.7 or later.
- pip3 install 'meson<0.62.0'
- %else
- pip3 install meson
- %endif
+ pip3 install meson
%endif
%if %{enable_glib_doc}
pip3 install gi-docgen
@@ -219,9 +168,7 @@ meson setup build \
%if %{enable_glib_doc}
-Ddoc=true \
%endif
-%if %{use_vala}
-Dvapi=true
-%endif
LD_LIBRARY_PATH=$PWD/../cpp/%{arrow_cmake_builddir}/$cpp_build_type \
meson compile -C build %{?_smp_mflags}
@@ -242,11 +189,7 @@ cd -
%package -n %{name}%{major_version}-libs
Summary: Runtime libraries for Apache Arrow C++
License: Apache-2.0
-%if %{have_lz4_libs}
Requires: lz4-libs %{lz4_requirement}
-%else
-Requires: lz4 %{lz4_requirement}
-%endif
%description -n %{name}%{major_version}-libs
This package contains the libraries for Apache Arrow C++.
@@ -278,18 +221,14 @@ Requires: %{name}%{major_version}-libs = %{version}-%{release}
Requires: brotli-devel
Requires: bzip2-devel
Requires: curl-devel
-%if %{use_gcs}
Requires: json-devel
-%endif
Requires: libzstd-devel
Requires: lz4-devel %{lz4_requirement}
Requires: openssl-devel
%if %{have_rapidjson}
Requires: rapidjson-devel
%endif
-%if %{have_re2}
Requires: re2-devel
-%endif
Requires: snappy-devel
%if %{have_utf8proc}
Requires: utf8proc-devel
@@ -308,9 +247,7 @@ Libraries and header files for Apache Arrow C++.
%{_includedir}/arrow/
%exclude %{_includedir}/arrow/acero/
%exclude %{_includedir}/arrow/dataset/
-%if %{use_flight}
%exclude %{_includedir}/arrow/flight/
-%endif
%{_libdir}/cmake/Arrow/
%{_libdir}/libarrow.a
%{_libdir}/libarrow.so
@@ -390,7 +327,6 @@ Libraries and header files for Apache Arrow dataset.
%{_libdir}/libarrow_dataset.so
%{_libdir}/pkgconfig/arrow-dataset.pc
-%if %{use_flight}
%package -n %{name}%{major_version}-flight-libs
Summary: C++ library for fast data transport.
License: Apache-2.0
@@ -462,9 +398,7 @@ Libraries and header files for Apache Arrow Flight SQL.
%{_libdir}/libarrow_flight_sql.a
%{_libdir}/libarrow_flight_sql.so
%{_libdir}/pkgconfig/arrow-flight-sql.pc
-%endif
-%if %{use_gandiva}
%package -n gandiva%{major_version}-libs
Summary: C++ library for compiling and evaluating expressions on Apache Arrow data.
License: Apache-2.0
@@ -498,7 +432,6 @@ Libraries and header files for Gandiva.
%{_libdir}/libgandiva.a
%{_libdir}/libgandiva.so
%{_libdir}/pkgconfig/gandiva.pc
-%endif
%package -n parquet%{major_version}-libs
Summary: Runtime libraries for Apache Parquet C++
@@ -580,9 +513,7 @@ Libraries and header files for Apache Arrow GLib.
%license LICENSE.txt NOTICE.txt
%{_datadir}/arrow-glib/example/
%{_datadir}/gir-1.0/Arrow-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/arrow-glib.*
-%endif
%{_includedir}/arrow-glib/
%{_libdir}/libarrow-glib.a
%{_libdir}/libarrow-glib.so
@@ -637,9 +568,7 @@ Libraries and header files for Apache Arrow Dataset GLib.
%doc README.md
%license LICENSE.txt NOTICE.txt
%{_datadir}/gir-1.0/ArrowDataset-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/arrow-dataset-glib.*
-%endif
%{_includedir}/arrow-dataset-glib/
%{_libdir}/libarrow-dataset-glib.a
%{_libdir}/libarrow-dataset-glib.so
@@ -660,7 +589,6 @@ Documentation for Apache Arrow dataset GLib.
%{_docdir}/arrow-dataset-glib/
%endif
-%if %{use_flight}
%package -n %{name}%{major_version}-flight-glib-libs
Summary: Runtime libraries for Apache Arrow Flight GLib
License: Apache-2.0
@@ -692,9 +620,7 @@ Libraries and header files for Apache Arrow Flight GLib.
%doc README.md
%license LICENSE.txt NOTICE.txt
%{_datadir}/gir-1.0/ArrowFlight-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/arrow-flight-glib.*
-%endif
%{_includedir}/arrow-flight-glib/
%{_libdir}/libarrow-flight-glib.a
%{_libdir}/libarrow-flight-glib.so
@@ -746,9 +672,7 @@ Libraries and header files for Apache Arrow Flight SQL GLib.
%doc README.md
%license LICENSE.txt NOTICE.txt
%{_datadir}/gir-1.0/ArrowFlightSQL-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/arrow-flight-sql-glib.*
-%endif
%{_includedir}/arrow-flight-sql-glib/
%{_libdir}/libarrow-flight-sql-glib.a
%{_libdir}/libarrow-flight-sql-glib.so
@@ -765,12 +689,10 @@ Documentation for Apache Arrow Flight SQL GLib.
%defattr(-,root,root,-)
%doc README.md
%license LICENSE.txt NOTICE.txt
- %if %{enable_glib_doc}
+%if %{enable_glib_doc}
%{_docdir}/arrow-flight-sql-glib/
- %endif
%endif
-%if %{use_gandiva}
%package -n gandiva%{major_version}-glib-libs
Summary: Runtime libraries for Gandiva GLib
License: Apache-2.0
@@ -802,9 +724,7 @@ Libraries and header files for Gandiva GLib.
%doc README.md
%license LICENSE.txt NOTICE.txt
%{_datadir}/gir-1.0/Gandiva-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/gandiva-glib.*
-%endif
%{_includedir}/gandiva-glib/
%{_libdir}/libgandiva-glib.a
%{_libdir}/libgandiva-glib.so
@@ -821,9 +741,8 @@ Documentation for Gandiva GLib.
%defattr(-,root,root,-)
%doc README.md
%license LICENSE.txt NOTICE.txt
- %if %{enable_glib_doc}
+%if %{enable_glib_doc}
%{_docdir}/gandiva-glib/
- %endif
%endif
%package -n parquet%{major_version}-glib-libs
@@ -857,9 +776,7 @@ Libraries and header files for Apache Parquet GLib.
%doc README.md
%license LICENSE.txt NOTICE.txt
%{_datadir}/gir-1.0/Parquet-*.gir
-%if %{use_vala}
%{_datadir}/vala/vapi/parquet-glib.*
-%endif
%{_includedir}/parquet-glib/
%{_libdir}/libparquet-glib.a
%{_libdir}/libparquet-glib.so
I am happy to migrate the centos-cpp-static from CentOS 7 to a newer CentOS on a different PR
Let's work on this on a different PR. We may use AlmaLinux 8 for it.
but seem to be unused at the moment.
They are used here: https://github.com/apache/arrow/blob/15986ae5ffef2f274c04cf0d5eec2155fe6523a6/dev/tasks/r/github.packages.yml#L120-L147
FYI: manylinux uses AlmaLinux 8 for manylinux_2_28 (CentOS 7 for manylinux2014): https://github.com/pypa/manylinux/blob/main/README.rst#docker-images
As mentioned in #41403 I think we can remove the centos based jobs or migrate to something newer for the libarrow binaries (9? or rocky/alma not sure what our stand/choice on that mess is?) But it's probably best to do it in a follow up. IIRC we use centos 7 with dts specifically for those jobs (ossl < 3) for use on older systems but if we drop support for centos 7&8 ... cc @nealrichardson (again it seems that we need a support policy ^^)
We build those binaries on centos 7 for the same reason that manylinux2014 did: to build against as old of a glibc as we could to ensure the broadest compatibility. I don't know the implication of upgrading to almalinux 8, but it looks like we still build manylinux2014 wheels? In which case we should keep building R for the same.
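The compatibility argument can be made concrete: a binary linked against glibc X generally runs on systems with glibc X or newer, but not on older ones. Below is a rough sketch, assuming the glibc baselines of the two images mentioned above (2.17 for CentOS 7 / manylinux2014, 2.28 for AlmaLinux 8 / manylinux_2_28). `runs_on` is a hypothetical helper, not part of the Arrow tooling, and it relies on `sort -V` for version ordering.

```shell
# Hypothetical helper: succeed if a binary built against glibc $1
# can be expected to run on a system that provides glibc $2.
# glibc is backward compatible, so the system version must be >= the build version.
runs_on() {
  built_against="$1"
  system_glibc="$2"
  # sort -V orders version strings numerically per component (2.5 < 2.17).
  oldest="$(printf '%s\n' "$built_against" "$system_glibc" | sort -V | head -n 1)"
  [ "$oldest" = "$built_against" ]
}

# Built on CentOS 7 (glibc 2.17): fine on glibc 2.17 and on glibc 2.28.
runs_on 2.17 2.17 && echo "2.17 build on glibc 2.17: ok"
runs_on 2.17 2.28 && echo "2.17 build on glibc 2.28: ok"
# Built on AlmaLinux 8 (glibc 2.28): not expected to run on glibc 2.17.
runs_on 2.28 2.17 || echo "2.28 build on glibc 2.17: not supported"
```

On a glibc-based system, the second argument could be taken from `getconf GNU_LIBC_VERSION`, which prints the installed glibc version.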
Maybe we should switch the R binary builds to use the manylinux images, like we do when building wheels? Then we can still build on the platform but not have to maintain our own centos 7 tooling.
To minimize disruption to some of our customers, Voltron Data is interested in contributing resources to maintain CentOS 7 in the CI and release verification matrix for a few months longer. Does this present any practical problems?
CentOS 7's Yum repositories (including mirrors) will be removed after CentOS 7 reaches EOL. (I don't know when that will happen.) After that, we will need to use https://vault.centos.org/ instead, which may be a bit annoying.
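For reference, switching to the vault is usually a matter of rewriting the `.repo` files. Below is a minimal sketch, assuming the stock CentOS 7 layout (an active `mirrorlist=` line and a commented-out `#baseurl=http://mirror.centos.org/...` line). The function name `use_centos_vault` is hypothetical, and this is not necessarily how the Arrow packaging tasks handle it.

```shell
# Hypothetical helper: point a CentOS 7 .repo file at vault.centos.org.
# Assumes the stock layout: an active "mirrorlist=" line and a
# commented-out "#baseurl=http://mirror.centos.org/..." line.
use_centos_vault() {
  repo_file="$1"
  # -i.bak edits in place and keeps a backup; yum only reads *.repo files,
  # so the .bak copy is ignored.
  sed -i.bak \
    -e 's|^mirrorlist=|#mirrorlist=|' \
    -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|' \
    "$repo_file"
}
```

It might be called as `use_centos_vault /etc/yum.repos.d/CentOS-Base.repo` before the first `yum install` in an image build. Whether `$releasever` also needs to be pinned to a specific point release depends on how the vault lays out its directories, which this sketch does not check.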
I'm looking into doing the above over in https://github.com/apache/arrow/issues/42128.
Can we close this PR now that https://github.com/apache/arrow/pull/42129 is merged and the project is maintaining CentOS 7 support?
Yes. But could you mention this PR in #40735 explicitly? This PR will be helpful to restart this when Voltron Data stops contributing CentOS 7 support.
Great point, I'll do that now @kou. Thanks.
I'm going to close this PR now that dropping support for CentOS 7 has been put on hold (see https://github.com/apache/arrow/pull/41395#issuecomment-2116223200) and the relevant packaging task has been migrated to use alternative package sources (see https://github.com/apache/arrow/issues/42128). This will probably be a very useful starting point for when work resumes on https://github.com/apache/arrow/issues/40735.
Thanks @amoeba