ci: add debian package Apache Cloudberry (incubating) build and test workflow
Introduces a GitHub Actions workflow that builds a Debian package for Apache Cloudberry (incubating):
In https://github.com/apache/cloudberry/pull/1359 I added Docker build and test containers for Ubuntu 22.04. This PR adds a CI workflow that checks whether we can still create and install the Debian package on Ubuntu 22.04.
To do this:
- use the devops package to build the Cloudberry binary inside the Docker build container
- create a Debian package from the built binary
- upload artifacts to github actions
- download artifacts to ubuntu test container
- install downloaded debian packages
- check packages for integrity
@tuhaihe This adds a test that checks whether the Debian package builds successfully. Please review.
Cool, thanks! Will work with @edespino on this.
You can see that the Depends field lists Python 2 dependencies (python-six, python2.7, python2.7-dev):
cbadmin@cdw:~$ dpkg -I apache-cloudberry-db-incubating_99.0.0-1-1.17dfe844_amd64.deb
new Debian package, version 2.0.
size 20350386 bytes: control archive=62494 bytes.
1899 bytes, 32 lines control
249906 bytes, 2629 lines md5sums
143 bytes, 12 lines * postinst #!/bin/bash
201 bytes, 13 lines * preinst #!/bin/bash
218 bytes, 5 lines shlibs
72 bytes, 2 lines triggers
Package: apache-cloudberry-db-incubating
Version: 99.0.0-1-1.17dfe844
Architecture: amd64
Maintainer: Apache Cloudberry (Incubating) <[email protected]>
Installed-Size: 76800
Depends: curl, cgroup-tools, iputils-ping, krb5-multidev, less, libapr1, libbz2-1.0, libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, libxml2, libyaml-0-2, libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, libxerces-c3.2, locales, net-tools, openssh-client, openssh-server, openssl, python-six, python2.7, python2.7-dev, rsync, wget, zlib1g, libuv1
Provides: apache-cloudberry-db
Section: database
Description: Apache Cloudberry (incubating) is an advanced, open-source, massively
parallel processing (MPP) data warehouse developed from PostgreSQL and
Greenplum. It is designed for high-performance analytics on
large-scale data sets, offering powerful analytical capabilities and
enhanced security features.
Key Features:
- Massively parallel processing for optimized performance
- Advanced analytics for complex data processing
- Integration with ETL and BI tools
- Compatibility with multiple data sources and formats
- Enhanced security features
Apache Cloudberry supports both batch processing and real-time data
warehousing, making it a versatile solution for modern data
environments.
Apache Cloudberry (incubating) is an effort undergoing incubation at
the Apache Software Foundation (ASF), sponsored by the Apache
Incubator PMC.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other
successful ASF projects.
While incubation status is not necessarily a reflection of the
completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
cbadmin@cdw:~$
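A check like this could be automated in the install test. The following is a minimal sketch, not part of this PR: it splits a Depends value into package names and flags the ones that look like Python 2 packages. The function names and the naming heuristic are assumptions made for illustration; the heuristic will also match Python 3 helper packages such as python-is-python3, so treat its output as a hint.

```python
import re


def parse_depends(field: str) -> list[str]:
    """Split a Debian Depends value into bare package names."""
    names = []
    for clause in field.split(","):
        # Keep only the first alternative of "a | b".
        first = clause.split("|")[0]
        # Drop version constraints such as "(>= 1.2)" and arch qualifiers such as ":any".
        name = re.sub(r"\(.*?\)", "", first).strip().split(":")[0]
        if name:
            names.append(name)
    return names


# Assumed heuristic: "python", "python2", "python2.7", and any "python-*" or
# "python2*-*" name is treated as a Python 2 package; "python3*" never matches.
_PY2_NAME = re.compile(r"^python(2(\.\d+)?)?(-.*)?$")


def python2_packages(names: list[str]) -> list[str]:
    """Return the package names that look like Python 2 packages."""
    return [n for n in names if _PY2_NAME.match(n)]


if __name__ == "__main__":
    sample = "curl, python-six, python2.7, python2.7-dev, rsync, python3"
    print(python2_packages(parse_depends(sample)))
```

In CI, the Depends value could be read with `dpkg-deb -f <package>.deb Depends` and passed to `parse_depends`.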
On a relatively bare/clean Ubuntu 22.04 system, I installed Cloudberry from the deb file generated by the build. Running gpinitsystem failed because a runtime dependency (libprotobuf.so.23) was not found.
ubuntu@cdw:~$ source /usr/cloudberry-db/cloudberry-env.sh
ubuntu@cdw:~$ ldd /usr/cloudberry-db/lib/postgresql/pax.so
linux-vdso.so.1 (0x00007ffe559b6000)
libprotobuf.so.23 => not found
libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007e4a38315000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007e4a382f9000)
liblz4.so.1 => /lib/x86_64-linux-gnu/liblz4.so.1 (0x00007e4a382d9000)
libpostgres.so => /usr/local/cloudberry-db/lib/libpostgres.so (0x00007e4a36600000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007e4a36200000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007e4a37f19000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007e4a382b7000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007e4a35e00000)
libxerces-c-3.2.so => /lib/x86_64-linux-gnu/libxerces-c-3.2.so (0x00007e4a35a00000)
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007e4a382a2000)
libxml2.so.2 => /lib/x86_64-linux-gnu/libxml2.so.2 (0x00007e4a3581e000)
libpam.so.0 => /lib/x86_64-linux-gnu/libpam.so.0 (0x00007e4a38290000)
libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007e4a37e75000)
libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007e4a35200000)
libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007e4a3823c000)
libcurl-gnutls.so.4 => /lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007e4a3655e000)
libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x00007e4a364fe000)
/lib64/ld-linux-x86-64.so.2 (0x00007e4a383ea000)
libicuuc.so.70 => /lib/x86_64-linux-gnu/libicuuc.so.70 (0x00007e4a35005000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007e4a364d3000)
libaudit.so.1 => /lib/x86_64-linux-gnu/libaudit.so.1 (0x00007e4a364a5000)
libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007e4a36135000)
libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007e4a36476000)
libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007e4a38232000)
libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007e4a38224000)
libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007e4a3644c000)
libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007e4a36114000)
librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007e4a3642d000)
libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007e4a360a6000)
libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007e4a36092000)
libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007e4a3604c000)
libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007e4a34e1a000)
liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x00007e4a3603b000)
libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007e4a3602d000)
libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007e4a35de5000)
libicudata.so.70 => /lib/x86_64-linux-gnu/libicudata.so.70 (0x00007e4a33000000)
libcap-ng.so.0 => /lib/x86_64-linux-gnu/libcap-ng.so.0 (0x00007e4a35ddd000)
libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007e4a35dd6000)
libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007e4a35dc2000)
libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007e4a35674000)
libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007e4a35d7a000)
libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007e4a34d98000)
libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007e4a34c5d000)
libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007e4a35d62000)
libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007e4a35651000)
libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007e4a35d55000)
ubuntu@cdw:~$
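The same inspection can be scripted so that a missing shared library fails the install test early. This is a hedged sketch, not code from this PR: `missing_libraries` and `check_binary` are hypothetical helper names, and the parser assumes the usual `name => not found` form that ldd prints for unresolved libraries.

```python
import subprocess


def missing_libraries(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path: str) -> list[str]:
    """Run ldd on a binary or shared object and list its unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libraries(result.stdout)


if __name__ == "__main__":
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffe559b6000)\n"
        "\tlibprotobuf.so.23 => not found\n"
        "\tlibzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007e4a38315000)\n"
    )
    print(missing_libraries(sample))
```

A test step could call `check_binary` on pax.so after installing the package and fail if the returned list is not empty.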
Although not entirely related, I also noticed that the Ubuntu 22.04 test Dockerfile installs Cloudberry run-time dependencies. For the most part, test environments should rely on the deb file to pull in all necessary run-time packages.
As an example, the python3, rsync, and iproute2 packages should be pulled in automatically as run-time dependencies of the Cloudberry deb file. Here is a snippet from devops/deploy/docker/test/ubuntu22.04/Dockerfile that shows what I am referring to.
RUN apt-get update && \
apt-get install -y -qq \
htop \
bat \
silversearcher-ag \
vim \
wget \
git \
iproute2 \
iputils-ping \
lsof \
openssh-server \
pkg-config \
python3.10 \
python3-distutils \
python3-pip \
python3-setuptools \
rsync \
sudo \
tzdata && \
Again, reviewing the generated deb file installation process, I noticed the deb installation creates the gpadmin user. This account is not mandatory. It is up to the deployment team to determine the user account to be used to run the Cloudberry installation.
BTW, we need to set the new jobs as the required checks, including check-skip, build, and deb-install-test.
I would like to request your review of PR #1172. Once #1172 is merged, we can add deb-install-test to .asf.yaml.
Fixed all the issues. Yes, the dependencies were a big mess; I tried to fix them by comparing with the Rocky Linux dependencies. I also checked the installation on a base Ubuntu Docker image: I installed the Debian package there, created a cluster, and launched psql. The package description is now:
xifos@xifos-dev-jammy:~$ dpkg -I apache-cloudberry-db-incubating_99.0.0-1-1.9cca25d3_amd64.deb
new Debian package, version 2.0.
size 20349128 bytes: control archive=62233 bytes.
1935 bytes, 32 lines control
249906 bytes, 2629 lines md5sums
143 bytes, 12 lines * postinst #!/bin/bash
218 bytes, 5 lines shlibs
72 bytes, 2 lines triggers
Package: apache-cloudberry-db-incubating
Version: 99.0.0-1-1.9cca25d3
Architecture: amd64
Maintainer: Apache Cloudberry (Incubating) <[email protected]>
Installed-Size: 76817
Depends: curl, cgroup-tools, iputils-ping, iproute2, keyutils, krb5-multidev, less, libapr1, libbz2-1.0, libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, libxml2, libyaml-0-2, libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, libprotobuf23, libpsl5, libuv1, libxerces-c3.2, locales, lsof, lz4, net-tools, openssh-client, openssh-server, openssl, python3, rsync, wget, xz-utils, zlib1g
Provides: apache-cloudberry-db
Section: database
Description: Apache Cloudberry (incubating) is an advanced, open-source, massively
parallel processing (MPP) data warehouse developed from PostgreSQL and
Greenplum. It is designed for high-performance analytics on
large-scale data sets, offering powerful analytical capabilities and
enhanced security features.
Key Features:
- Massively parallel processing for optimized performance
- Advanced analytics for complex data processing
- Integration with ETL and BI tools
- Compatibility with multiple data sources and formats
- Enhanced security features
Apache Cloudberry supports both batch processing and real-time data
warehousing, making it a versatile solution for modern data
environments.
Apache Cloudberry (incubating) is an effort undergoing incubation at
the Apache Software Foundation (ASF), sponsored by the Apache
Incubator PMC.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other
successful ASF projects.
While incubation status is not necessarily a reflection of the
completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
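To make the change in dependencies easier to review, the two Depends fields quoted in this thread can be compared as sets. This is only an illustrative sketch; `split_depends` and `diff_depends` are hypothetical helpers, and the two strings are copied from the dpkg -I outputs above (with the terminal line wrap in the first one joined).

```python
def split_depends(field: str) -> set[str]:
    """Split a comma-separated Depends value into a set of package names."""
    return {name.strip() for name in field.split(",") if name.strip()}


def diff_depends(old: str, new: str) -> tuple[list[str], list[str]]:
    """Return (added, removed) package names, each sorted alphabetically."""
    old_set, new_set = split_depends(old), split_depends(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


OLD_DEPENDS = (
    "curl, cgroup-tools, iputils-ping, krb5-multidev, less, libapr1, libbz2-1.0, "
    "libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, libxml2, libyaml-0-2, "
    "libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, libxerces-c3.2, locales, "
    "net-tools, openssh-client, openssh-server, openssl, python-six, python2.7, "
    "python2.7-dev, rsync, wget, zlib1g, libuv1"
)
NEW_DEPENDS = (
    "curl, cgroup-tools, iputils-ping, iproute2, keyutils, krb5-multidev, less, "
    "libapr1, libbz2-1.0, libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, "
    "libxml2, libyaml-0-2, libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, "
    "libprotobuf23, libpsl5, libuv1, libxerces-c3.2, locales, lsof, lz4, net-tools, "
    "openssh-client, openssh-server, openssl, python3, rsync, wget, xz-utils, zlib1g"
)

if __name__ == "__main__":
    added, removed = diff_depends(OLD_DEPENDS, NEW_DEPENDS)
    print("added:  ", ", ".join(added))
    print("removed:", ", ".join(removed))
```

For these two lists the diff shows the three Python 2 packages removed, and python3, libprotobuf23, iproute2, and a few utilities added.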
I also added tests. It turned out that:
- I had an issue with finding *.so files, because the deb needs a special BUILD_DESTINATION that the tests use to find libraries. So I created symbolic links; see the additional step "Prepare DEB Environment".
- For now, I have not enabled the whole set of tests. Let's start with contrib/gpcontrib/installcheck_good and then move on.
For example, I added and then removed the resgroup tests because select check_cgroup_io_max(...) leads to a core dump. I will create an issue about it, but for now we need a working set of tests.
@tuhaihe @edespino Thank you for your review, we can continue here )
@tuhaihe I fixed .asf.yaml too, but I see 4 pending checks that are marked as required. It looks strange, since all tests are green; maybe https://github.com/apache/cloudberry/pull/1172 does not work as expected ...
Hi @leborchuk, #1172 has been reverted via #1414. You are welcome to rebase your PR.
+1. Please update the checks' names in .asf.yaml before merging.
Yep, rebased. Also, I deleted my changes in .asf.yaml completely. We need to make sure that my checks are stable; otherwise we will end up reverting changes again. I will add them to .asf.yaml later.
We can merge this pull request for now. If there are any questions, we can create new pull requests to address them. This pull request has already been postponed for too long.
Thanks for your great work again! @leborchuk ❤️