ci: add debian package Apache Cloudberry (incubating) build and test workflow

Open leborchuk opened this issue 2 months ago • 11 comments

Introduces a GitHub Actions workflow that builds the Debian package for Apache Cloudberry (incubating).

In https://github.com/apache/cloudberry/pull/1359 I added Docker build and test containers for Ubuntu 22.04. This PR adds a CI workflow that checks whether we can still create and install the Debian package on Ubuntu 22.04.

To do this:

  • use the devops package to build the Cloudberry binary inside the Docker build container
  • create a Debian package from the built binary
  • upload the artifacts to GitHub Actions
  • download the artifacts into the Ubuntu test container
  • install the downloaded Debian packages
  • check the packages for integrity
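
The last two steps could look roughly like the sketch below. It is not taken from the workflow in this PR: `install_and_check_deb` and the `APT_GET` override variable are hypothetical names, and treating any `dpkg --verify` output as a failure is an assumption about what "integrity" should mean here.

```shell
# Install a locally downloaded .deb and run basic integrity checks.
# $1: path to the .deb (apt-get needs a leading ./ or an absolute path)
# $2: package name as recorded in the control file
# APT_GET (hypothetical override) lets a test swap in another command.
install_and_check_deb() (
    deb_path="$1"
    pkg_name="$2"

    # apt-get, unlike plain `dpkg -i`, also resolves the Depends field.
    "${APT_GET:-apt-get}" install -y "$deb_path" || exit 1

    # Fails if dpkg has no record of the package.
    dpkg -s "$pkg_name" >/dev/null || exit 1

    # dpkg --verify prints a line for each file that fails a check;
    # any output is treated as a failure here.
    verify_output="$(dpkg --verify "$pkg_name")" || exit 1
    if [ -n "$verify_output" ]; then
        printf '%s\n' "$verify_output" >&2
        exit 1
    fi
)
```

In the test container this might be called as `install_and_check_deb ./apache-cloudberry-db-incubating_*.deb apache-cloudberry-db-incubating`, run as root.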

leborchuk avatar Oct 06 '25 12:10 leborchuk

@tuhaihe Here I want to add a test that checks whether the Debian package builds successfully. Please review.

leborchuk avatar Oct 07 '25 19:10 leborchuk

Cool, thanks! Will work with @edespino on this.

tuhaihe avatar Oct 08 '25 01:10 tuhaihe

Note that the Depends field still lists Python 2 dependencies (python-six, python2.7, python2.7-dev):

cbadmin@cdw:~$ dpkg -I apache-cloudberry-db-incubating_99.0.0-1-1.17dfe844_amd64.deb 
 new Debian package, version 2.0.
 size 20350386 bytes: control archive=62494 bytes.
    1899 bytes,    32 lines      control              
  249906 bytes,  2629 lines      md5sums              
     143 bytes,    12 lines   *  postinst             #!/bin/bash
     201 bytes,    13 lines   *  preinst              #!/bin/bash
     218 bytes,     5 lines      shlibs               
      72 bytes,     2 lines      triggers             
 Package: apache-cloudberry-db-incubating
 Version: 99.0.0-1-1.17dfe844
 Architecture: amd64
 Maintainer: Apache Cloudberry (Incubating) <[email protected]>
 Installed-Size: 76800
 Depends: curl, cgroup-tools, iputils-ping, krb5-multidev, less, libapr1, libbz2-1.0, libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, libxml2, libyaml-0-2, libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, libxerces-c3.2, locales, net-tools, openssh-client, openssh-server, openssl, python-six, python2.7, python2.7-dev, rsync, wget, zlib1g, libuv1
 Provides: apache-cloudberry-db
 Section: database
 Description: Apache Cloudberry (incubating) is an advanced, open-source, massively
   parallel processing (MPP) data warehouse developed from PostgreSQL and
   Greenplum. It is designed for high-performance analytics on
   large-scale data sets, offering powerful analytical capabilities and
   enhanced security features.
   Key Features:
     - Massively parallel processing for optimized performance
     - Advanced analytics for complex data processing
     - Integration with ETL and BI tools
     - Compatibility with multiple data sources and formats
     - Enhanced security features
   Apache Cloudberry supports both batch processing and real-time data
   warehousing, making it a versatile solution for modern data
   environments.
   Apache Cloudberry (incubating) is an effort undergoing incubation at
   the Apache Software Foundation (ASF), sponsored by the Apache
   Incubator PMC.
   Incubation is required of all newly accepted projects until a further
   review indicates that the infrastructure, communications, and decision
   making process have stabilized in a manner consistent with other
   successful ASF projects.
   While incubation status is not necessarily a reflection of the
   completeness or stability of the code, it does indicate that the
   project has yet to be fully endorsed by the ASF.
cbadmin@cdw:~$ 
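
A check like this could be automated in CI. The following is a minimal sketch and not part of the PR: `python2_deps` is a hypothetical helper, and the name patterns it matches (`python2...`, `python-...`) are an assumption based on the Depends line above.

```shell
# Print every entry of a Depends string that looks like a Python 2 package.
# $1: the comma-separated Depends value
python2_deps() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -E '^python(2|-)' || true
}

# Possible usage against a real package file:
#   python2_deps "$(dpkg-deb -f apache-cloudberry-db-incubating_*.deb Depends)"
```

`dpkg-deb -f <archive> Depends` prints only the value of that field, so its output can be passed to the helper directly. An empty result means no Python 2 entries were found.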

edespino avatar Oct 08 '25 04:10 edespino

On a relatively bare/clean Ubuntu 22.04 system, I installed Cloudberry from the deb file generated by the build. Running gpinitsystem failed because a runtime dependency (libprotobuf.so.23) was not found.

ubuntu@cdw:~$ source /usr/cloudberry-db/cloudberry-env.sh 
ubuntu@cdw:~$ ldd  /usr/cloudberry-db/lib/postgresql/pax.so 
        linux-vdso.so.1 (0x00007ffe559b6000)
        libprotobuf.so.23 => not found
        libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007e4a38315000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007e4a382f9000)
        liblz4.so.1 => /lib/x86_64-linux-gnu/liblz4.so.1 (0x00007e4a382d9000)
        libpostgres.so => /usr/local/cloudberry-db/lib/libpostgres.so (0x00007e4a36600000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007e4a36200000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007e4a37f19000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007e4a382b7000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007e4a35e00000)
        libxerces-c-3.2.so => /lib/x86_64-linux-gnu/libxerces-c-3.2.so (0x00007e4a35a00000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007e4a382a2000)
        libxml2.so.2 => /lib/x86_64-linux-gnu/libxml2.so.2 (0x00007e4a3581e000)
        libpam.so.0 => /lib/x86_64-linux-gnu/libpam.so.0 (0x00007e4a38290000)
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007e4a37e75000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007e4a35200000)
        libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007e4a3823c000)
        libcurl-gnutls.so.4 => /lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007e4a3655e000)
        libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x00007e4a364fe000)
        /lib64/ld-linux-x86-64.so.2 (0x00007e4a383ea000)
        libicuuc.so.70 => /lib/x86_64-linux-gnu/libicuuc.so.70 (0x00007e4a35005000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007e4a364d3000)
        libaudit.so.1 => /lib/x86_64-linux-gnu/libaudit.so.1 (0x00007e4a364a5000)
        libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007e4a36135000)
        libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007e4a36476000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007e4a38232000)
        libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007e4a38224000)
        libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007e4a3644c000)
        libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007e4a36114000)
        librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007e4a3642d000)
        libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007e4a360a6000)
        libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007e4a36092000)
        libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007e4a3604c000)
        libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007e4a34e1a000)
        liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x00007e4a3603b000)
        libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007e4a3602d000)
        libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007e4a35de5000)
        libicudata.so.70 => /lib/x86_64-linux-gnu/libicudata.so.70 (0x00007e4a33000000)
        libcap-ng.so.0 => /lib/x86_64-linux-gnu/libcap-ng.so.0 (0x00007e4a35ddd000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007e4a35dd6000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007e4a35dc2000)
        libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007e4a35674000)
        libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007e4a35d7a000)
        libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007e4a34d98000)
        libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007e4a34c5d000)
        libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007e4a35d62000)
        libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007e4a35651000)
        libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007e4a35d55000)
ubuntu@cdw:~$ 
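
This kind of problem can be caught mechanically by scanning `ldd` output for unresolved entries. A minimal sketch follows; the helper names `missing_libs` and `assert_no_missing_libs` are hypothetical and not part of the PR.

```shell
# Print the names of shared libraries that ldd reports as unresolved.
# Reads ldd output on stdin, so it can be exercised without a real binary.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Return non-zero (and list the libraries on stderr) if any are unresolved.
assert_no_missing_libs() {
    missing="$(missing_libs)"
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing" | sed 's/^/unresolved: /' >&2
        return 1
    fi
}

# Possible usage, with the path from the session above:
#   ldd /usr/cloudberry-db/lib/postgresql/pax.so | assert_no_missing_libs
```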

edespino avatar Oct 08 '25 05:10 edespino

Although not entirely related, I also noticed that the Ubuntu 22.04 test Dockerfile installs Cloudberry runtime dependencies itself. For the most part, test environments should rely on the deb file to pull in all necessary runtime packages.

As an example, the python3, rsync, and iproute2 packages should be pulled in automatically as runtime dependencies of the Cloudberry deb file. Here is a snippet from devops/deploy/docker/test/ubuntu22.04/Dockerfile that shows what I am referring to.

RUN apt-get update && \
    apt-get install -y -qq \
            htop \
            bat \
            silversearcher-ag \
            vim \
            wget \
            git \
            iproute2 \
            iputils-ping \
            lsof \
            openssh-server \
            pkg-config \
            python3.10 \
            python3-distutils \
            python3-pip \
            python3-setuptools \
            rsync \
            sudo \
            tzdata && \
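
One way to spot such overlap is to compare the packages the Dockerfile installs with the package's Depends field. This is only an illustrative sketch: `redundant_packages` is a hypothetical helper, and it assumes the Depends entries carry no version constraints.

```shell
# Print packages that are installed explicitly in the test image AND are
# already declared in the deb's Depends field.
# $1: whitespace-separated package names from the Dockerfile
# $2: comma-separated Depends value
redundant_packages() (
    deps="$(printf '%s\n' "$2" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
    for pkg in $1; do
        # -F fixed string, -x whole-line match, -q no output
        if printf '%s\n' "$deps" | grep -Fxq -- "$pkg"; then
            printf '%s\n' "$pkg"
        fi
    done
)
```

Packages it prints are candidates for removal from the test Dockerfile, provided the deb declares them correctly.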

edespino avatar Oct 08 '25 05:10 edespino

While reviewing the installation process of the generated deb file, I also noticed that it creates the gpadmin user. This account is not mandatory; it is up to the deployment team to decide which user account runs the Cloudberry installation.

edespino avatar Oct 08 '25 05:10 edespino

BTW, we need to set the new jobs as required checks, including check-skip, build, and deb-install-test.

I would also like to request your review of PR #1172. Once #1172 is merged, we can add deb-install-test to .asf.yaml.

tuhaihe avatar Oct 09 '25 03:10 tuhaihe

Fixed all the issues. Yes, the dependencies were a big mess; I tried to fix them by comparing against the Rocky Linux dependencies. I also checked the installation on a base Ubuntu Docker image: I installed the Debian package there, created a cluster, and launched psql. The package description is now:

xifos@xifos-dev-jammy:~$ dpkg -I apache-cloudberry-db-incubating_99.0.0-1-1.9cca25d3_amd64.deb
 new Debian package, version 2.0.
 size 20349128 bytes: control archive=62233 bytes.
    1935 bytes,    32 lines      control
  249906 bytes,  2629 lines      md5sums
     143 bytes,    12 lines   *  postinst             #!/bin/bash
     218 bytes,     5 lines      shlibs
      72 bytes,     2 lines      triggers
 Package: apache-cloudberry-db-incubating
 Version: 99.0.0-1-1.9cca25d3
 Architecture: amd64
 Maintainer: Apache Cloudberry (Incubating) <[email protected]>
 Installed-Size: 76817
 Depends: curl, cgroup-tools, iputils-ping, iproute2, keyutils, krb5-multidev, less, libapr1, libbz2-1.0, libcurl4, libcurl3-gnutls, libevent-2.1-7, libreadline8, libxml2, libyaml-0-2, libldap-2.5-0, libzstd1, libcgroup1, libssl3, libpam0g, libprotobuf23, libpsl5, libuv1, libxerces-c3.2, locales, lsof, lz4, net-tools, openssh-client, openssh-server, openssl, python3, rsync, wget, xz-utils, zlib1g
 Provides: apache-cloudberry-db
 Section: database
 Description: Apache Cloudberry (incubating) is an advanced, open-source, massively
   parallel processing (MPP) data warehouse developed from PostgreSQL and
   Greenplum. It is designed for high-performance analytics on
   large-scale data sets, offering powerful analytical capabilities and
   enhanced security features.
   Key Features:
     - Massively parallel processing for optimized performance
     - Advanced analytics for complex data processing
     - Integration with ETL and BI tools
     - Compatibility with multiple data sources and formats
     - Enhanced security features
   Apache Cloudberry supports both batch processing and real-time data
   warehousing, making it a versatile solution for modern data
   environments.
   Apache Cloudberry (incubating) is an effort undergoing incubation at
   the Apache Software Foundation (ASF), sponsored by the Apache
   Incubator PMC.
   Incubation is required of all newly accepted projects until a further
   review indicates that the infrastructure, communications, and decision
   making process have stabilized in a manner consistent with other
   successful ASF projects.
   While incubation status is not necessarily a reflection of the
   completeness or stability of the code, it does indicate that the
   project has yet to be fully endorsed by the ASF.

I also added tests. It turned out that:

  1. I had an issue with finding *.so files, because the deb needs a special BUILD_DESTINATION, which the tests use to find libraries. So I created symbolic links; see the additional step "Prepare DEB Environment".
  2. For now, I have not enabled the whole set of tests. Let's start with contrib/gpcontrib/installcheck_good and then move on.

For example, I added and then removed the resgroup tests, because select check_cgroup_io_max(...) leads to a core dump. I will create an issue about it, but for now we need a working set of tests.

@tuhaihe @edespino Thank you for your review, we can continue here )

leborchuk avatar Oct 24 '25 07:10 leborchuk

@tuhaihe I fixed .asf.yaml too, but I see 4 pending checks that are marked as required. It looks strange, since all tests are green; maybe https://github.com/apache/cloudberry/pull/1172 does not work as expected ...

leborchuk avatar Oct 24 '25 07:10 leborchuk

@tuhaihe I fixed .asf.yaml too, but I see 4 pending checks that are marked as required. It looks strange, since all tests are green; maybe #1172 does not work as expected ...

Hi @leborchuk, #1172 has been reverted via #1414. Please try rebasing your PR.

tuhaihe avatar Oct 24 '25 08:10 tuhaihe

+1. Please update the check names in .asf.yaml before merging.

Yep, rebased. Also, I removed my changes to .asf.yaml completely. We first need to make sure that my checks are stable; otherwise we will end up having to revert changes again. I will add them to .asf.yaml later.

leborchuk avatar Oct 24 '25 12:10 leborchuk

We can merge this pull request for now. If there are any remaining questions, we can create new pull requests to address them; this pull request has already been postponed for too long.

Thanks for your great work again! @leborchuk ❤️

tuhaihe avatar Nov 21 '25 09:11 tuhaihe