GH-38659: [CI][MATLAB][Packaging] Add MATLAB `packaging` task to crossbow `tasks.yml`

Open kevingurney opened this issue 2 years ago • 76 comments

Rationale for this change

Per the following mailing list discussion:

https://lists.apache.org/thread/0xyow40h7b1bptsppb0rxd4g9r1xpmh6

to integrate the MATLAB interface code with the existing Arrow release tooling, we first need to add a task to the packaging group in crossbow. This packaging task will automatically create an MLTBX file (the MATLAB equivalent of a Python binary wheel or Ruby gem) that can be installed via a "one-click" workflow in MATLAB. This will enable MATLAB users to install the interface without needing to build from source.

What changes are included in this PR?

  1. Added a matlab task to the packaging group in dev/tasks/tasks.yml.
  2. Added a new GitHub Actions workflow called dev/tasks/matlab/github.yml which builds the MATLAB interface code on all platforms (Windows, macOS, and Ubuntu 20.04) and packages the generated build artifacts into a single MLTBX file using matlab.addons.toolbox.packageToolbox.
  3. Changed the GitHub-hosted runner to ubuntu-20.04 from ubuntu-latest for the MATLAB CI check (i.e. .github/workflows/matlab.yml). The rationale for this change is that we primarily develop and qualify against Debian 11 locally, but the CI check has been building against ubuntu-latest (i.e. ubuntu-22.04). There are two issues with using ubuntu-22.04. The first is that the version of GLIBC shipped with ubuntu-22.04 is not fully compatible with the version of GLIBC shipped with Debian 11. This results in a runtime linker error when qualifying the packaged MATLAB interface code locally on Debian 11. The second issue with using ubuntu-22.04 is that the system version of GLIBCXX is not fully compatible with the version of GLIBCXX bundled with MATLAB R2023a (this is a relatively common issue - e.g. see: https://www.mathworks.com/matlabcentral/answers/1907290-how-to-manually-select-the-libstdc-library-to-use-to-resolve-a-version-glibcxx_-not-found). Previously, we worked around this issue in GitHub Actions by using LD_PRELOAD before starting up MATLAB to run the unit tests. On the other hand, the version of GLIBCXX shipped with ubuntu-20.04 is binary compatible with the version bundled with MATLAB R2023a. Therefore, we believe it would be better to use ubuntu-20.04 in the MATLAB CI checks for the time being until we can qualify the MATLAB interface against ubuntu-22.04.
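
To illustrate the kind of mismatch described in item 3, here is a minimal Python sketch of checking whether a binary's required versioned symbols are all defined by the runtime library it will be loaded against. The function names and the version lists are hypothetical and purely illustrative; they are not measured values for MATLAB R2023a, Debian 11, or either Ubuntu release. In practice the tags would come from inspecting the binaries (for example with `strings` or `objdump`).

```python
from __future__ import annotations

import re


def parse_symbol_version(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'GLIBCXX_3.4.30' or 'GLIBC_2.35' into a tuple of ints."""
    match = re.fullmatch(r"(?:GLIBCXX|GLIBC)_(\d+(?:\.\d+)*)", tag)
    if match is None:
        raise ValueError(f"not a versioned symbol tag: {tag!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def missing_versions(required: list[str], provided: list[str]) -> list[str]:
    """Return required tags that the runtime library does not define.

    Any tag in the result would surface as a "version ... not found"
    error when the binary is loaded. Results are sorted numerically.
    """
    missing = set(required) - set(provided)
    return sorted(missing, key=parse_symbol_version)


if __name__ == "__main__":
    # Illustrative values only (assumption, not real measurements).
    required_by_binary = ["GLIBCXX_3.4.21", "GLIBCXX_3.4.29", "GLIBCXX_3.4.30"]
    provided_by_runtime = ["GLIBCXX_3.4.21", "GLIBCXX_3.4.28"]
    print(missing_versions(required_by_binary, provided_by_runtime))
```

An empty result means every required tag is defined by the runtime library, which is the situation this PR aims for by building on ubuntu-20.04.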

Are these changes tested?

Yes.

  1. Successfully submitted a crossbow packaging job for the MATLAB interface by commenting @github-actions crossbow submit matlab. Example of a successful packaging job: https://github.com/ursacomputing/crossbow/actions/runs/6893506432/job/18753227453.
  2. Manually installed the resulting MLTBX file on macOS, Windows, Debian 11, and Ubuntu 20.04. Ran all tests under matlab/test using runtests . IncludeSubFolders 1.
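
For anyone who wants to script the manual test run in item 2, a rough Python sketch of building a non-interactive MATLAB invocation follows. It assumes `matlab` is on the PATH and uses the `-batch` startup option together with `runtests` and `assertSuccess`; the helper name `matlab_test_command` is hypothetical, and this is not necessarily how the workflow in this PR runs the tests.

```python
from __future__ import annotations


def matlab_test_command(test_dir: str) -> list[str]:
    """Build an argv list that runs all tests under test_dir, including subfolders.

    assertSuccess raises an error if any test failed, so a -batch session
    exits with a non-zero status in that case.
    """
    # Escape single quotes so the path is a valid MATLAB char literal.
    quoted = test_dir.replace("'", "''")
    statement = (
        f"results = runtests('{quoted}', 'IncludeSubfolders', true); "
        "assertSuccess(results);"
    )
    return ["matlab", "-batch", statement]


if __name__ == "__main__":
    # Print the command instead of running it, since MATLAB may not be installed.
    print(matlab_test_command("matlab/test"))
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine where MATLAB and the MLTBX file are installed.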

Are there any user-facing changes?

No.

Notes

  1. While qualifying, we discovered that MATLAB's programmatic packaging interface does not properly include symbolic link files in the packaged MLTBX file. We've reported this bug to the relevant MathWorks development team. As a temporary workaround, we included a step that uses patchelf/install_name_tool to change the name of the Arrow C++ library that libarrowproxy.so/libarrowproxy.dylib depends on from libarrow.so.1500/libarrow.1500.dylib to libarrow.so.1500.0.0/libarrow.1500.0.0.dylib, respectively. Once this bug is resolved, we will remove this step from the workflow.
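
The workaround in the note above can be sketched roughly as follows. This Python helper only builds the command lines; it does not run them. It relies on `patchelf --replace-needed` and `install_name_tool -change`, both of which take the old name, the new name, and the file to patch. The helper name and the `@rpath/` prefix in the macOS example are assumptions for illustration, since the exact install name recorded in libarrowproxy.dylib is not stated here and the actual workflow step may differ.

```python
from __future__ import annotations


def rename_dependency_command(
    binary: str, old_name: str, new_name: str, platform: str
) -> list[str]:
    """Build the command that repoints one shared-library dependency of binary."""
    if platform == "linux":
        # patchelf rewrites the matching DT_NEEDED entry of an ELF file.
        return ["patchelf", "--replace-needed", old_name, new_name, binary]
    if platform == "macos":
        # install_name_tool rewrites a dependent library's install name
        # in a Mach-O file; old_name must match the recorded name exactly.
        return ["install_name_tool", "-change", old_name, new_name, binary]
    raise ValueError(f"unsupported platform: {platform!r}")


if __name__ == "__main__":
    print(rename_dependency_command(
        "libarrowproxy.so", "libarrow.so.1500", "libarrow.so.1500.0.0", "linux"
    ))
    # The @rpath/ prefix is an assumption about how the dependency is recorded.
    print(rename_dependency_command(
        "libarrowproxy.dylib",
        "@rpath/libarrow.1500.dylib",
        "@rpath/libarrow.1500.0.0.dylib",
        "macos",
    ))
```

Pointing the dependency at the fully versioned file name avoids the need for the symbolic links that the packaging interface currently drops.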

Future Directions

  1. Add tooling to upload release candidate (RC) MLTBX files to apache/arrow's GitHub Releases area and mark them as "Prerelease". In other words, modify https://github.com/apache/arrow/blob/main/dev/release/05-binary-upload.sh.
  2. Add a post-release script to upload release MLTBX files to apache/arrow's GitHub Releases area (similar to how https://github.com/apache/arrow/blob/main/dev/release/post-09-python.sh works).
  3. Enable nightly builds for the MATLAB interface.
  4. Document how to qualify a MATLAB Arrow interface release.
  5. Enable building and testing the MATLAB Arrow interface on multiple Ubuntu distributions simultaneously (e.g. 20.04 and 22.04).
  • Closes: #38659

kevingurney avatar Nov 09 '23 20:11 kevingurney

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 54f771586f27e8c768e491c60c2546a463b8d122

Submitted crossbow builds: ursacomputing/crossbow @ actions-e85d018cce

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: e7e73375502702e664b5630ce8278c76e06f3372

Submitted crossbow builds: ursacomputing/crossbow @ actions-e9322b54d7

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 698385f54ea88ddc4c1ca755af5af77fe0673577

Submitted crossbow builds: ursacomputing/crossbow @ actions-0a6cd1c971

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 02253b78f98ebfb10c9863f786d8c4637e18938d

Submitted crossbow builds: ursacomputing/crossbow @ actions-db9baeddea

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 45db0635c762b4ad12d698abfe028a34fccc1fef

Submitted crossbow builds: ursacomputing/crossbow @ actions-08f7c2e125

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 481bc0c8a8f077cabe414845a2ccfe2a61a676b2

Submitted crossbow builds: ursacomputing/crossbow @ actions-6b603a1bef

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: 6f8246dceb50ec388425e1eaeb743b5c42d8effd

Submitted crossbow builds: ursacomputing/crossbow @ actions-99c4ca0454

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 20:11 kevingurney

Revision: b100c5f73a5fd8bdbde94717df8edcdc6656b12f

Submitted crossbow builds: ursacomputing/crossbow @ actions-27f3584761

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 20:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 21:11 kevingurney

Revision: 4f7365f3ed3d2bfed3310536a6e5ae96b708c5e0

Submitted crossbow builds: ursacomputing/crossbow @ actions-94bb08d1ec

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 21:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 21:11 kevingurney

Revision: 2cff4db0d37cb344383c049cac1a12c8a6f7d634

Submitted crossbow builds: ursacomputing/crossbow @ actions-f0fa84c1be

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 21:11 github-actions[bot]

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 21:11 kevingurney

Revision: 35598744b784da8aa014bedd83446beccb47b51e

Submitted crossbow builds: ursacomputing/crossbow @ actions-a06dc1513b

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 21:11 github-actions[bot]

Wow what a cool feature:

[!WARNING]
I didn't know about this :D

Let me know if I can help :rocket:

assignUser avatar Nov 09 '23 22:11 assignUser

Thank you so much for offering to help @assignUser! :) We will definitely reach out to you when needed. We think we are getting pretty close to something that mostly works... but we might run into unexpected issues.

Here is a brief summary of the current state:

  1. We have a MATLAB script (packageMatlabInterface.m) that automatically packages an MLTBX file from a MATLAB source directory. This can be called from a GitHub Actions workflow.
  2. We've added a crossbow task named matlab (name TBD) which builds the MATLAB interface code on Windows, macOS, and Linux, compresses all the build artifacts, uploads to the GitHub Artifacts area, downloads them, and then packages them into an MLTBX file using packageMatlabInterface.m.
  3. The last step should involve this MLTBX file being uploaded to JFrog Artifactory using macros.github_upload_releases.
  4. We've also started working on a post-release script, 16-post-matlab.sh, but we haven't been able to fully test it because we haven't yet managed to upload artifacts to JFrog Artifactory. Our working assumption has been that we won't have the permissions to upload to Artifactory, but we wanted to at least get to the point where the upload would be happening in the CI workflow before asking others for help.

Hopefully this makes some sense.

kevingurney avatar Nov 09 '23 22:11 kevingurney

@github-actions crossbow submit matlab

kevingurney avatar Nov 09 '23 22:11 kevingurney

Revision: fa6d1b2ff6b854fd2aefbd3e808d09c3b1171116

Submitted crossbow builds: ursacomputing/crossbow @ actions-0d4aa5a62e

Task Status
matlab Github Actions

github-actions[bot] avatar Nov 09 '23 22:11 github-actions[bot]

macros.github_upload_releases uploads the artifacts to the crossbow repo release area, not to JFrog Artifactory. From there, we can use archery to download them and upload them to Artifactory.

IIRC every committer can upload things to Artifactory, either via the web GUI or via curl (? check the other release scripts). We would only upload things there for releases/RCs (for testing now would also be fine). If you want to provide a 'nightly' version of the MLTBX files, you can check out the way we upload nightlies for R and Java to https://nightlies.apache.org in .github/workflows/r|java_nightly.yml

assignUser avatar Nov 09 '23 22:11 assignUser

We could of course also upload the nightly builds to the mathworks/arrow repo directly from the nightly job by adding a fine-grained PAT with the necessary permissions to the crossbow repo. But I don't know how the MATLAB ecosystem handles dev/nightly versions.

assignUser avatar Nov 09 '23 22:11 assignUser

FYI: If we want to provide nightly packages, we should follow the following ASF requirement: https://www.apache.org/legal/release-policy.html#publication

Projects SHALL publish official releases and SHALL NOT publish unreleased materials outside the development community.

During the process of developing software and preparing a release, various packages are made available to the development community for testing purposes. Projects MUST direct outsiders towards official releases rather than raw source repositories, nightly builds, snapshots, release candidates, or any other similar packages. Projects SHOULD make available developer resources to support individuals actively participating in development or following the dev list and thus aware of the conditions placed on unreleased materials.

kou avatar Nov 10 '23 01:11 kou

macros.github_upload_releases uploads the artifacts to the crossbow repo release area, not to JFrog Artifactory. From there, we can use archery to download them and upload them to Artifactory.

Ah, sorry for the confusion! That makes more sense. We didn't realize the artifacts were getting released via GitHub Releases in ursacomputing/crossbow first.

IIRC every committer can upload things to Artifactory, either via the web GUI or via curl (? check the other release scripts). We would only upload things there for releases/RCs (for testing now would also be fine). If you want to provide a 'nightly' version of the MLTBX files, you can check out the way we upload nightlies for R and Java to https://nightlies.apache.org/ in .github/workflows/r|java_nightly.yml

It's good to hear that committers can upload to Artifactory.

We are certainly interested in "nightly" builds in the long term, so thank you for pointing us in the direction of the R and Java nightly workflows for reference! This is really useful.

FYI: If we want to provide nightly packages, we should follow the following ASF requirement: https://www.apache.org/legal/release-policy.html#publication

Thanks for pointing this out @kou. We will make sure to account for these ASF policies whenever we work on the "nightly" builds, and we will not publish nightly builds outside the development community.

Our interpretation of this is that we should not distribute "nightly" builds of the MATLAB interface through mathworks/arrow or other non-ASF operated infrastructure. Rather, we should distribute them via https://nightlies.apache.org. Is our understanding correct?

kevingurney avatar Nov 10 '23 17:11 kevingurney