GH-38659: [CI][MATLAB][Packaging] Add MATLAB `packaging` task to crossbow `tasks.yml`
Rationale for this change
Per the following mailing list discussion:
https://lists.apache.org/thread/0xyow40h7b1bptsppb0rxd4g9r1xpmh6
to integrate the MATLAB interface code with the existing Arrow release tooling, we first need to add a task to the crossbow packaging group. This packaging task will automatically create an MLTBX file (the MATLAB equivalent of a Python binary wheel or Ruby gem) that can be installed via a "one-click" workflow in MATLAB. This will enable MATLAB users to install the interface without needing to build from source.
What changes are included in this PR?
- Added a `matlab` task to the `packaging` group in `dev/tasks/tasks.yml`.
- Added a new GitHub Actions workflow called `dev/tasks/matlab/github.yml` which builds the MATLAB interface code on all platforms (Windows, macOS, and Ubuntu 20.04) and packages the generated build artifacts into a single MLTBX file using `matlab.addons.toolbox.packageToolbox`.
- Changed the GitHub-hosted runner to `ubuntu-20.04` from `ubuntu-latest` for the MATLAB CI check (i.e. `.github/workflows/matlab.yml`). The rationale for this change is that we primarily develop and qualify against Debian 11 locally, but the CI check has been building against `ubuntu-latest` (i.e. `ubuntu-22.04`). There are two issues with using `ubuntu-22.04`. The first is that the version of `GLIBC` shipped with `ubuntu-22.04` is not fully compatible with the version of `GLIBC` shipped with Debian 11. This results in a runtime linker error when qualifying the packaged MATLAB interface code locally on Debian 11. The second issue with using `ubuntu-22.04` is that the system version of `GLIBCXX` is not fully compatible with the version of `GLIBCXX` bundled with MATLAB R2023a (this is a relatively common issue - e.g. see: https://www.mathworks.com/matlabcentral/answers/1907290-how-to-manually-select-the-libstdc-library-to-use-to-resolve-a-version-glibcxx_-not-found). Previously, we worked around this issue in GitHub Actions by using `LD_PRELOAD` before starting up MATLAB to run the unit tests. On the other hand, the version of `GLIBCXX` shipped with `ubuntu-20.04` is binary compatible with the version bundled with MATLAB R2023a. Therefore, we believe it would be better to use `ubuntu-20.04` in the MATLAB CI checks for the time being until we can qualify the MATLAB interface against `ubuntu-22.04`.
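
For anyone who wants to reproduce the `GLIBCXX` mismatch described above, here is a rough sketch of a check, roughly equivalent to running `strings <lib> | grep GLIBCXX` on two libraries and comparing the results. It is not part of this PR. The function names are hypothetical, and scanning raw bytes for version tags is only a heuristic (it does not parse the ELF version sections), so treat the output as a hint rather than a definitive answer.

```python
import re

# Matches version tags such as GLIBCXX_3.4.30 embedded in a shared library.
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(blob: bytes) -> list[tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions found in raw library bytes."""
    found = {
        tuple(int(part) for part in match.group(1).decode("ascii").split("."))
        for match in _GLIBCXX_TAG.finditer(blob)
    }
    return sorted(found)


def missing_versions(required: bytes, provided: bytes) -> list[tuple[int, ...]]:
    """Return versions that appear in `required` but not in `provided`.

    Heuristic only: `required` would be the bytes of a binary built against the
    system libstdc++, and `provided` the bytes of the libstdc++ that is actually
    loaded at runtime (for example, the copy bundled with MATLAB).
    """
    have = set(glibcxx_versions(provided))
    return [version for version in glibcxx_versions(required) if version not in have]
```

In practice the two byte strings would come from something like `pathlib.Path(...).read_bytes()` on the built library and on the bundled `libstdc++.so.6`; the exact paths depend on the machine and the MATLAB install location. A non-empty result suggests the "version `GLIBCXX_...' not found" class of error.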
Are these changes tested?
Yes.
- Successfully submitted a crossbow `packaging` job for the MATLAB interface by commenting `@github-actions crossbow submit matlab`. Example of a successful packaging job: https://github.com/ursacomputing/crossbow/actions/runs/6893506432/job/18753227453.
- Manually installed the resulting MLTBX file on macOS, Windows, Debian 11, and Ubuntu 20.04. Ran all tests under `matlab/test` using `runtests . IncludeSubFolders 1`.
Are there any user-facing changes?
No.
Notes
- While qualifying, we discovered that MATLAB's programmatic packaging interface does not properly include symbolic link files in the packaged MLTBX file. We've reported this bug to the relevant MathWorks development team. As a temporary workaround, we included a step to change the expected name of the Arrow C++ libraries (using `patchelf`/`install_name_tool`) which `libarrowproxy.so`/`libarrowproxy.dylib` depends on to `libarrow.so.1500.0.0`/`libarrow.1500.0.0.dylib` instead of `libarrow.so.1500`/`libarrow.1500.dylib`, respectively. Once this bug is resolved, we will remove this step from the workflow.
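
A minimal sketch of what the renaming workaround above amounts to, written as a small helper that only builds the command line (it does not run anything). This is an illustration, not the actual step in `dev/tasks/matlab/github.yml`. The helper name `relink_command` is hypothetical. On macOS, `install_name_tool -change` needs the old name exactly as it is recorded in the library, so the `@rpath/` prefix shown in the usage note is an assumption.

```python
def relink_command(platform: str, proxy_lib: str, old_name: str, new_name: str) -> list[str]:
    """Build the command that repoints a dependency of `proxy_lib` from old_name to new_name.

    platform: "linux" (uses patchelf) or "darwin" (uses install_name_tool).
    """
    if platform == "linux":
        # patchelf --replace-needed OLD NEW FILE rewrites a DT_NEEDED entry.
        return ["patchelf", "--replace-needed", old_name, new_name, proxy_lib]
    if platform == "darwin":
        # install_name_tool -change OLD NEW FILE rewrites a dependent library's install name.
        return ["install_name_tool", "-change", old_name, new_name, proxy_lib]
    raise ValueError(f"unsupported platform: {platform!r}")
```

For example, `relink_command("linux", "libarrowproxy.so", "libarrow.so.1500", "libarrow.so.1500.0.0")` produces the `patchelf` invocation, and `relink_command("darwin", "libarrowproxy.dylib", "@rpath/libarrow.1500.dylib", "@rpath/libarrow.1500.0.0.dylib")` produces the `install_name_tool` one. The resulting list can be passed to `subprocess.run(..., check=True)`.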
Future Directions
- Add tooling to upload release candidate (RC) MLTBX files to apache/arrow's GitHub Releases area and mark them as "Prerelease". In other words, modify https://github.com/apache/arrow/blob/main/dev/release/05-binary-upload.sh.
- Add a post-release script to upload release MLTBX files to apache/arrow's GitHub Releases area (similar to how https://github.com/apache/arrow/blob/main/dev/release/post-09-python.sh works).
- Enable nightly builds for the MATLAB interface.
- Document how to qualify a MATLAB Arrow interface release.
- Enable building and testing the MATLAB Arrow interface on multiple Ubuntu distributions simultaneously (e.g. 20.04 and 22.04).
- Closes: #38659
@github-actions crossbow submit matlab
Revision: 54f771586f27e8c768e491c60c2546a463b8d122
Submitted crossbow builds: ursacomputing/crossbow @ actions-e85d018cce
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: e7e73375502702e664b5630ce8278c76e06f3372
Submitted crossbow builds: ursacomputing/crossbow @ actions-e9322b54d7
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 698385f54ea88ddc4c1ca755af5af77fe0673577
Submitted crossbow builds: ursacomputing/crossbow @ actions-0a6cd1c971
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 02253b78f98ebfb10c9863f786d8c4637e18938d
Submitted crossbow builds: ursacomputing/crossbow @ actions-db9baeddea
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 45db0635c762b4ad12d698abfe028a34fccc1fef
Submitted crossbow builds: ursacomputing/crossbow @ actions-08f7c2e125
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 481bc0c8a8f077cabe414845a2ccfe2a61a676b2
Submitted crossbow builds: ursacomputing/crossbow @ actions-6b603a1bef
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 6f8246dceb50ec388425e1eaeb743b5c42d8effd
Submitted crossbow builds: ursacomputing/crossbow @ actions-99c4ca0454
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: b100c5f73a5fd8bdbde94717df8edcdc6656b12f
Submitted crossbow builds: ursacomputing/crossbow @ actions-27f3584761
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 4f7365f3ed3d2bfed3310536a6e5ae96b708c5e0
Submitted crossbow builds: ursacomputing/crossbow @ actions-94bb08d1ec
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 2cff4db0d37cb344383c049cac1a12c8a6f7d634
Submitted crossbow builds: ursacomputing/crossbow @ actions-f0fa84c1be
| Task | Status |
|---|---|
| matlab |
@github-actions crossbow submit matlab
Revision: 35598744b784da8aa014bedd83446beccb47b51e
Submitted crossbow builds: ursacomputing/crossbow @ actions-a06dc1513b
| Task | Status |
|---|---|
| matlab |
Wow what a cool feature:
[!WARNING]
I didn't know about this :D
Let me know if I can help :rocket:
Thank you so much for offering to help @assignUser! :) We will definitely reach out to you when needed. We think we are getting pretty close to something that mostly works... but we might run into unexpected issues.
Here is a brief summary of the current state:
- We have a MATLAB script (`packageMatlabInterface.m`) that automatically packages an MLTBX file from a MATLAB source directory. This can be called from a GitHub Actions workflow.
- We've added a `crossbow` task named `matlab` (name TBD) which builds the MATLAB interface code on Windows, macOS, and Linux, compresses all the build artifacts, uploads them to the GitHub Artifacts area, downloads them, and then packages them into an MLTBX file using `packageMatlabInterface.m`.
- The last step should involve this MLTBX file being uploaded to JFrog Artifactory using `macros.github_upload_releases`.
- We've also started working on a post-release script, `16-post-matlab.sh`, but haven't been able to fully test it because we haven't yet managed to upload artifacts to JFrog Artifactory. Our working assumption has been that we won't have the permissions to upload to Artifactory, but we wanted to at least get to the point where the upload would happen in the CI workflow before asking others for help.
Hopefully this makes some sense.
@github-actions crossbow submit matlab
Revision: fa6d1b2ff6b854fd2aefbd3e808d09c3b1171116
Submitted crossbow builds: ursacomputing/crossbow @ actions-0d4aa5a62e
| Task | Status |
|---|---|
| matlab |
`macros.github_upload_releases` uploads the artifacts to the crossbow repo's release area, not to JFrog Artifactory. From there we can use archery to download them and upload them to Artifactory.
IIRC every committer can upload things to Artifactory, either via the web GUI or via curl (I'm not certain; check the other release scripts). We would only upload things there for releases/RCs (uploading now for testing would also be fine). If you want to provide a 'nightly' version of the MLTBX files, you can check out the way we upload nightlies for R and Java to https://nightlies.apache.org in `.github/workflows/r|java_nightly.yml`.
We could of course also upload the nightly builds to the mathworks/arrow repo directly from the nightly job by adding a fine-grained PAT with the necessary permission to the crossbow repo. But I don't know how the MATLAB ecosystem handles dev/nightly versions.
FYI: If we want to provide nightly packages, we should follow the following ASF requirement: https://www.apache.org/legal/release-policy.html#publication
> Projects SHALL publish official releases and SHALL NOT publish unreleased materials outside the development community.
>
> During the process of developing software and preparing a release, various packages are made available to the development community for testing purposes. Projects MUST direct outsiders towards official releases rather than raw source repositories, nightly builds, snapshots, release candidates, or any other similar packages. Projects SHOULD make available developer resources to support individuals actively participating in development or following the dev list and thus aware of the conditions placed on unreleased materials.
> macros.github_upload_releases this uploads the artifacts to the crossbow repo release area, not the jfrog artifactory. From there we can use archery to download them and upload them to the artifactory.
Ah, sorry for the confusion! That makes more sense. We didn't realize the artifacts were getting released via GitHub Releases in ursacomputing/crossbow first.
> IIRC every committer can upload things to the artifactory, either via the webgui or via curl (? check the other release scripts). We would only upload things there for releases/rcs (for testing now would also be fine). If you want to provide a 'nightly' version of the mtlbx files you can checkout the way we upload nightlies for r and java to https://nightlies.apache.org/ in .github/workflows/r|java_nightly.yml
It's good to hear that committers can upload to artifactory.
We are certainly interested in "nightly" builds in the long term, so thank you for pointing us in the direction of the R and Java nightly workflows for reference! This is really useful.
> FYI: If we want to provide nightly packages, we should follow the following ASF requirement: https://www.apache.org/legal/release-policy.html#publication
Thanks for pointing this out @kou. We will make sure we account for these ASF policies whenever we work on the "nightly" builds and not distribute the nightly builds.
Our interpretation of this is that we should not distribute "nightly" builds of the MATLAB interface through mathworks/arrow or other non-ASF operated infrastructure. Rather, we should distribute them via https://nightlies.apache.org. Is our understanding correct?