ARROW-15691: [Dev] Update archery to work with either master or main as default branch
Overview
The goal of this pull request is to update archery to work with a repository default branch named master or main, as part of the effort to rename the Apache Arrow repository's default branch to main. The parent Jira ticket can be found here.
Implementation
- Update the language of the top-level `archery`, `crossbow`, and `docker` command line interface code to reference the mainline development branch (default git branch) generically.
- Update comments that reference the `master` branch.
- Update the `dev/archery/archery/docker/tests` files that reference the `dask` and `pandas` repositories' default branches. Both repositories currently use `main` as the default branch.
- Update the `crossbow` benchmarking examples to generically specify the `<default-branch>` rather than a hard-coded value.
- In `.github/workflows/integration.yml`, add an environment variable `DEFAULT_BRANCH` to the `archery` command in the "Execute Docker Build" step, so that `archery` can reliably access the default branch value.
- In `.github/workflows/archery.yml`, add an environment variable `DEFAULT_BRANCH` for all steps. This environment variable was already used by the `Git Fixup` step. It will also be used by the `Archery Unittests` step.
- Add a property, `default_branch_name`, to the `Repo` class in `dev/archery/archery/crossbow/core.py` for computing the default branch name.
  - If specified, the `DEFAULT_BRANCH` environment variable takes precedence in determining the default branch name (this is for qualifying in CI).
  - Otherwise, `pygit2` is used to get the default branch name via the Apache Arrow repository's `origin` remote `HEAD` reference. This is a heuristic, but in most cases, the `HEAD` reference of the remote points to the default branch.
- Add a cached property, `default_branch`, to the `Release` class in `dev/archery/archery/release/core.py` for computing the default branch name. Similar to the `default_branch_name` property for `Repo` in `archery/archery/crossbow/core.py`:
  - If specified, the `DEFAULT_BRANCH` environment variable takes precedence in determining the default branch name (this is for qualifying in CI).
  - Otherwise, similar to the previous step, `GitPython` is used to get the default branch name via the Apache Arrow repository's `origin` remote `HEAD` reference.
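The lookup order described above (environment variable first, then the `origin` remote `HEAD`) can be illustrated with a minimal sketch. This is not the code from this pull request: it shells out to the `git` CLI instead of using `pygit2` or `GitPython`, and the names `origin_head_branch`, `resolve_default_branch`, and `RepoSketch` are hypothetical, chosen only for illustration.

```python
import os
import subprocess
from functools import cached_property


def origin_head_branch(repo_path="."):
    """Return the branch that refs/remotes/origin/HEAD points to.

    Heuristic: assumes the remote HEAD tracks the default branch.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "symbolic-ref", "--short",
         "refs/remotes/origin/HEAD"],
        check=True, capture_output=True, text=True,
    )
    # Output looks like "origin/main"; drop the remote name prefix.
    return result.stdout.strip().split("/", 1)[1]


def resolve_default_branch(environ=None, lookup=origin_head_branch):
    """Use DEFAULT_BRANCH if set and non-empty, otherwise call lookup()."""
    environ = os.environ if environ is None else environ
    return environ.get("DEFAULT_BRANCH") or lookup()


class RepoSketch:
    """Hypothetical stand-in for a class exposing a cached default branch."""

    def __init__(self, path="."):
        self.path = path

    @cached_property
    def default_branch(self):
        # Computed once per instance on first access, then reused.
        return resolve_default_branch(
            lookup=lambda: origin_head_branch(self.path)
        )
```

In this sketch, a clone whose `refs/remotes/origin/HEAD` is not set makes the fallback raise `subprocess.CalledProcessError`; running `git remote set-head origin --auto` in the clone is one way to populate that reference. Setting `DEFAULT_BRANCH` avoids the lookup entirely, which is why it suits CI.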
Out of scope:
- There are remaining instances of `master` in the test fixture files in `dev/archery/archery/test/fixtures`. It appears that the data only refers to external repositories, such as `ursa-labs/ursabot`, which currently uses `master`, so these instances were not modified.
Testing
- Ran the `archery` and `crossbow` commands in local clones of both the `mathworks/arrow` and `apache/arrow` repositories.
- Confirmed that the GitHub CI jobs pass.
- We are unsure how to locally qualify the changes to the `release` component, but the `release` tests pass in CI.
Future Directions
- Added Jira task to update the pull request merge script to work with both `master` and `main` (ARROW-17777).
Notes
Thank you @kevingurney for your help with this pull request!
https://issues.apache.org/jira/browse/ARROW-15691
This change is ready for review and ready to be considered for running the CI workflows that are awaiting approval. Thank you for your help on this!
Also @raulcd FYI
Thanks @pitrou for your code review, I've addressed your feedback in the latest commits.
@github-actions crossbow submit example-cpp-minimal-build-static
Revision: 374f5401f062d0a0247154e6833baf309cee9947
Submitted crossbow builds: ursacomputing/crossbow @ actions-05d3f08a75
| Task | Status |
|---|---|
| example-cpp-minimal-build-static | |
Thank you @lafiona !
Benchmark runs are scheduled for baseline = 21564cf3981a1da0662ccd495320a5127b693a1f and contender = 8861c0c8b2ac8193c5112e7b7c8ab0c2ea33a6ff. 8861c0c8b2ac8193c5112e7b7c8ab0c2ea33a6ff is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2
[Failed :arrow_down:0.0% :arrow_up:0.0%] test-mac-arm
[Finished :arrow_down:0.0% :arrow_up:0.0%] ursa-i9-9960x
[Finished :arrow_down:0.36% :arrow_up:0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 8861c0c8 ec2-t3-xlarge-us-east-2
[Failed] 8861c0c8 test-mac-arm
[Finished] 8861c0c8 ursa-i9-9960x
[Finished] 8861c0c8 ursa-thinkcentre-m75q
[Finished] 21564cf3 ec2-t3-xlarge-us-east-2
[Failed] 21564cf3 test-mac-arm
[Finished] 21564cf3 ursa-i9-9960x
[Finished] 21564cf3 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java