arrow icon indicating copy to clipboard operation
arrow copied to clipboard

WIP: ARROW-15693: [Dev] Update crossbow templates to use master or main

Open kevingurney opened this issue 3 years ago • 23 comments

Note

  1. This pull request is a work in progress. Please do not merge yet.

Overview

This pull request:

  1. Removes hard-coded dependencies on "master" as the default branch name in the crossbow infrastructure and CI template files.

Implementation

  1. Add new default_branch property to Target class in core.py.
  2. Create a new jinja2 macro is_default_branch() which can be used to check if the current Git branch is the default branch.
  3. Replace hard-coded branch comparisons against "master" with is_default_branch() macro.
  4. Replace usages of hard-coded string "master" with "HEAD" or "-" in URLs.

Testing

  1. In Progress. wWll use @github-actions crossbow submit workflow for qualification

Future Directions

  1. Remove "master" from default_branch name property of Target class.
  2. Remove all remaining uses of "master" terminology in crossbow.

Notes

  1. Thank you to @lafiona for her help with this pull request!

kevingurney avatar Jul 29 '22 18:07 kevingurney

https://issues.apache.org/jira/browse/ARROW-15693

github-actions[bot] avatar Jul 29 '22 18:07 github-actions[bot]

:warning: Ticket has not been started in JIRA, please click 'Start Progress'.

github-actions[bot] avatar Jul 29 '22 18:07 github-actions[bot]

Looks good so far, thanks for working on this!

  • This pull request is a work in progress. Please do not merge yet.

  • We aren't using WIP in the pull request title because we need to run CI jobs for qualification.

You can use @github-actions crossbow submit with WIP in the title, that only stops the PR CI within apache/arrow from running. I would recommend converting to a draft PR so it is clear that this is not merge ready yet, this also has no influence on running crossbow jobs.

assignUser avatar Jul 29 '22 19:07 assignUser

@github-actions crossbow submit -g conan

kevingurney avatar Jul 29 '22 19:07 kevingurney

Revision: 233a27ef8307868eb527dbcbe210878ab1c13376

Submitted crossbow builds: ursacomputing/crossbow @ actions-d341aafc76

Task Status
conan-maximum Github Actions
conan-minimum Github Actions

github-actions[bot] avatar Jul 29 '22 19:07 github-actions[bot]

@github-actions crossbow submit -g conan

kevingurney avatar Jul 29 '22 20:07 kevingurney

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2763089048

github-actions[bot] avatar Jul 29 '22 20:07 github-actions[bot]

@github-actions crossbow submit -g conan

kevingurney avatar Jul 29 '22 20:07 kevingurney

Revision: f26aa7c209ff977aa7e50349b95e7d9075a1599c

Submitted crossbow builds: ursacomputing/crossbow @ actions-685cf2b9ab

Task Status
conan-maximum Github Actions
conan-minimum Github Actions

github-actions[bot] avatar Jul 29 '22 20:07 github-actions[bot]

@github-actions crossbow submit -g conan

kevingurney avatar Aug 03 '22 18:08 kevingurney

Revision: b00f18a0971fde6978b800d13eca6239b12aba6f

Submitted crossbow builds: ursacomputing/crossbow @ actions-c3c8bb519d

Task Status
conan-maximum Github Actions
conan-minimum Github Actions

github-actions[bot] avatar Aug 03 '22 18:08 github-actions[bot]

@github-actions crossbow submit -g conan

kevingurney avatar Aug 03 '22 18:08 kevingurney

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2791768663

github-actions[bot] avatar Aug 03 '22 18:08 github-actions[bot]

@github-actions crossbow submit -g conan

lafiona avatar Aug 04 '22 19:08 lafiona

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2799031030

github-actions[bot] avatar Aug 04 '22 19:08 github-actions[bot]

@github-actions crossbow submit -g conan

lafiona avatar Aug 04 '22 19:08 lafiona

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2799191626

github-actions[bot] avatar Aug 04 '22 19:08 github-actions[bot]

@github-actions crossbow submit -g conan

lafiona avatar Aug 04 '22 19:08 lafiona

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2799237968

github-actions[bot] avatar Aug 04 '22 19:08 github-actions[bot]

@github-actions crossbow submit -g conan

lafiona avatar Aug 04 '22 20:08 lafiona

Failed to render template `docker-tests/github.linux.yml` with UndefinedError: 'archery.crossbow.core.Target object' has no attribute 'default_branch'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2799292578

github-actions[bot] avatar Aug 04 '22 20:08 github-actions[bot]

An update: We were initially attempting to test the changes in this pull request by running crossbow jobs via the github-actions bot. However, we observed that the changes to core.py from this pull request were not reflected in the jobs that ran in ursacomputing/crossbow as the script is cloned from the default branch.

Therefore, to qualify the changes, I followed the instructions on this page to set up a queue repository.

I created the repository here.

Initially, I submitted a job from the feature branch, ARROW-15693, but saw that the resulting branch did not have a .workflows/crossbow.yml file and did not initiate any GitHub Actions jobs.

To verify that the repository was set up correctly, I tried submitting a job from the main branch of the fork that I am working on, mathworks/arrow, using archery:

$ archery crossbow submit -g conan

The output to the command window does not indicate there are any errors, but I am wondering if the params, branch, and commit fields of the output should have populated values at this point.

!Job
target: !Target
  head: bc6c4988691cf60ecac67542b2daa2ac19fde5d9
  email: [email protected]
  branch: main
  remote: https://github.com/mathworks/arrow.git
  github_repo: mathworks/arrow
  version: 8.0.0.dev1114
  no_rc_version: 8.0.0.dev1114
  no_rc_semver_version: 8.0.0-dev1114
tasks:
  conan-minimum: !Task
    ci: github
    template: docker-tests/github.linux.yml
    artifacts: []
    params:
      image: conan

    branch:
    commit:
  conan-maximum: !Task
    ci: github
    template: docker-tests/github.linux.yml
    artifacts: []
    params:
      flags: >-
        -e ARROW_CONAN_PARQUET=True
        -e ARROW_CONAN_WITH_BROTLI=True
        -e ARROW_CONAN_WITH_GLOG=True
        -e ARROW_CONAN_WITH_JEMALLOC=True
        -e ARROW_CONAN_WITH_LZ4=True
        -e ARROW_CONAN_WITH_SNAPPY=True
      image: conan

  ########################### Python Minimal ############################

    branch:
    commit:
params: {}
branch:
Pushed job identifier is: `build-3`

On GitHub, I notice that the only branch that is pushed to the remote is build-3, which does not contain .workflows/crossbow.yml. Neither branches build-3-github-conan-minimum nor build-3-github-conan-minimum are pushed to the remote, as I would have expected (those branches are locally created). This leads me to believe that my queue repository is not set up properly.

Lastly, the scheduled jobs (for example) that are run in the repository have been failing. The error indicates the token is missing; it seems like it might need to be a token associated with apache/arrow.

@assignUser If you happen to have any suggestions for debugging this further, I'd appreciate you sharing your insight. Thank you!

lafiona avatar Aug 11 '22 20:08 lafiona

Lastly, the scheduled jobs (for example) that are run in the repository have been failing. The error indicates the token is missing; it seems like it might need to be a token associated with apache/arrow.

These workflows checkout apache arrow and install archery from there so you have to change those workflows to use your fork & branch so your modified version of archery is used. Similarly for the crossbow repo that is checked out.

Additionally you will have to create a PAT with repo scope for your crossbow fork and add it as a repo secret under CROSSBOW_GITHUB_TOKEN as the normal GITHUB_TOKEN does not trigger workflows when you use it to commit stuff (to prevent infinite workflow recursion).

I assume when trying to use crossbow submit locally you have set the CROSSBOW_GITHUB_TOKEN envvar and have checked out your forks of crossbow and arrow? I am not sure why it would not submit, as it is fetching by default it can take a while, you could fetch your repo manually and then be explicit and try this: archery crossbow --queue-remote https://github.com/lafiona/crossbow.git submit --no-fetch --commit --push -g conan

Hope this helps 🤞

assignUser avatar Aug 12 '22 14:08 assignUser

@assignUser, our sincere apologies in advance for the inconvenience. Somehow, while trying to revert some changes we made for debugging purposes during qualification, we managed to get this pull request into a "Merged" state. Now, we can't seem to figure out how to get it back into the "Draft" state, so that we can push commits.

Are you aware of any way of resetting the pull request state? Or, should we just open a new pull request?

kevingurney avatar Aug 24 '22 15:08 kevingurney

hi! @kevingurney ! I didn't know it was possible to set a PR into merged state without a real merge happening.

I would recommend to open a new PR at this stage. Also, if you could help me understand how the PR got into merge state, there might be something with the workflow that isn't quite right as this should not happen. Thanks!

raulcd avatar Aug 24 '22 15:08 raulcd

@raulcd - Thank you so much for your help! Yes, this was quite surprising to us, as well! Per your recommendation, I will go ahead and open a new pull request. Sorry again for this unexpected behavior!

I believe the steps that led to this were something like the following:

  1. We were pushing commits "as usual" to mathworks/arrow:ARROW-15693. They were showing up in the Pull Request interface as expected.
  2. When we went to rebase our changes on top of the latest changes available in mathworks/arrow:main (using git rebase main), to our surprise, we ended up having 100+ commits that had diverged from upstream. This is when we realized something wasn't right. Unfortunately, we don't know exactly what we did along the way that led to this diverged state.
  3. To move forward, we wanted to find a way to get out of the diverged state, before having someone else review our final changes. To achieve this, we thought we could just "reset" our feature branch to the state of mathworks/arrow:main, and then re-apply our changes as a single commit using git apply <patch>.
  4. To "reset" our feature branch to the state of mathworks/arrow:main, we created another branch off of mathworks:arrow:main, and then set this local branch to track the upstream mathworks/arrow:ARROW-15693 branch. To do this, we used the command: git branch --set-upstream-to origin/ARROW-15693.
  5. Finally, to push the "reset" branch to GitHub, we ran git push origin HEAD:ARROW-15693 --force.

As soon as we completed step 6., the pull request went into a "Merged" State.

I hope that helps provide some insight into what happened here! Please don't hesitate to let me know if you need more information, or if anything above is unclear. I am happy to help debug this further if that would be useful.

kevingurney avatar Aug 24 '22 18:08 kevingurney

Thanks @kevingurney for the detailed explanation. It seems you pushed to the branch a commit and history that was already on master. As there were no divergences between your branch and master GitHub marked that as merged as it was already on master.

I am not entirely sure what would happen if you push new changes to that branch, maybe GitHub allows to reopen the PR but I do think creating a new PR is a good approach.

raulcd avatar Aug 25 '22 09:08 raulcd

That makes sense - I'll open another pull request. Thanks again, @raulcd!

kevingurney avatar Aug 25 '22 13:08 kevingurney

To close the loop - we've opened https://github.com/apache/arrow/pull/13975 as a follow up to this pull request.

kevingurney avatar Aug 25 '22 15:08 kevingurney