datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Enable merge queue in github to avoid commit confliction.

Open liurenjie1024 opened this issue 2 years ago • 10 comments

Describe the bug

See https://github.com/apache/arrow-datafusion/pull/6875 for details. We can enable merge queue to avoid such kind of conflicts. You can find out details here: https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue

To Reproduce

No response

Expected behavior

No response

Additional context

No response

liurenjie1024 avatar Jul 07 '23 10:07 liurenjie1024

This would be awesome to do -- thank you for the suggestion @liurenjie1024

The repository settings are controlled by https://github.com/apache/arrow-datafusion/blob/main/.asf.yaml

Perhaps someone can research how to change that file to enable the merge queue

alamb avatar Jul 10 '23 11:07 alamb

Cool, let me have a try.

liurenjie1024 avatar Jul 10 '23 11:07 liurenjie1024

Oh, it seems that only apache committers can view the infra page: https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features and I can't help much. Anyway, merge queue is a great way to reduce merge conflicts and it would be helpful to enable that.

liurenjie1024 avatar Jul 12 '23 03:07 liurenjie1024

Seems that merge queue is not supported by apache git infra 😂, let's close this first.

liurenjie1024 avatar Jul 12 '23 10:07 liurenjie1024

😢

alamb avatar Jul 12 '23 16:07 alamb

I propose we re-open this since this is clearly a feature that we want, it's just blocked by ASF at the moment.

crepererum avatar Jan 18 '24 09:01 crepererum

The docs and the relevant management code are now openly available:

https://github.com/apache/infrastructure-asfyaml

So whoever wants to make this a reality for DataFusion can start there.

crepererum avatar May 16 '25 10:05 crepererum

Based on ASF Slack, I believe MQ aren't currently supported in .asf.yaml because there's no API support (https://github.com/orgs/community/discussions/50893)

blaginin avatar Jun 03 '25 12:06 blaginin

Based on ASF Slack, I believe MQ aren't currently supported in .asf.yaml because there's no API support (github.com/orgs/community/discussions/50893)

Following https://github.com/orgs/community/discussions/50893#discussioncomment-7792552 and https://github.com/orgs/community/discussions/77614#discussioncomment-10598205 , I think this MIGHT be outdated.

crepererum avatar Jun 12 '25 12:06 crepererum

Image

We ll need to make CI ready for the change and then open an email thread, will do that

blaginin avatar Aug 10 '25 15:08 blaginin

We ll need to make CI ready for the change and then open an email thread, will do that

I think the mailing list thread should basically say "we are discussing enabling the merge queue" and point at this ticket

alamb avatar Aug 11 '25 12:08 alamb

Mailing List note was posted: https://lists.apache.org/thread/2kxhfgv10vhqt3y63lxwcjtlhxc55c4y ✅

I'd like to test it first with a smaller repo (datafusion-sqlparser-rs) and move to the main one if everything works fine.

@iffyio, since this would impact you the most, being the most active maintainer of sqlparser-rs, would you be willing to try having merge queues enabled for that repo?

alamb avatar Aug 20 '25 10:08 alamb

@alamb yeah we can enable it on the sqlparser repo!

iffyio avatar Aug 20 '25 11:08 iffyio

Great infra team has given us temporary access to merge things bypassing the CI checks

Image

Hence we should be able to tests the queues in sqlparser-rs 🙂: https://github.com/apache/datafusion-sqlparser-rs/pull/2007

For transparency, here's the migration plan (https://issues.apache.org/jira/browse/INFRA-27154?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel)

Image

blaginin avatar Sep 01 '25 18:09 blaginin

From the mailing list

This should hopefully help resolve some of the problems we have now:

  • sometimes you want to approve a PR, but CI hasn't finished yet, so you have to wait up to 30 minutes before you can hit "merge"
  • the "merge" button may be available, but if the target branch was updated in the meantime, CI may then fail

The first should rarely be an issue in a distributed OSS projects. I've seen folks deliberately delaying merge by hours or days to give others a chance to chime in before the PR is merged.

How big of the problem is the second bullet?

Also, do we have any flaky tests? How will flakiness interact with merge queue?

not saying we should try it out. rather, trying to understand how important is the problem vs potential cost.

findepi avatar Sep 04 '25 07:09 findepi

the "merge" button may be available, but if the target branch was updated in the meantime, CI may then fail

This "may" happen, but TBH I've rarely seen this. On the positive side as a maintainer I can just hit the queue button and walk away without having to worry about semantic/test conflicts (i.e. the ones not detected by GIT) on the target branch. That's also why the merge queue is rather efficient by default because it can test multiple entries of the queue, assuming that the previous entries will pass CI. If that assumption fails, entries will be retested with the failed entry removed.

What I am wondering though: what happens to commits in PR that are added/pushed AFTER the PR was added to the queue? IIRC that removes the PR from the queue (as one might expect, because we don't want people to sneak in changes) but I cannot find any first-party sources for that behavior.

Also, do we have any flaky tests? How will flakiness interact with merge queue?

I'm not aware of any and I think we should be rather strict with them if they come up. Flaky tests just act like normal test failures.

crepererum avatar Sep 04 '25 08:09 crepererum

@blaginin has made a PR to SQLParser to test out how it works:

  • https://github.com/apache/datafusion-sqlparser-rs/pull/2007

alamb avatar Sep 04 '25 11:09 alamb

Merge Queue is working in SQLParser 🥳

Image Image

The plan is to test it for a few days, merge some PRs, and then switch to Datafusion if contributors will be happy with the experience

blaginin avatar Sep 05 '25 07:09 blaginin

Marked some steps as required in the main repo and asked ASF to enable merge queues for us here as well 🙂

blaginin avatar Sep 10 '25 22:09 blaginin

so exciting!

alamb avatar Sep 10 '25 23:09 alamb

MQ just got enabled, I believe that should put the required checks in place https://github.com/apache/datafusion/pull/17538

blaginin avatar Sep 17 '25 02:09 blaginin

Thank you for pushing this along @blaginin (and debugging the issues that arise)

alamb avatar Sep 17 '25 17:09 alamb

Thank you! After https://github.com/apache/datafusion/pull/17629 is merged, I'll go through all the PRs and merge the once that got approved and we wanted to merge them but couldn't because of the check

blaginin avatar Sep 17 '25 17:09 blaginin

Also, I think in the future we need to create a [private?] sandbox project for experiments within the ASF GitHub org. I created my own, but apparently there are organization-level rules that are impossible to test if you just fork the repo :(

blaginin avatar Sep 17 '25 17:09 blaginin

yeah, I agree testing stuff on github is tough

I think we can create our own datafusion repos within the apache org via some self service API -- how about apache/datafusion-sandbox ?

https://infra.apache.org/version-control.html#create

alamb avatar Sep 17 '25 18:09 alamb

datafusion-sandbox is great! also i guess we should make it private not to confuse potential users

Image

I'm not sure I have permissions to create that, do you?

blaginin avatar Sep 17 '25 18:09 blaginin

I'm not sure I have permissions to create that, do you?

Yes, I think the PMC chair needs to request it. I will do so.

I think it would be better if it is public with a clear README

alamb avatar Sep 17 '25 18:09 alamb

https://github.com/apache/datafusion-sandbox is now available

alamb avatar Sep 17 '25 18:09 alamb

Thank you! My plan is to fork the main project there, but instead of readme put a big warning that you should never use this project, this is our CI sandbox

blaginin avatar Sep 17 '25 19:09 blaginin