etl
Feature request: Scheduled PR merges
Sometimes, it would be nice to schedule large PRs to merge overnight when no one is working.
PRs that involve large datasets can block the ETL for a few hours, which is a problem for others trying to work.
It looks like there are some ways to do it, described here.
+1! And +100 for finding the tool to do it. I thought we'd have to do it ourselves.
We should make sure that it only merges if CI is ✅. Someone could change a chart in prod in the meantime, and chart-diff would then start failing.
Looks like Lucas already merged something here:
- #1882
We should check that, and maybe replace it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
ETL has become much faster and more robust since we opened this issue. It should be pretty safe to merge in the middle of the day now. It should take less than an hour.
@lucasrodes there are more ETL performance improvements on the way. I think we can get to a point where merging anytime is not a big deal, and we wouldn't need this. (Rebuilding full ETL takes ~1 hour now and could be sped up by using PREFER_DOWNLOAD=1 if you have a good connection)
yeah, that makes total sense! thanks for doing that
I would still like to explore this whenever I have some time, and have it be an option.
but based on earlier discussions and all the performance improvements, this is a nice-to-have, and I shouldn't stress much about it :p
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This should already be possible with the merge-schedule-action action, which I added to the project in https://github.com/owid/etl/pull/1882.
I tried it at https://github.com/owid/etl/pull/3563. I've added a section to our docs explaining this.
Overview of how it works:
- Create the PR
- Work on it
- Set it as 'ready to review'
- To set the schedule for 20th November 2024, add to the PR description:
    /schedule 2024-11-20
  One can go more granular, with specific hours (UTC):
    /schedule 2024-11-20T09:00:00.000Z
  Or set the merge for the next scheduled run:
    /schedule
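To make the three forms of the directive concrete, here is a minimal Python sketch of how a PR description could be checked for a `/schedule` line. This is not the parser used by merge-schedule-action; `parse_schedule` and `SCHEDULE_RE` are hypothetical names, and treating a date-only value as midnight UTC is an assumption made for the example.

```python
import re
from datetime import datetime, timezone
from typing import Optional, Tuple

# Hypothetical pattern: a line that is exactly "/schedule", optionally
# followed by whitespace and a single date or timestamp token.
SCHEDULE_RE = re.compile(r"^/schedule(?:[ \t]+(\S+))?[ \t\r]*$", re.MULTILINE)


def parse_schedule(description: str) -> Tuple[bool, Optional[datetime]]:
    """Return (found, when) for the first /schedule line in a PR description.

    `when` is None for a bare "/schedule", meaning "merge at the next run".
    """
    match = SCHEDULE_RE.search(description)
    if match is None:
        return False, None
    value = match.group(1)
    if value is None:
        return True, None
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    when = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if when.tzinfo is None:
        # Assumption for this sketch: date-only values mean midnight UTC.
        when = when.replace(tzinfo=timezone.utc)
    return True, when
```

For example, `parse_schedule("Big update\n/schedule 2024-11-20")` reports a directive with a timestamp of midnight UTC on 20th November 2024, while a description without a `/schedule` line returns `(False, None)`.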
Note: The scheduler runs every 30 minutes (on the hour and at half past). This means that a PR won't be merged at the exact scheduled time, but at the first scheduler run after it. Each run therefore catches up on any PRs whose scheduled time fell between runs.
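To estimate when a scheduled PR would actually merge, the 30-minute grid can be sketched in Python as below. `next_scheduler_run` is a hypothetical helper, not part of the action. It assumes the interval divides 60 evenly and that a time falling exactly on a run is picked up by that run. Real scheduled workflows on GitHub can also start later than their nominal time, so treat the result as a lower bound.

```python
from datetime import datetime, timedelta, timezone


def next_scheduler_run(scheduled: datetime, interval_minutes: int = 30) -> datetime:
    """Return the first scheduler run at or after `scheduled` (UTC).

    `scheduled` must be timezone-aware; `interval_minutes` must divide 60.
    """
    scheduled = scheduled.astimezone(timezone.utc)
    # Round down to the most recent run on the interval grid.
    floor = scheduled.replace(
        minute=scheduled.minute - scheduled.minute % interval_minutes,
        second=0,
        microsecond=0,
    )
    if floor == scheduled:
        # Assumption: a PR scheduled exactly on a run time merges in that run.
        return scheduled
    return floor + timedelta(minutes=interval_minutes)
```

Under these assumptions, a PR scheduled for 09:10 UTC would be picked up by the 09:30 run, and one scheduled for 23:50 UTC by the 00:00 run on the following day.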