rails icon indicating copy to clipboard operation
rails copied to clipboard

Batch reset sequence

Open gmcgibbon opened this issue 2 months ago • 7 comments

Motivation / Background

This Pull Request has been created because the sequence resetting we use in fixture code is a little slow when you have a lot of tables. On Shopify, it currently takes multiple seconds. Some of this information can be derived from the schema cache, or assumed if we know the application is using default sequence names.

Detail

This Pull Request changes the postgres adapter by introducing reset_column_sequences! to batch reset column sequences. Due to the complex nature of the queries used to get this info, and the fact that most of this data can be found in the schema cache, I've opted to just expose a batching method with all values provided. Later on, we might want to change the fixture implementation to do this faster using this method. This is all nodoc anyway, so the risk here is relatively low.

Checklist

Before submitting the PR make sure the following are checked:

  • [x] This Pull Request is related to one change. Unrelated changes should be opened in separate PRs.
  • [x] Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • [x] Tests are added or updated if you fix a bug or add a feature.
  • [ ] CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

gmcgibbon avatar Oct 17 '25 22:10 gmcgibbon

Is it just postgres which is affected by this?

zzak avatar Oct 17 '25 23:10 zzak

Yes, mysql and sqlite don't have sequences, only postgres. We reset sequences when inserting fixtures to reset auto incrementing values so that we don't accidentally increment to an already exiting value. We reset sequences for postgres only by using respond_to?.

gmcgibbon avatar Oct 17 '25 23:10 gmcgibbon

It definitely sounds like there's a potential improvement here, but this doesn't look like a complete change in its current state.

matthewd avatar Oct 23 '25 19:10 matthewd

We need to batch pk_and_sequence_for and the min / max value queries. I'll work on that next, but it will take a little more effort to get right.

gmcgibbon avatar Oct 23 '25 19:10 gmcgibbon

I've added a proper batching solution, hooked it up to fixtures, and made sure it backfills data lazily. This should be much faster for apps with lots of tables. I will draw a quick benchmark.

gmcgibbon avatar Oct 25 '25 06:10 gmcgibbon

On an app with ~50 active records, you can see a slight difference: Before: Screenshot 2025-10-25 at 2 06 47 AM After: Screenshot 2025-10-25 at 2 06 38 AM This really only becomes an issue when you have more than 500 models. Then, the current approach can take a few seconds.

gmcgibbon avatar Oct 25 '25 07:10 gmcgibbon

@matthewd when you have a minute, can let me know if you would support making execute_batch nodoc public? It seems similar to the execute method, but it is currently private.

gmcgibbon avatar Dec 09 '25 08:12 gmcgibbon