efcore icon indicating copy to clipboard operation
efcore copied to clipboard

Squash migrations

Open rowanmiller opened this issue 10 years ago • 98 comments

It would be good to have the ability to squash several migrations into a single file to help reduce the number of files in a project.

We probably want to keep track of the original list of migration names so that we can reason about this when targeting an existing database that the original migrations were applied to in their un-squashed form.

rowanmiller avatar May 12 '15 21:05 rowanmiller

@rowanmiller how will squash work? Merge Migrations into one or simply place them in a single file.

I think merging will be problematic when a target database has partial migrations but not all from a squash.

popcatalin81 avatar May 14 '15 12:05 popcatalin81

@popcatalin81 I suspect at first, it will simpy concatenate all the operations together into one migration. In the future, it may try and simplify the operations (e.g. renaming A -> B -> C will become just A -> C)

Correct, "rewriting history" is always a bad idea. Before squashing, you'll have to revert all the migrations you want to squash, squash them, then re-apply the new one. You shouldn't do it if the migrations have been applied on any database other than your local one.

This operation would be useful while developing a new feature. You could add all the migrations locally you want, but before merging your feature, you could squash them all down into a single migration.

bricelam avatar May 14 '15 16:05 bricelam

:+1: For this idea

TheMadKow avatar Dec 08 '15 14:12 TheMadKow

I just wanted to suggest that idea also.. the migrations folder gets quite large quite fast if the projects develops over time

markusvt avatar Feb 03 '16 14:02 markusvt

I was wondering if removing them all and creating an "initial migration" would be a better approach. In the end this is what I did recently. Of course the "initial migration" should be executed only on database creation. This will not only reduce the number of files in the project but it will also speed up the initial database creation if you recreate it multiple times e.g. for development and testing purposes. What do you guys think? @rowanmiller @bricelam

dario-hd avatar May 03 '17 19:05 dario-hd

I'm puzzled that this issue is so inactive. How are others solving the issue of the ever growing Migrations folder?

Could there be at least some best practice described in the documentation @AndriySvyryd ?

pgrm avatar Nov 15 '17 14:11 pgrm

@pgrm I briefly mentioned a strategy in the Migrations docs I'm adding...

bricelam avatar Nov 15 '17 16:11 bricelam

This is a common problem we run into every once in a while. It's quite simple to accomplish. If your database is already up to date just delete all the migration files and truncate the dbo.__EFMigrationsHistory table. Generate a new initial create migration and you have now squashed all your migrations. You lose any comments but that's minor if you're needing to do it.

replaysMike avatar Mar 29 '18 05:03 replaysMike

@replaysMike It will lost my custom migration operations. (For example, I set a custom default value for a new field.)

PMExtra avatar Apr 12 '19 02:04 PMExtra

@PMExtra that’s surprising since you’re basically creating a migration based on the current state of the database. Is the default value being applied at the db level, or code level when the entity is created?

replaysMike avatar Apr 14 '19 05:04 replaysMike

Is this still open or closed? We often end up with way too many migration files in VS solution explorer, to resolve this its a pain in the... we manually run the migrations on a clean database, take the schema and data inserts and extract those and literally create a new migration that becomes the new seed, this way we don’t loose history and don’t have to truncate the migration table. What would be really cool is if we had a command that would reverse engineer a seed migration from a given database , this would save us a couple hours every month or so. Thank you - let’s make this happen!!!!!

Btw database snapshots is a little overkill?? Might have some interesting application though??? I’m thinking integration testing?

deanvr avatar May 14 '20 02:05 deanvr

I usually manually "squash" migrations into new InitialCreate when a new system goes into production, removing all the develop time migrations as they usually aren't very helpful to reason about model changes.

But changes to the model after the system is in production are very important.

I don't think the squashed migration should just be a concatenation of the existing migrations, as one might add a column/table and another might remove it again, it's just noise. It should be a fresh InitialCreate migration. Nice, clean and ready to run on the production environment without any develop time noise.

But this is already possible today, with possible data lose on the existing DEV environments.

I'd welcome a squash feature to make this easier and avoid data lose.

snebjorn avatar Jul 14 '20 07:07 snebjorn

@snebjorn if your goal is to squash development-time migrations which haven't yet been applied to production, it's pretty easy to simply reset your Migrations folder and then generate a single new one, which would contain all the changes - you can do that before merging your change.

roji avatar Jul 14 '20 09:07 roji

Any plans to automate this in any release? The manual squashing is really frustrating. And at some point, Azure DevOps is failing due to huge migrations dll.

yakeer avatar Sep 17 '20 14:09 yakeer

@yakeer This issue is open and on the backlog, which means that we do intend to implement this in a future release. See release planning for more details.

ajcvickers avatar Sep 17 '20 19:09 ajcvickers

The latest VS release is extremely slow to compile migrations for example so this squashing might help. See https://developercommunity.visualstudio.com/content/problem/1253845/visual-studio-2019-version-1680-very-slow-to-build.html

clement911 avatar Nov 14 '20 10:11 clement911

@clement911 that issue should hopefully get resolved really soon on the VS side, so shouldn't require waiting for migration squashing in EF.

roji avatar Nov 14 '20 15:11 roji

@clement911 same issue here... going to attempt to manually squash my migrations now.

RCTycooner avatar Nov 30 '20 15:11 RCTycooner

@RCTycooner this was fixed in the latest VS release 16.8.2

clement911 avatar Nov 30 '20 15:11 clement911

@clement911 what's the status of issue for build pipelines? We're using hosted agents that should be getting the latest versions of everything, but still has this issue. Will update my VS to 16.8.2 and try a local build anyway

RCTycooner avatar Nov 30 '20 15:11 RCTycooner

Would squashing preserve any custom code that was added to the migration?

domagojmedo avatar Jan 13 '21 20:01 domagojmedo

I recently squashed a lot of migrations in our project (20+ person dev team). I hope this summary can help you or someone else with this need. See https://www.bokio.se/engineering-blog/how-to-squash-ef-core-migrations/ for more details + scripts we used.

The largest hurdle I found doing this is that the migrations snapshots are actually "corrupt" in EF Core under certain circumstances. If you need me to dive more deeply into this just ping me.

Extract from the blog post:

This can happen when your team create migrations in parallel on different branches a basic flow is something like this.

Version Parent is on the dev branch.
Version B is branch from Parent and adds a migration.
Version C is branch from Parent and adds a migration.
Version D is B & C merged back into dev.

If we look at the global snapshot: 
B contains Parent + B
C contains Parent + C
D contains Parent + B + C

This is great. The problem is that no migration snapshot contains Parent + B + C. 

So if you pick B or C as you snapshot you will lose some information. Most of the time this is not a huge issue but EF uses these migrations snapshots to decide specificity of columns and if that needs to update. So I suddenly saw issues where on column had varchar(MAX) in the new version of the database but varchar(50) in the old one.

The issue this caused for me when squashing migrations was that I never could find a migration with a good migration snapshot to base it on.

So instead I had to take one of two approaches:

  • Use source control to pull the global snapshot from the time you want to (not super easy to find)
  • Manually merge the migration snapshot between the problematic ones.

If you find a solid solution for this issue squashing would actually be decently easy to script.

MikaelEliasson avatar Jan 14 '21 08:01 MikaelEliasson

What I end up doing most of the time is delete the migrations and re add them to a fresh DB(almost identical to one of the articles mentioned in this thread, seeing how multiple people have a similar approach might be useful path to pursue) and then regenerate a single 'initial migration' script. One thing that would be extremely valuable in this situation would be to have a script that would do this for me(maybe have it optionally target an in memory provider for simplicity sake, since in 99% we're targeting a fresh DB instance) and insert a sort of a history breakpoint between different migrations in the Up()/Down() methods.

For example if we had 2 migrations before the squash, InitialMigration and AddCustomEntity, we'd have a migrationBuilder.InsertHistory("AddCustomEntity") in between those 2 migrations in the newly merged file and this would(optionally) trigger a transaction commit for previous migration.

This would be extremely useful, since the various deployments we cover usually have partial migrations applied due to various factors and we'll prolly never be able to bring them all to same verison.

Dasein732 avatar Feb 04 '21 17:02 Dasein732

Just an idea but would this work?

  1. Delete all migrations in the project, essentially reset
  2. Create just the init migration
  3. generate the idempotent SQL script
  4. in the database, remove everything in the MigraitionsHistory table
  5. Run just the last part of the SQL script, which should add the one and only migration info to the MigrationHistory table

If my understanding is correct, this should just work right?

worthy7 avatar May 12 '21 02:05 worthy7

@worthy7 That probably won't work as expected if there were custom logic added to the migrations.

https://github.com/dotnet/efcore/issues/2174#issuecomment-482414336 It will lost my custom migration operations. (For example, I set a custom default value for a new field.)

gojanpaolo avatar May 12 '21 03:05 gojanpaolo

Ah, of course. I knew I was missing something.

worthy7 avatar May 12 '21 06:05 worthy7

Just to leave a small update on my answer above. https://github.com/dotnet/efcore/issues/2174#issuecomment-760022208

We ran this for a second time. And saved 1.8 million LOC (73MB of source code). This time it took about 60 min work to squash all the migrations, validate that the db was identical + add the test below.

We did run into a tiny issue though. The prep migration we created last time was included in the new initial migration. And it doesn't work in that context (duplicate key). We just commented it out.

image

To prevent any team mate merging an earlier migration from a long lived branch we also added a small test that will block that in the PR tests.


        [Fact]
        public void NoMigrationShouldBeOlderThanSnapshot()
        {
            /**
             * When we squash changes we need to make sure no migrations before the squash is merged in.
             * 
             * If you have a migration like that you need to remove it on your branch and re-run the add-migration
             * so you get a newer timestamp. (And please merge newer changes into your branch first too) 
             */

            var currentActiveSnapshot = "20210826073610_Squash2_prep";

            TC.New();
            using (var context = DB.NewUnsafe())
            {
                var migrationsAssembly = context.GetService<IMigrationsAssembly>();

                var earlierMigrations = migrationsAssembly.Migrations.Where(x => x.Key.CompareTo(currentActiveSnapshot) < 0).Select(x => x.Key).ToList();
                earlierMigrations.ShouldBeEmpty("There are migrations that are older than latest squash.");
            }
        }

MikaelEliasson avatar Nov 15 '21 13:11 MikaelEliasson

image

voroninp avatar Jul 27 '22 08:07 voroninp

Even a partial support witch would cover only the simple cases where there is no custom code required for the initial initialization would be useful. Like an automation of the workflow described by worthy7. Giving us no automated solution even for simple cases make it always error prone, and make it by itself a defect in EF Core IMO.

begerard avatar Aug 22 '22 18:08 begerard

Just to be sure everyone is on the same page, especially for the simple cases with no custom migration code, there's a very easy workaround documented here; this is obviously not as perfect as a built-in squash feature, but the fact that a good workaround exists makes this less urgent.

The main problem IMHO is what happens when there is custom migration code, which gets lost if you simply delete the Migrations folder. In other words, I think there's very little value in implementing a "partial" squash feature which doesn't handle the custom code scenario.

In any case, this is feature which we definitely intend to implement at some point; it's simply a matter of competing priorities compared to other issues, which may not have a workaround as above. It's also pretty highly-voted, so there's a good chance we'll get to it sooner rather than later.

roji avatar Aug 22 '22 22:08 roji