When a migration crashes, you can't roll back

Open crazynds opened this issue 1 month ago • 6 comments

Bug description

I ran into a problem: I forgot to mark a column as nullable when I changed it. For the first two tenants the migration worked fine because they didn't have any rows with null in that column, but the third one crashed. Rolling back afterwards was a pain.

I think it should automatically roll back all tenants to which the migration was applied, or make the rollback easier to do.

Steps to reproduce

Just create a migration that works for the first tenant and crashes on the next one.

Expected behavior

It should roll back only the tenants that were migrated.

Laravel version

12

stancl/tenancy version

latest

crazynds avatar Nov 26 '25 20:11 crazynds

Rolling back all tenants doesn't sound like a good default, migrations can fail for multiple reasons especially in multi-tenant contexts. You can use php artisan tenants:rollback --tenants=... to roll back individual tenants. What feature would you like us to add exactly?

stancl avatar Nov 26 '25 20:11 stancl

The problem is that when I run the migration and an error occurs after 3 tenants have migrated successfully, I end up with tenants in different states. Maybe an automatic rollback is not feasible, but if I run php artisan tenants:rollback, it should only roll back the tenants for which the migration actually ran successfully.

With a small number of tenants, the --tenants=... flag could be used, but if the error happens halfway through the tenants, the time lost would be bad.

I think something like php artisan tenants:rollback --sync, which checks the current migration state of each tenant and only rolls back the tenants that have the highest migration, would be helpful.
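
A minimal sketch of the selection step behind this idea, in plain PHP. To be clear, `--sync` does not exist in stancl/tenancy, and `tenantsWithMigration()`, the tenant IDs, and the migration names below are all hypothetical. The input array stands in for whatever you would read from each tenant's `migrations` table.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: return the IDs of tenants whose list of applied
 * migrations contains the given migration name.
 *
 * @param array<string, list<string>> $appliedByTenant tenant ID => applied migration names
 * @return list<string>
 */
function tenantsWithMigration(array $appliedByTenant, string $migration): array
{
    $matching = [];

    foreach ($appliedByTenant as $tenantId => $applied) {
        if (in_array($migration, $applied, true)) {
            // Cast because PHP turns numeric string keys into integers.
            $matching[] = (string) $tenantId;
        }
    }

    return $matching;
}

// Made-up state: the new migration ran on two tenants and crashed on the third.
$appliedByTenant = [
    'tenant-a' => ['2025_01_01_000000_create_orders_table', '2025_11_26_000000_widen_notes_column'],
    'tenant-b' => ['2025_01_01_000000_create_orders_table', '2025_11_26_000000_widen_notes_column'],
    'tenant-c' => ['2025_01_01_000000_create_orders_table'],
];

$toRollBack = tenantsWithMigration($appliedByTenant, '2025_11_26_000000_widen_notes_column');
print_r($toRollBack);
```

Only `tenant-a` and `tenant-b` are selected here, so a rollback limited to them would not touch the older migration on `tenant-c`.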

crazynds avatar Nov 26 '25 21:11 crazynds

A sync option sounds nice from the user's perspective but could be error-prone and dangerous to implement. Rollback is too risky to do automatically, particularly the code that determines which tenants to roll back: what if there's already some data and we'd end up dropping a table? I'm also trying to think of cases where data, rather than infrastructure issues such as connections to some tenant DBs failing but not others, could lead to differing behavior between tenants when running migrations.

One option I see here is to just notice that a migration in any tenant failed (say the third one and every tenant after that, meaning it only worked in the first two) and generate a command for the user that would roll back the migration in the tenants where it succeeded so they're all consistent. But that also seems risky since I believe in Laravel migrations aren't necessarily atomic and may fail halfway through, therefore you wouldn't be getting things consistent necessarily — only one of the steps of the migration may have failed and then the idea of running a rollback in other tenants to get things consistent wouldn't hold since some tenants could still be partially migrated.
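
A rough sketch of what generating such a command could look like, in plain PHP. `buildRollbackCommand()` and the tenant IDs are hypothetical and nothing like this ships with the package. It also assumes `--tenants` can be repeated once per tenant, which is worth checking against `php artisan tenants:rollback --help` before relying on it.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: build a tenants:rollback command limited to the
 * tenants where the migration succeeded.
 *
 * @param list<string> $tenantIds
 */
function buildRollbackCommand(array $tenantIds): string
{
    if ($tenantIds === []) {
        throw new InvalidArgumentException('No tenants to roll back.');
    }

    // One --tenants flag per tenant (assumed to be repeatable), shell-escaped.
    $flags = array_map(
        static fn ($id): string => '--tenants=' . escapeshellarg((string) $id),
        $tenantIds
    );

    return 'php artisan tenants:rollback ' . implode(' ', $flags);
}

// Made-up IDs of the tenants where the migration ran before the failure.
$command = buildRollbackCommand(['tenant-a', 'tenant-b']);
echo $command, PHP_EOL;
```

The command is only printed, not executed, so the user still reviews it first. The caveat above still applies: a tenant where the migration failed partway through may be left in a state that rolling back the others does not reproduce.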

It's also worth noting that the migration process stops on first error unless --skip-failing is passed. This is true even with parallel migrations though there's no communication between the processes so each process will run for as long as it doesn't reach a tenant where the migration fails.

Another consideration is that many people simply prefer only migrating forward partly because of the things I outlined here — a failing migration (in some tenants) isn't always a cleanly, automatically recoverable state. You may need to take a closer look at what exactly happened and then you can see how to go forward from there.

I think for these reasons I'd prefer that when people encounter failing migrations for some tenants, they simply edit the migration (that way it will only run on tenants for which it didn't fully succeed) as needed to address the specific issue that was encountered.

The main thing I can see us providing here is tenant-specific maintenance mode "wrapping" the migration process in this command such that if your migrations fail for say 3/500 tenants, the 3 tenants will not be able to access the app while you work on a fix but the remaining 497 tenants can use the app as usual.

stancl avatar Nov 27 '25 02:11 stancl

In my case, I really do need to roll back:

  • I have a string column of size X, and I created a migration to increase its size to Y, but I didn't add the nullable property. So when I got the error, some of the tenants (the ones where the migration didn't throw an error) had the column as not nullable. If I fixed the migration and ran it again, it would skip the tenants where it had already run, leaving them different from the others.

I found a workaround: rename the migration so it runs again on all tenants.

I was thinking of something that just gets the tenants synchronized again, even if I have to run the command at my own risk. In this state, if I run the rollback command, it rolls back the 3 tenants to the state before that migration, but for all the other tenants it rolls back an older migration, with the potential risk of losing data.

There's another scenario where this would matter: what if the migration had been applied to 497 tenants instead of 3, and I needed to roll back all of them? The time the application would be offline would be disastrous.

I know this command wouldn't be a lifesaver in every scenario and would involve several risks, but I believe that simply having a command that rolls back only the highest migration (if present in the tenant) could be the solution for some very specific cases.

crazynds avatar Nov 27 '25 03:11 crazynds

Isn't it easier to make a new migration to correct the error? You can run it on all tenants, existing and new ones.

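use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    // Make the column nullable on every tenant.
    public function up(): void
    {
        Schema::table('table_name', function (Blueprint $table) {
            $table->string('column_name')->nullable()->change();
        });
    }

    // Revert the column to NOT NULL.
    public function down(): void
    {
        Schema::table('table_name', function (Blueprint $table) {
            $table->string('column_name')->nullable(false)->change();
        });
    }
};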

erikscheepers avatar Dec 04 '25 20:12 erikscheepers

No, because this migration would not run if the one before it crashed.

Maybe I could delete the old migration and let only the new one run, but the old migration would still be recorded in the migrations table in some tenants' databases.

crazynds avatar Dec 04 '25 22:12 crazynds