dbal icon indicating copy to clipboard operation
dbal copied to clipboard

MySQL + deadlock leads to `SAVEPOINT DOCTRINE_x does not exist`

Open nikophil opened this issue 1 year ago • 25 comments

Bug Report

Q A
Version any

Summary

Hello,

this issue is a follow up of https://github.com/doctrine/orm/issues/11230: in MySQL, when a deadlock occurs because of concurrency, the system tries to rollback, and then an error "SAVEPOINT DOCTRINE_x does not exist" occurs.

At first, the savepoint error did hide the original deadlock error. This has been fixed thanks to https://github.com/doctrine/orm/pull/11646 - and this will soon be fixed as well in symfony/messenger.

Nevertheless, https://github.com/doctrine/orm/issues/11230 has been closed, but the original problem is not fixed, and the rollback still fails, which results in closing the EM, and breaking the worker.

Since the dbal is the only one aware of savepoints, I think this should be fixed here.

Current behavior

When a deadlock occur, within a transaction with savepoints in MySQL, the rollback fails with SAVEPOINT DOCTRINE_x does not exist

Expected behavior

No error on roll back :sweat_smile:

From MySQL doc:

When deadlock detection is enabled (the default) and a deadlock does occur, InnoDB detects the condition and rolls back one of the transactions (the victim).

So maybe when a deadlock occurs, DBAL should be resilient when the rollback fails? Or it should check the existence of the given savepoint/transaction before rolling back?

nikophil avatar Dec 18 '24 13:12 nikophil

I am not sure how related this issue is; but I experience the following. I get the same error related to the doctrine:migrations:migrate command.

I am using symfony with doctrine. I have a symfony command that prepares the application. My installation command is pretty simple. I run: doctrine:migrations:migrate --no-interaction and in the same command I load an entity; persist and flush. This gives me the similair error: 1305 SAVEPOINT DOCTRINE_2 does not exist. I use Doctrine\ORM\EntityManagerInterface with dependency injection as variable entityManager. The initial database is empty.

$this->getApplication()->setAutoExit(false);
$this->getApplication()->run(new StringInput('doctrine:migrations:migrate --no-interaction'), $output);

$post = new Post();
$post->setTitle('this is a title');

$this->entityManager->persist($policy);
$this->entityManager->flush();

If I put $this->entityManager->getConnection()->isTransactionActive() after the migrations command, this returns true.

If I run the command twice, there is no error due to the fact that the migrations are already executed and up to date. Even if I add multiple new entities, persist and flush them there are no issues if I run the migrations first.

This issue only occurs after I updated from doctrine/dbal: 3.* to doctrine/dbal: 4.*.

I have create an repo to reproduce this issue: https://github.com/petermanders89/dbal-issue. This is repo created with symfony new my_project_directory --webapp. I modified it slightly.

When running the install command; in this case with symfony console app:install it'll throw this same error.

petermanders89 avatar Dec 20 '24 13:12 petermanders89

hey @petermanders89

if using MySQL, you should try to declare your migrations as "non transactional". In MySQL, "ALTER TABLE" statements, and all other statements which modify the db structure implicitly closes the transaction. Connection::isTransactionActive() does not check the DB to see if a transaction is actually active.

At the end, when doctrine commits the changes, it tries to commit a non-existent transaction, and your migration fails :boom: I think you're facing this problem with migration to dbal 4, because dbal 4 forces usage of savepoints, but I think it's actually not the same problem than the one I'm talking about.

nikophil avatar Dec 20 '24 13:12 nikophil

Hey @nikophil . Thank you! That indeed solves all my issues related to this. Sorry to mess up your issue.

petermanders89 avatar Dec 20 '24 13:12 petermanders89

@nikophil please provide a sample script or other instructions that will reproduce your issue. Ideally, it should use only DBAL as a dependency.

morozov avatar Dec 22 '24 20:12 morozov

Hi @morozov

here it is: https://github.com/nikophil/dbal-repro

you can run it by:

  • set the DSN of a MySQL connection in src/create-connection.php
  • run composer install
  • run php ./src/create-deadlock.php

I've somehow reproduced a concurrency problem by running twice src/update-line.php at the same time, using symfony/process

The concurrency creates a deadlock. On my machine, I have a deadlock, and an error on rollback every time I'm running the script.

The script outputs the error, basically:

PHP Fatal error:  Uncaught PDOException: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT DOCTRINE_2 does not exist in /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/PDO/Connection.php:27
Stack trace:
#0 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/PDO/Connection.php(27): PDO->exec()
#1 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(878): Doctrine\DBAL\Driver\PDO\Connection->exec()
#2 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1140): Doctrine\DBAL\Connection->executeStatement()
#3 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1080): Doctrine\DBAL\Connection->rollbackSavepoint()
#4 /home/nicolas/works/github.com/nikophil/dbal-repro/src/update-line.php(32): Doctrine\DBAL\Connection->rollBack()
#5 {main}

Next Doctrine\DBAL\Driver\PDO\Exception: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT DOCTRINE_2 does not exist in /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/PDO/Exception.php:28
Stack trace:
#0 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/PDO/Connection.php(33): Doctrine\DBAL\Driver\PDO\Exception::new()
#1 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(878): Doctrine\DBAL\Driver\PDO\Connection->exec()
#2 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1140): Doctrine\DBAL\Connection->executeStatement()
#3 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1080): Doctrine\DBAL\Connection->rollbackSavepoint()
#4 /home/nicolas/works/github.com/nikophil/dbal-repro/src/update-line.php(32): Doctrine\DBAL\Connection->rollBack()
#5 {main}

Next Doctrine\DBAL\Exception\DriverException: An exception occurred while executing a query: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT DOCTRINE_2 does not exist in /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/API/MySQL/ExceptionConverter.php:91
Stack trace:
#0 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1373): Doctrine\DBAL\Driver\API\MySQL\ExceptionConverter->convert()
#1 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1315): Doctrine\DBAL\Connection->handleDriverException()
#2 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(880): Doctrine\DBAL\Connection->convertExceptionDuringQuery()
#3 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1140): Doctrine\DBAL\Connection->executeStatement()
#4 /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Connection.php(1080): Doctrine\DBAL\Connection->rollbackSavepoint()
#5 /home/nicolas/works/github.com/nikophil/dbal-repro/src/update-line.php(32): Doctrine\DBAL\Connection->rollBack()
#6 {main}
thrown in /home/nicolas/works/github.com/nikophil/dbal-repro/vendor/doctrine/dbal/src/Driver/API/MySQL/ExceptionConverter.php on line 91

as you can see, the rollback throws an error, and the problem occurs for both dbal 3 et 4

nikophil avatar Dec 23 '24 14:12 nikophil

hey @morozov

by any chance, did you check the reproducer I gave?

Do you think we can fix this? I'd be willing to provide a PR, but I'd prefer that we agree on a solution first.

thanks 🙏

nikophil avatar Jan 21 '25 15:01 nikophil

@nikophil thanks for the reproducer. I was able to run it and observe the same behavior as described in the issue.

The question is, why do you believe this is a bug? In my understanding, a nested transaction attempts to roll back to a savepoint. This savepoint no longer exists because the whole transaction in the session where it was previously created was rolled back as a victim of a deadlock. So the attempt fails, which is reported via an exception. This looks correct.

If it doesn't matter to your business logic whether the transaction was rolled back to the savepoint or fully rolled back, why don't you catch this exception and do nothing?

morozov avatar Jan 21 '25 16:01 morozov

In my understanding, the underlying error (the deadlock) is hidden by the savepoint error, so it's quite hard to reason about the problem.

nikophil avatar Jan 21 '25 16:01 nikophil

So, if you're saying that "if a deadlock occurs in a nested transaction, a DeadlockException should be thrown to the user-land", then I think I agree.

morozov avatar Jan 21 '25 17:01 morozov

Yeah that's were I'd like to go :)

But what should we do if a save point error occurs? My opinion would be to totally ignore the savepoint error. The whole transaction is closed anyway 🤷‍♂️

But as far as I know this is only a mysql problem 🤔

nikophil avatar Jan 21 '25 17:01 nikophil

If a savepoint error occurs due to a deadlock, it should throw a deadlock exception. From the API consumer's standpoint, there are no savepoints, there are nested transactions, and we need to tell all consumers that started a transaction at any level that their transaction was rolled back due to a deadlock.

morozov avatar Jan 21 '25 17:01 morozov

If a savepoint error occurs due to a deadlock

I'm wondering how this can be achieved, there is no way to know what was the reason why rollback() was called. And I'm pretty sure no parameter can be added to this method 😅

And always muting the error from rollbackSavepoint() does not seem the way to go neither...

any hints about this?

nikophil avatar Jan 21 '25 20:01 nikophil

And always muting the error from rollbackSavepoint() does not seem the way to go neither...

That's for sure.

any hints about this?

I'm not really familiar with transactions in DBAL, so I don't have any immediate hints, and there isn't necessary an easy solution to this. If I were to implement this, I'd start with a space-time diagram (like this) and document the current behavior. The actors there would be:

  1. The component that starts the top-level transaction in the victim thread.
  2. The component that starts the nested transaction in the victim thread.
  3. The component that starts a transaction in the winning thread.

I'd documented the interaction between them in the form of method calls, returns and/or exceptions. Then it may become clear how the communication needs to change.

morozov avatar Jan 21 '25 21:01 morozov

well, it seems that it can't be fixed directly in the DBAL without a BC break 🤔 but maybe this can be mitigated in the callers.

In my case, the component which starts the transactions is symfony/messenger (through DoctrineTransactionMiddleware)

I think this could also be done in the ORM: here and here.

Those are the same places where https://github.com/doctrine/orm/issues/11230 and https://github.com/symfony/symfony/pull/59103 fixed the previous error where a rollback error did hide the original error.

any thoughts @greg0ire?

nikophil avatar Jan 22 '25 12:01 nikophil

@nikophil I'm not sure I understand the issue. Are you saying:

  1. there is a deadlock
  2. code inside the finally throws another exception about a savepoint that does not exist
  3. despite what I recently did, the deadlock exception is not visible as a "previous exception" of the exception mentioned in step 2. ?

greg0ire avatar Jan 22 '25 17:01 greg0ire

This issue also occurs when innodb_rollback_on_timeout is enabled

InnoDB rolls back only the last statement on a transaction timeout by default. If --innodb-rollback-on-timeout is specified, a transaction timeout causes InnoDB to abort and roll back the entire transaction.

When there is an error, doctrine tries to rollback: https://github.com/doctrine/orm/blob/aff82af7debf9c068c2e0bb4133e24a6635c070f/src/UnitOfWork.php#L447

                if ($conn->isTransactionActive()) {
                    $conn->rollBack();
                }

The function isTransactionActive looks like:

    public function isTransactionActive(): bool
    {
        return $this->transactionNestingLevel > 0;
    }

isTransactionActive() returns true but the transaction is not open (mysql closed it).

Then, when doctrine tries to rollback, the connection/driver throws the exception:

SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT DOCTRINE_2 does not exist

The exception thown is: Doctrine\DBAL\Driver\PDO\Exception with message SAVEPOINT DOCTRINE_2 does not exist

Instead of: Doctrine\DBAL\Exception\LockWaitTimeoutException

Should the function isTransactionActive check in the native driver if the transaction is still open?

JoniJnm avatar Feb 05 '25 13:02 JoniJnm

Should the function isTransactionActive check in the native driver if the transaction is still open?

yeah maybe this could be a nice alternative 👍

Another problem with this issue is that when an error SAVEPOINT DOCTRINE_2 does not exist occurs, is closes the entity manager

despite what I recently did, the deadlock exception is not visible as a "previous exception" of the exception mentioned in step 2.

@greg0ire based on my reproducer (see https://github.com/doctrine/dbal/issues/6651#issuecomment-2559815200) the DriverException containing the savepoint error does not contain the original exception in its previous 🤷

nikophil avatar Feb 11 '25 15:02 nikophil

Hi, I’m experiencing similar behavior:

  1. Connection::transactional() does a rollback on exception in the finally block. If a deadlock happens in the query inside the closure, the transaction is already rolled back, which leads to a "1305 SAVEPOINT <NAME> does not exist" exception.

  2. It’s possible to catch both exceptions on the client side, but unfortunately the Connection::transactionNestingLevel ends up with the wrong value, which makes the connection unusable.

As a workaround, I reset the field using reflection, but that’s not a long-term solution.

In https://github.com/doctrine/dbal/pull/6103 another approach was used to fix a similar issue — maybe that’s a direction to revisit. The only concern is that some databases don’t roll back the whole transaction on deadlock, so behavior might differ.

@nikophil wdyt?

h1k3r avatar May 07 '25 14:05 h1k3r

The only concern is that some databases don’t roll back the whole transaction on deadlock, so behavior might differ.

Is this actually true? Which database does not do that? Afaik the most common databases have either no deadlock detection at all or automatically rollback the transaction if a deadlock is detected.

windaishi avatar May 08 '25 09:05 windaishi

@windaishi mysql doesn't do that in all scenarios (deadlock, timeout). For example when innodb_rollback_on_timeout is disabled (by default is false)

JoniJnm avatar May 08 '25 09:05 JoniJnm

@JoniJnm Seems like the setting you linked only applies to lock wait timeouts. The problem in this issue is only related to deadlocks deadlocks. Afaik this are two different error cases with different exceptions.

windaishi avatar May 08 '25 09:05 windaishi

@windaishi sorry, I did not add "maybe".

As I see - DeadlockException is thrown in \Doctrine\DBAL\Driver\API\PostgreSQL\ExceptionConverter and \Doctrine\DBAL\Driver\API\MySQL\ExceptionConverter.

MySQL documentation states that

InnoDB detects the condition and rolls back one of the transactions (the victim)

I did not find documentation how PostgreSQL behaves in such case, so I can't exclude different behavior.

h1k3r avatar May 08 '25 09:05 h1k3r

@windaishi no, it's the same case: when there is an exception (deadlock or timeout) mysql itself can close the transaction but doctrine still continues managing it and the savepoints.

(see https://github.com/doctrine/dbal/issues/6651#issuecomment-2636919570)

JoniJnm avatar May 08 '25 09:05 JoniJnm

@JoniJnm Ah okay, I get it. The problem also exists for timeouts, but is hard to fix in this case, because it depends on the configuration, right?

But for deadlocks it can be fixed reliably when we are sure that all databases will rollback the transaction on a deadlock

@h1k3r I think this describes the behavior for PostgreSQL: https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-DEADLOCKS

PostgreSQL automatically detects deadlock situations and resolves them by aborting one of the transactions involved, allowing the other(s) to complete. (Exactly which transaction will be aborted is difficult to predict and should not be relied upon.)

windaishi avatar May 08 '25 10:05 windaishi