go-sqlite3 icon indicating copy to clipboard operation
go-sqlite3 copied to clipboard

Error "cannot start a transaction within a transaction" on cached connection

Open azavorotnii opened this issue 6 years ago • 3 comments

Assuming we run this piece of code:

			ctx, cancel := context.WithCancel(ctx)
			tx, err := db.BeginTx(ctx, nil)

it is possible to get at the same time context.Cancelled error and started transaction. This will happen only if we cancel context very fast.

This happens because context cancellation is inherently racy as performed not in same go-routine where statement is executed.

Proposed fix #765

azavorotnii avatar Dec 13 '19 22:12 azavorotnii

@azavorotnii If I understand your bug report correctly, this is not a bug with BeginTx specifically, but rather some faulty logic in (*SQLiteStmt).exec. https://github.com/mattn/go-sqlite3/blob/590d44c02bca83987d23f6eab75e6d0ddf95f644/sqlite3.go#L1893-L1905

The existing code erroneously concludes that if we called sqlite3_interrupt, then the pending operation will definitely fail so it doesn't bother looking at the result. This is flawed (1) because of the inherent raciness, and (2) because the SQLite docs themselves say that sqlite3_interrupt will not necessarily interrupt a pending operation. "If an SQL operation is very nearly finished at the time when sqlite3_interrupt() is called, then it might not have an opportunity to be interrupted and might continue to completion." https://www.sqlite.org/c3ref/interrupt.html

I believe the correct fix would be to not just discard what gets pulled from resultCh and instead only return ctx.Err() if we receive SQLITE_INTERRUPT from the actual operation.

I have also sent a question to the SQLite mailing list about the behavior when BEGIN gets interrupted to see if we need to explicitly roll back in that case or not.

ETA: The SQLite maintainers say that if you interrupt a call to BEGIN, the transaction is aborted. So calling ROLLBACK in this case is unnecessary (though harmless).

rittneje avatar Dec 16 '19 19:12 rittneje

I believe the correct fix would be to not just discard what gets pulled from resultCh and instead only return ctx.Err() if we receive SQLITE_INTERRUPT from the actual operation.

@rittneje that was my initial thought, but for some reason instead of SQLITE_INTERRUPT I got SQLITE_DONE. That could be race in my test setup though (not same test as in this PR). Let me check again.

azavorotnii avatar Dec 16 '19 20:12 azavorotnii

@rittneje done as you proposed. I did mistake in my initial test, now it should be stable.

azavorotnii avatar Dec 23 '19 20:12 azavorotnii