sqlite-simple icon indicating copy to clipboard operation
sqlite-simple copied to clipboard

Incorrect implementation of `withTransaction`?

Open mvoidex opened this issue 7 years ago • 12 comments
trafficstars

mask $ \restore -> do
    begin
    r <- restore action `onException` rollback
    commit
    return r

https://github.com/nurpax/sqlite-simple/blob/c2f216aee5f41964e1d5018bdb2c73a3c8de3d2c/Database/SQLite/Simple.hs#L499

The problem is that commit can throw (synchronous) exception itself (ErrorBusy and ErrorLocked) leaving us inside transaction. But at least in case of ErrorBusy it can be retried.

mask $ \restore → do
    begin
    (restore act <* commit) `onException` rollback

mvoidex avatar Jan 03 '18 21:01 mvoidex

I'm running into this as well. Also, AFAIK the recommended way to deal with SQLITE_BUSY in multithreaded applications is to retry. How about adding new combinators withConnectionRetry, withTransactionRetry, etc. that deal with SQLITE_BUSY using an exponential backoff strategy?

I have some code ready to go if the project thinks this is a good idea. =^D

GambolingPangolin avatar Apr 25 '19 18:04 GambolingPangolin

ouch, this looks important for humane db engineering

is there an active maintainer atm. And or how can perhaps haskell trustees or clc and friends help out? cc @sclv @bodigrim @emilypi

cartazio avatar Feb 26 '20 22:02 cartazio

@cartazio I'm active maintainer atm. I'll get to it in next release, handling 8.8 support now.

sigrlami avatar Feb 27 '20 11:02 sigrlami

Ok cool. Do let us know how we can help though. Sqlite simple is a systematically important package and it always helps to bounce ideas off other folks for improving engineering :)

On Thu, Feb 27, 2020 at 6:27 AM Sergey Bushnyak [email protected] wrote:

@cartazio https://github.com/cartazio I'm active maintainer atm. I'll get to it in next release, handling 8.8 support now.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nurpax/sqlite-simple/issues/57?email_source=notifications&email_token=AAABBQXB7WPYOPEII5P7RQLRE6PSZA5CNFSM4EKJBVQKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOENEAH2I#issuecomment-591922153, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAABBQR7MJTCEQXCOM2Z74LRE6PSZANCNFSM4EKJBVQA .

cartazio avatar Feb 27 '20 13:02 cartazio

I'm running into this issue when withTransaction throws ErrorBusy.

I've tried implementing retrying (see code below), but this results in the error SQLite3 returned ErrorMisuse while attempting to perform prepare "ROLLBACK TRANSACTION": bad parameter or other API misuse.

Retry code:

{-# LANGUAGE ScopedTypeVariables #-}

import qualified Database.SQLite.Simple as Sqlite
import Control.Concurrent (threadDelay)
import Control.Exception (catch, throwIO)


withTransactionRetry :: forall a. Sqlite.Connection -> IO a -> IO a
withTransactionRetry conn io =
    go 0
  where
    go attempt = Sqlite.withTransaction conn io `catch` catchSqliteError attempt
    catchSqliteError :: Int -> Sqlite.SQLError -> IO a
    catchSqliteError attempt err
        | Sqlite.sqlError err == Sqlite.ErrorBusy = do
            putStrLn $ "Database busy. Retrying in " ++ show pauseSeconds ++ "s."
            threadDelay (pauseSeconds * 1000000)
            if attempt < numMaxAttempts
                then go (attempt+1)
                else throwIO err
        | otherwise = throwIO err

pauseSeconds = 1
numMaxAttempts = 20

runeksvendsen avatar Oct 07 '20 14:10 runeksvendsen

I'm seeing a wide range of errors running the above code. Here are the errors I've seen so far (one error per line). Line number two seems most worrisome as it looks like some sort of pointer bug.

SQLite3 returned ErrorBusy while attempting to perform close: not an error
zddrctzmsqltzm2zi3zi26zm7260f045zdDatabaseziSQLite3ziBindingszddrctzzmsqltzzm2zzi3zzi26zzm7260f045zuDatabasezziSQLite3zziBindingszumkCTraceCallback: interrupted
SQLite3 returned ErrorError while attempting to perform step: cannot start a transaction within a transaction

runeksvendsen avatar Oct 07 '20 15:10 runeksvendsen

@runeksvendsen what version of sqlite3 used?

sigrlami avatar Oct 07 '20 15:10 sigrlami

I'm not certain how to tell which version is used. I'm on MacOS (version 10.14.6 (18G6032)), and I assume /usr/lib/libsqlite3.dylib is used, although I also seem to have an sqlite version installed by Homebrew in /usr/local/opt/sqlite/lib/libsqlite3.0.dylib. Here's some info on /usr/lib/libsqlite3.dylib:

$ otool -L /usr/lib/libsqlite3.dylib
/usr/lib/libsqlite3.dylib:
	/usr/lib/libsqlite3.dylib (compatibility version 9.0.0, current version 274.26.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)

and for /usr/local/opt/sqlite/lib/libsqlite3.0.dylib:

$ otool -L /usr/local/opt/sqlite/lib/libsqlite3.0.dylib
/usr/local/opt/sqlite/lib/libsqlite3.0.dylib:
	/usr/local/opt/sqlite/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)

Let me know if you need more information.

runeksvendsen avatar Oct 07 '20 17:10 runeksvendsen

@runeksvendsen ok, I'll check on Mac additionally

sigrlami avatar Oct 07 '20 18:10 sigrlami

@runeksvendsen the reason you're seeing "cannot start a transaction within a transaction" is because of the nature of this issue -- when the commit at the end of withTransaction throws an exception, you end up with the transaction left open on that connection. Then a subsequent retry tries to do a BEGIN on same connection and it blows up.

You can't fix this by wrapping withTransaction in retries, a fix to withTransaction itself is needed.

I wrote a fixed version that looks like this (I actually used bracketOnError_ from the safe-exceptions package, but you get the idea). @mvoidex's solution at the top probably works too but I'm scared of mask/restore.

withSqliteTransaction :: SQLite.Connection -> IO a -> IO a
withSqliteTransaction conn action = do
  bracketOnError begin (const rollback) $ \_ -> do
    ret <- action
    commit
    return ret

  where
    begin = SQLite.execute_ conn "BEGIN TRANSACTION"
    commit = SQLite.execute_ conn "COMMIT TRANSACTION"
    rollback = SQLite.execute_ conn "ROLLBACK TRANSACTION"

In my application, I then wrap this in retry code similar to yours.

Any chance we could get some maintainer attention on this? It's a pretty major correctness bug IMO and easy to trigger when using concurrent connections.

thomasjm avatar Mar 22 '21 23:03 thomasjm

Bump. @sigrlami any progress on this? Again, do let us know if you'd like us to submit a patch.

runeksvendsen avatar Jan 24 '22 13:01 runeksvendsen

@runeksvendsen I'll let you know next couple of days, I have some time to work on this and do more testing.

sigrlami avatar Jan 24 '22 16:01 sigrlami