
Concurrency safe API that doesn't return stale data

Open · DXist opened this issue · 4 comments

Is your feature request related to a problem? Please describe.

The current API gives users either:

  • stale reads when operating at a transaction isolation level below serializable, or
  • serialization errors that add an extra handling burden on the application developer.

Describe the solution you'd like

The solution is a combination of the API itself and quickstart documentation for beginner database users.

Transaction isolation level

Stale reads are not possible at the "strong serializable" isolation level for distributed databases, or at "ANSI serializable" for single-node databases. Operating at lower isolation levels requires the application developer to deal with various anomalies and can result in security issues.

So the 'serializable' isolation level is a must for this feature's design goal, and the resulting decrease in transaction throughput is not a priority.

The quickstart documentation could show how to enable this level as the session default for each supported DBMS. For PostgreSQL, for example, it could ask the user to connect to the desired database as the desired user and issue:

ALTER USER current_user IN DATABASE current_catalog SET default_transaction_isolation = 'serializable';

No stale reads API

Transaction retries

At the serializable isolation level the API should handle transaction retries so the library user can focus on their own tasks. The retry logic could be configured through parameters of the "commit finalizer", allowing power users to tune the maximum number of retries and perhaps the retry strategy. The quickstart documentation for beginners should use the finalizer with its defaults: the chosen max_retries number and an exponential backoff with jitter strategy.

Example sleep duration between retries in milliseconds: sleep_ms = (2 ** retry) * 0.1 * (random.random() + 0.5).

A simpler "commit finalizer" that uses the default strategy could be named commit_with_max_retries.
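
A minimal Rust sketch of that default strategy, assuming the rand crate; the helper name backoff_duration is illustrative, not an existing sqlx API:

use std::time::Duration;

// Implements the formula above: sleep_ms = (2 ** retry) * 0.1 * (random() + 0.5),
// i.e. exponential backoff with a jitter multiplier uniform in [0.5, 1.5).
fn backoff_duration(retry: u32) -> Duration {
    let jitter = rand::random::<f64>() + 0.5; // rand::random::<f64>() is in [0, 1)
    let sleep_ms = (1u64 << retry) as f64 * 0.1 * jitter;
    Duration::from_secs_f64(sleep_ms / 1000.0)
}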

In the Postgres case there are several error codes that could be retried (a matching sketch follows the list):

  • 40000 - transaction_rollback - may happen during the statement preparation phase or, for example, during a large batched INSERT; I found a description here: https://github.com/postgres/postgres/blob/f972ec5c285c3bc50d81f8044e08cd636017ab68/src/backend/replication/logical/reorderbuffer.c#L2540

  • 40001 - serialization_failure - the target error of this feature

  • 40P01 - deadlock_detected - may happen on commit or after deadlock_timeout. It can be eliminated by following a consistent lock order; the library could log it as a warning and hint the user to review the transactions' statement order.
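
A sketch of matching these codes against sqlx's error type; the function name is illustrative, while DatabaseError::code is sqlx's existing accessor for the SQLSTATE:

// Decide whether a failed transaction attempt should be retried, based on
// the three Postgres SQLSTATE codes listed above.
fn is_retryable(err: &sqlx::Error) -> bool {
    match err {
        sqlx::Error::Database(db_err) => matches!(
            db_err.code().as_deref(),
            Some("40000") | Some("40001") | Some("40P01")
        ),
        _ => false,
    }
}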

Limiting user-run side effects

Since a transaction can be retried, it's not safe to run side effects or non-idempotent logic between the transaction queries. The API should therefore provide an interface that runs the transaction as a whole, with potential retries, and either returns the desired consistent state of the applied transaction or an SQLError (for example, an IO/TLS error or a constraint violation error).

How is it possible to limit side effects?

I'm thinking of an interface like TryStream combinators; let's call them "query combinators". The user glues transactional code together using a chain of query combinators. The call chain starts with a "transaction input capture" method that returns a structure implementing a chainable interface trait. The "commit finalizer" is attached to the last query combinator; it executes the constructed nested structure and handles retries.

Combinators accept a move closure as their final argument. There are combinators for both non-async and async closures.

Closures can run init logic on each transaction retry if needed; this logic can likewise be either non-async or async.

The transaction's input state is a single value or a tuple that is captured by value at the start of the call chain by the "transaction input capture" method. It can include initial dummy values for the transaction results; these dummies constrain the transaction output type.

The captured input is propagated through the call chain as the first parameter of each combinator. It is modified and returned by the transaction pipeline, similar to how TryStream::try_fold works.

The set of combinators mirrors the existing sqlx interface. The most generic one is a fetch-like query combinator that allows processing query results in a streaming fashion within a single combinator closure. A query_as::<T>-like combinator is useful for collecting the result into a specified type, while an execute-like one just lets the user check the query result, which usually carries the count of affected rows.

So the second combinator parameter depends on the combinator's type and represents the result or result stream of the previous query combinator in the chain.

Non-async closures return a Result. Async closures return a future that resolves to a Result.

The error type may be generic, allowing the user to interrupt the transaction by returning a custom error. The transaction error is an enum of either SQLError or the custom user error. A closure returns a different enum to indicate an error: either the custom error variant or an error requesting a manual rollback that finishes the transaction. The custom error itself can indicate a rollback of the whole transaction, with a special error variant carrying a savepoint name where savepoints are supported. Savepoints make transaction execution logic nonlinear; they could be implemented as special combinators that accept a nested subtransaction.

The Ok variant of the closure return type is a pair (accumulated_transaction_state, Option<Query>). The optionality allows conditional logic and makes the transaction pipeline structure dynamic.

With Some(query), the accumulated state is propagated further through the call chain. With None, the given state is the desired transaction output and it's time to commit the transaction.

The transaction's Ok variant value has the same type as the initial state captured at the start of the transaction.

This value contains no stale data and can be processed by further user logic outside the transaction scope.
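
To make the closure contract concrete, here is a minimal sketch of the types it implies. All of these names are hypothetical and none of them exist in sqlx today; Q stands in for a prepared query:

// Hypothetical error a combinator closure can return; E is the
// user-defined custom error type mentioned above.
enum StepError<E> {
    // Custom error: interrupts and rolls back the whole transaction.
    Custom(E),
    // Manual rollback, optionally to a named savepoint where supported.
    Rollback { savepoint: Option<String> },
}

// The Ok variant is the pair described above: the accumulated state plus
// the next query, or None when the state is the final transaction output.
type StepResult<S, Q, E> = Result<(S, Option<Q>), StepError<E>>;

// What the "commit finalizer" ultimately yields to the caller: an enum of
// either SQLError (here sqlx::Error) or the custom user error.
enum TxError<E> {
    Sql(sqlx::Error),
    Custom(E),
}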

API extension to support both explicit and implicit transaction pipelines

Postgres allows pipelining multiple independent queries in a single batched call and receiving the results for all of them at once. They are executed within an implicit transaction.

It's possible to unify the interface for both explicit and implicit (pipelined) transactions. Pipelined query combinators differ in a few ways (a usage sketch follows below):

  • they don't issue the next query, because the next query doesn't depend on the result-processing state; in the happy case they just return the accumulated_transaction_state
  • they can't roll back explicitly, but can validate results and return a custom error to abort the call chain early
  • the "commit finalizer" processes them differently: it can collapse sequential implicit pipeline combinators into an internal implicit pipeline structure. That structure executes each relevant closure while processing the combined result set, and throws away partial results on error. The "BEGIN" and "COMMIT" statements are pipelined together with the first or the last implicit pipeline, which allows a single implicit pipeline plus these statements to execute within a single query batch. CockroachDB will strip these statements and auto-retry the implicit transaction when the result set is under 16KiB. For a larger result set the pipeline starts client-side processing and may get a serialization error; then the client-side retry mechanism is activated and the pipeline is re-executed.

In the general case the transaction chain is a single explicit transaction that may contain multiple pipelined sections. Partial collapse is beneficial in this scenario too, because fewer communication round trips are required to process the chain.
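
Purely as a shape illustration, a chain mixing pipelined and explicit sections might look like this; capture, pipeline_step, query_step, and commit_with_max_retries are all invented names, not existing sqlx API:

// Hypothetical call chain: the two independent pipeline_step queries are
// collapsed by the finalizer into one batched round trip together with
// BEGIN; the explicit query_step needs its own round trip before COMMIT.
let totals = pool
    .capture((0i64, 0i64))  // "transaction input capture" fixes the output type
    // implicit pipeline section: closures only fold results into the state
    .pipeline_step("SELECT count(*) FROM orders", |(_, u), orders: i64| Ok((orders, u)))
    .pipeline_step("SELECT count(*) FROM users", |(o, _), users: i64| Ok((o, users)))
    // explicit section: returning None means "this state is the output, commit"
    .query_step(|state, _result| Ok((state, None)))
    .commit_with_max_retries(5)  // "commit finalizer" with default retry strategy
    .await?;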

Query combinators documentation

It's important to help the user choose the correct combinators. The documentation could provide a flow chart with question paths the user answers to pick the best combinator for a particular case. The flow chart should prioritize implicit pipeline combinators when the next user query doesn't depend on the results of the previous query. It could then ask about the execute-like, query_scalar-like, query_as-like, and fetch_all-like combinators, and finally suggest the most generic fetch-like query combinator.

DXist · Sep 02 '22

Do you know of any other database clients that implement something like this?

abonander · Sep 03 '22

@abonander, I've done some quick research and haven't found any SQL client or ORM projects that do client-side transaction retries.


I extended my research to any database interface that provides conversational transactions (in contrast to a deterministic interface, where all data to read and write is known upfront, before the start of the transaction).

I've looked through the Erlang Mnesia docs and found that they mention retries:

Conflicting lock requests are automatically queued if there is no risk of a deadlock. Otherwise the transaction must be terminated and executed again. Mnesia does this automatically as long as the upper limit of the maximum retries is not reached. For details, see mnesia:transaction/3.

The docs give the following example of a functional object passed as an argument to the transaction method:

bad_raise(Eno, Raise) ->
    F = fun() ->
                [E] = mnesia:read({employee, Eno}),
                Salary = E#employee.salary + Raise,
                New = E#employee{salary = Salary},
                io:format("Trying to write ... ~n", []),
                mnesia:write(New)
        end,
    mnesia:transaction(F).

I don't know whether it is possible to add conditional logic to this closure that depends on the read operation's results.

But database-side retry is possible because the transaction executor runs Erlang logic directly and doesn't rely on an intermediate, non-general-purpose language like SQL.


Searching GitLab issues for the 'acidrain' pattern turned up the following issue.

It mentions a closure-based interface.

DXist · Sep 04 '22

I think SQL library/ORM authors usually provide an API with these assumptions:

  • the database configuration uses a default transaction isolation level like read_committed or repeatable_read
  • the user can add SELECT FOR UPDATE queries to introduce locks when necessary (this will not work for a concurrent SELECT FOR UPDATE and INSERT in Postgres); see the sketch after this list
  • the user can reorder the lock order of multiple resources to fix deadlocks
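
For reference, the second assumption looks like this with today's sqlx API; the table and column names are illustrative:

use sqlx::PgPool;

// Taking an explicit row lock with SELECT ... FOR UPDATE inside an sqlx
// transaction; the lock is held until the transaction commits or rolls back.
async fn reserve(pool: &PgPool, id: i64) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;
    sqlx::query("SELECT * FROM inventory WHERE id = $1 FOR UPDATE")
        .bind(id)
        .fetch_one(&mut *tx)
        .await?;
    sqlx::query("UPDATE inventory SET reserved = reserved + 1 WHERE id = $1")
        .bind(id)
        .execute(&mut *tx)
        .await?;
    tx.commit().await
}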

They usually don't design an interface that is concurrency safe.

DXist · Sep 04 '22

Some power users may not need transaction retries and concurrency-safe operation at the serializable isolation level. These users could call a "commit finalizer" that accepts a retry strategy parameter and pass 'DoNotRetryStrategy' as its value. They would still enjoy the performance benefits of pipelined execution in the Postgres case and the cancellation safety ensured by the finalizer's RAII guard.
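
A hypothetical shape for that parameter; the enum and its variant names are invented for illustration:

// Hypothetical retry-strategy parameter for the "commit finalizer".
enum RetryStrategy {
    // Default: exponential backoff with jitter, capped at max_retries.
    ExponentialBackoffWithJitter { max_retries: u32 },
    // Power-user opt-out: fail on the first serialization error.
    DoNotRetry,
}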

DXist · Sep 08 '22

@abonander, do you think the proposed API layer could become part of SQLx?

DXist · Sep 22 '22