yugabyte-db icon indicating copy to clipboard operation
yugabyte-db copied to clipboard

[YSQL] Improved wait queue based pessimistic locking

Open rkarthik007 opened this issue 4 years ago • 11 comments

Jira Link: DB-1529

In many scenarios, pessimistic locking is desirable to have the DB wait for existing transactions (that might otherwise conflict) to complete before making progress.

Pessimistic locking semantics are already part of READ COMMITTED isolation level. This is done via internal statement retries (in the query layer) with exponential backoff when a transaction detects conflicting transactions. A wait queue based pessimistic locking implementation will ensure lower latencies and be more resource efficient since it would eliminate statement retries.

Also, the wait queue based implementation should be agnostic to the isolation level i.e., it should ensure pessimistic locking semantics for all isolation levels.

Design doc - https://docs.google.com/document/d/1DCg4FjGlYZNBBIPcADk7z2B2TGznNHWtr8hU-0dUXyo/edit

TODO: Convert this doc to a .md file once the implementation has been completed.

Functional requirements

  • #9589
  • #9590
  • #13590
  • #13582
  • #14014
  • #13584
  • #13894
  • #13211
  • #13585

Nice-to-have

  • #13576
  • #13577
  • #13578
  • #13580
  • #13586
  • #14114
  • #14165

Analytics

rkarthik007 avatar Sep 14 '20 22:09 rkarthik007

@rkarthik007 Have a plan or scheme to support pessimistic locking ?

triump2020 avatar Mar 02 '21 15:03 triump2020

Simple test:

CREATE TABLE t (i int);
INSERT INTO t VALUES (1);
bin/psql   --command 'update t set i = 2' & bin/psql   --command 'update t set i = 3' # both succeed
bin/ysqlsh --command 'update t set i = 2' & bin/ysqlsh --command 'update t set i = 3' # one fails

jaki avatar Apr 13 '21 21:04 jaki

What is the expected timeline for this feature to be implemented? Is it still supposed to be shipped in April with 2.7 release?

We really need this feature in place, before trying out yugabyte.

salarali avatar Apr 19 '21 21:04 salarali

hey @salarali , thanks for the interest in Yugabyte!

can you describe a bit your use case, to better understand why this is a blocker for you? cc @rkarthik007

bmatican avatar Apr 19 '21 21:04 bmatican

Thanks for the quick reply.

We have two applications that are touching the same record at the same time. We want the first transaction to finish and then handle the second transaction.

salarali avatar Apr 19 '21 21:04 salarali

@salarali is this a relatively isolated transaction, that you could retry from the client side? say, if you knew that the error code you got on the client was specifically a conflict?

we've actually not seen this be a significant blocker, so would really like to understand better if there's some easy workaround you could be ok with, to still try out Yugabyte, or if there's something more fundamental, that could be great feedback for us and the product

bmatican avatar Apr 19 '21 21:04 bmatican

Yes, it is a relatively isolated transaction and we could retry from the client side after looking at the error code from the client.

Unfortunately, updating our code to support retries would require some engineering effort to update the pipelines.

salarali avatar Apr 19 '21 21:04 salarali

as agreed on the yugabyte slack i will duplicate the text here so it is marked down.

while trying to port over an existing application that runs on postgresql i run into the issue that i need the pessimistic locking to be supported in order to avoid rewriting the whole application.

i will describe a few scenarios here:

Example on the select for update use case: i have to guarantee that a customer X does not get more than 3 subscriptions in activate state on his/her name. As uncommited transactions do not see each other's data what i do currently with postgres to guarantee this: i have separate tables for customers and subscriptions, when ever i do a status change on any of the subscriptions i at first grab the consumer's record lock with select for update (with no real intention to update it at all, just to lock it), do my operations & checks on the other records as necessary, release the lock with commit at transaction end. By doing this my application layer can remain very simple and very understandable to developers, and since i don't have a programmatic pattern to access subscriptions without the lock i do not have to lock or worry about locks on any of the subscription records. I also don't have to retry possible complex computations over and over again.

Another example on the parallel updates to the same record, postgresql has atomic increment support which is very useful for rapid parallel updates of numbers in records

whatever=# create table increment_test(any_number int);
CREATE TABLE
whatever=# insert into increment_test values(1);
INSERT 0 1

now when i have a small script like this (pg_sleep added for ease of testing)

begin;
update increment_test set any_number = any_number + 1;
select pg_sleep(5);
commit;

when thes update scripts are executed in parallel on postgresql, numbers get incremented by all script instances that were run (e.g. 2 parallel executions wait for each other due to row being locked by update and the result after 2 increments is 3). when i execute the script with a parallelism of 2 on yugabytedb i get the 40001 error on one of the transactions and i would have to retry this myself. not an issue if i do it on one record and seldom, but a huge problem if you run this at big concurrency against a small set of records, i could be almost stuck in a while loop for retries from the application side.

i am sure more compatibility with postgresql behavior would be a huge factor to win over customers from postgresql.

kulminaator avatar May 21 '21 07:05 kulminaator

Supporting READ COMMITTED (PostgreSQL default) has also been a request so that existing applications have an easier time migrating from PostgreSQL to Yugabyte.

RunningJon avatar Jul 07 '21 18:07 RunningJon

@salarali @kulminaator in your use cases, what isolation level are you using?

pkj415 avatar Jul 14 '21 16:07 pkj415

@salarali @kulminaator in your use cases, what isolation level are you using?

Sorry for taking part in it, but in our case, we usually use READ COMMITED (Postgres default) like @RunningJon said before

dbuenor avatar Jul 14 '21 20:07 dbuenor

FYI -- we have moved towards the nomenclature of "wait-on-conflict" (FKA pessimistic locking) vs. "fail-on-conflict" (FKA optimistic locking). Issue details have been updated accordingly

robertsami avatar Jan 05 '23 19:01 robertsami

Closing this as we are using the board https://github.com/orgs/yugabyte/projects/19 to track the open items. All the necessary items have landed. cc @robertsami

rthallamko3 avatar Mar 19 '24 15:03 rthallamko3