liteq
Still issues with high concurrency
#25 is a nice feature, but it still doesn't fix the issues with high concurrency (probably #23 too). It just postpones them, because nothing prevents the user from calling publish several times in a row. Here's an example. On my machine, I consistently get this:
library(liteq)

# Worker: consume one message, simulate 10 s of work, then acknowledge it
task <- function(db) {
  library(liteq)
  q <- ensure_queue("jobs", db = db)
  msg <- consume(q)
  out <- msg$title
  Sys.sleep(10)
  ack(msg)
  out
}

db <- tempfile()
q <- ensure_queue("jobs", db = db)

# Start n background workers that all share the same database file
n <- 10
workers <- replicate(n, callr::r_bg(task, list(db = db)))

# Publish the messages in two batches while the workers are running
publish(q, letters[1:(n/2)], rep("something", n/2))
Sys.sleep(3)
publish(q, letters[1:(n/2)], rep("something", n/2))
#> Error: database is locked

# Poll until the queue is empty
while (nrow(print(list_messages(q))))
  Sys.sleep(0.2)
#> Error: database is locked
I think the problem is that not all writers acquire the lock. Both ensure_queue and publish are writers, so we have two points of failure: (1) concurrent calls to ensure_queue when workers are starting up and (2) concurrent calls to publish + consume. On top of that, concurrent calls to db_lock seem to be problematic too. Maybe this function should perform some kind of exponential back-off?
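To make the back-off idea concrete, here is a minimal sketch in base R. `with_backoff` and its parameters are hypothetical names chosen for illustration and are not part of liteq. The sketch assumes the failure surfaces as an R error whose message contains "database is locked", as in the output above.

```r
# Hypothetical helper (not part of liteq): call `fn` and, if it fails with a
# "database is locked" error, retry with exponentially growing, jittered delays.
with_backoff <- function(fn, max_tries = 8, base_delay = 0.05, max_delay = 2) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      list(ok = TRUE, value = fn()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) return(result$value)

    # Only lock errors are retried; anything else is re-raised immediately
    locked <- grepl("database is locked", conditionMessage(result$error),
                    fixed = TRUE)
    if (!locked || attempt == max_tries) stop(result$error)

    # Double the delay on every attempt, cap it, and add random jitter so
    # that competing workers do not all retry at the same moment
    delay <- min(max_delay, base_delay * 2^(attempt - 1))
    Sys.sleep(runif(1, min = delay / 2, max = delay))
  }
}
```

A caller could then wrap any writer, for example `with_backoff(function() publish(q, "a", "something"))`. This only papers over contention from the outside; it does not change which lock the writers take.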
Regarding the last issue, I think that the problem is the use of BEGIN EXCLUSIVE, which produces deadlocks. All exclusive locks should be BEGIN IMMEDIATE instead.
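For reference, the difference between the two lock types can be observed with a small sketch that uses DBI and RSQLite directly rather than liteq. The table name `jobs` and the helper `can_read` are made up for the example, and it assumes SQLite's default rollback-journal mode (in WAL mode the two behave the same).

```r
library(DBI)

db <- tempfile(fileext = ".sqlite")
writer <- dbConnect(RSQLite::SQLite(), db)
reader <- dbConnect(RSQLite::SQLite(), db)

# Make the reader fail fast instead of waiting when the database is locked
invisible(dbGetQuery(reader, "PRAGMA busy_timeout = 0"))

dbExecute(writer, "CREATE TABLE jobs (title TEXT)")

# Hypothetical helper: TRUE if a SELECT on `con` succeeds, FALSE if it errors
can_read <- function(con) {
  tryCatch({
    dbGetQuery(con, "SELECT COUNT(*) AS n FROM jobs")
    TRUE
  }, error = function(e) FALSE)
}

# IMMEDIATE reserves the write lock but still lets other connections read
dbExecute(writer, "BEGIN IMMEDIATE")
read_during_immediate <- can_read(reader)
dbExecute(writer, "COMMIT")

# EXCLUSIVE also shuts out readers for the duration of the transaction
dbExecute(writer, "BEGIN EXCLUSIVE")
read_during_exclusive <- can_read(reader)
dbExecute(writer, "COMMIT")

dbDisconnect(reader)
dbDisconnect(writer)
```

In rollback-journal mode the read is expected to succeed during the IMMEDIATE transaction and to fail with "database is locked" during the EXCLUSIVE one. In both cases a second writer is still turned away until the first one commits.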
@Enchufa2 This should fix our issues: https://github.com/r-dbi/RSQLite/pull/345