worker icon indicating copy to clipboard operation
worker copied to clipboard

Can isolation level be configured?

Open liujimj opened this issue 2 years ago • 5 comments

Summary

I'd like to switch my entire application database, which includes the graphile-worker tables to SERIALIZABLE isolation in order to take advantage of simpler mental model it requires, e.g. by running ALTER TABLE SET default_transaction_isolation ...

Does graphile-worker play well with stricter isolation levels, since SKIP LOCKED and SERIALIZABLE are in some sense redundant? If not, is there a way to configure graphile-worker to force use of the default isolation level?

Context

I found a reference claiming that SERIALIZABLE and SKIP LOCKED should not be used together, as performance is poor:

As far as I can see, it's philosophically opposed to SERIALIZABLE because it implies that you are using explicit row locks in the first place. That shouldn't be necessary under SERIALIZABLE, or we have failed. Philosophy aside, there is a more practical problem: the rows that you skip are still predicate-locked, so create conflicts among all the competing transactions. You lock different rows but only one concurrent transaction ever manages to complete, and you waste a lot of energy retrying.

This forces a choice between SERIALIZABLE and SKIP LOCKED, not because the features exclude each other but because the resulting performance is terrible.

If the claim above is correct, then graphile-worker ought to reset the isolation level itself (per-session or per-transaction) if the database defaults have been changed. This could potentially be exposed it as a configuration parameter as well.

A cursory look didn't reveal any mechanism to set isolation levels. If that's the case, users like myself who want higher transaction isolation on application tables would need to rely on setting per-session or per-transaction isolation within the app. That's quite doable, but somewhat cumbersome.

liujimj avatar Jun 28 '22 06:06 liujimj

We don’t explicitly set transaction isolation level; but I’m not aware of a reason why it would not work. Performance would indeed be worse, but I don’t know how much worse - why not give it a go with the benchmarking scripts?

benjie avatar Jun 28 '22 08:06 benjie

Would you accept a PR to add a transaction_isolation parameter to the JS functions, perhaps to SharedWorkerOptions ?

liujimj avatar Jun 28 '22 20:06 liujimj

I think the only time we ever use an explicit transaction is when we're performing the migrations, so I'm not sure what benefit doing this would give?

benjie avatar Jun 29 '22 09:06 benjie

My apologies, I failed to explain myself well above.

  1. I'd like to use default_transaction_isolation = SERIALIZABLE so that my own application, which shares the DB with graphile-worker, may avoid explicitly setting isolation on each session / transaction.

  2. If no explicit transaction is provided, the implicit transaction1 that PostgreSQL creates for single statements applies default_transaction_isolation, including for all graphile-worker queries.

  3. The reference above suggests that SKIP LOCKED + SERIALIZABLE will perform poorly, leading to high contention in getJob and perhaps addJob

  4. I'd like to suggest graphile-worker's JS libraries have the option to override the DB's default isolation, via either explicit transaction blocks or during pool initialization. This won't extend to the SQL functions, since users easily wrap them in a transaction themselves.

Reference

1 https://www.postgresql.org/docs/current/tutorial-transactions.html

PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it.

liujimj avatar Jun 30 '22 07:06 liujimj

You can do that already by using Graphile Worker in library mode and passing your own pool; something like:

const pool = new pg.Pool({ connectionString: "..." });

function noop(){}
pool.on('error', noop);
pool.on('connect', client => {
  client.on('error', noop);
  client.query('set default_transaction_isolation = ...');
});

await run({
  pgPool: pool,
  ...
});

benjie avatar Jun 30 '22 08:06 benjie

The suggestion above worked for our use case.

As an aside, we found that custom pools skip graphile-worker's default release logic, so a couple extra lines were necessary:

const pool = new pg.Pool({ connectionString: "..." });

... // Error handler setup

// Emulate graphile-worker's default release logic
const events = new EventEmitter().on('stop', () => { pool.end(); });

await run({
  pgPool: pool,
  events,
  ...
});

liujimj avatar Jan 16 '23 05:01 liujimj