better-sqlite3 icon indicating copy to clipboard operation
better-sqlite3 copied to clipboard

Interrupt transaction

Open jaimegmx8 opened this issue 3 years ago • 14 comments

Good day :)

How can we interrupt a long running transaction? My use case is that a transaction that's running longer than X milliseconds needs to be killed. Seems like the functionality is available on the database:

Interrupt A Long-Running Query https://www.sqlite.org/c3ref/interrupt.html

And based on some SO answer, the interface is available in the sqlite3 Python library, as conn.interrupt():

https://stackoverflow.com/questions/43240496/python-sqlite3-how-to-quickly-and-cleanly-interrupt-long-running-query-with-e

The docs for better-sqlite3 do not cover this feature, and indeed the API doesn't seem to be available.

Any ideas on how to get this done?

Thank you in advance, and be well.

jaimegmx8 avatar Mar 05 '21 04:03 jaimegmx8

Given that the API of better-sqlite3 is synchronous, how would an interrupt API on your end be used? Would you spawn a second thread that calls sqlite3_interrupt after a given time if the main thread didn't tell you that it's done? In this case you could also reverse your logic and have the long running transaction in a worker thread and kill the worker thread after a timeout. I'm doing something similar using https://github.com/piscinajs/piscina but maybe that's not the cleanest approach.

Prinzhorn avatar Mar 05 '21 09:03 Prinzhorn

That's a very good point, I'll be thinking of this approach, however I still feel this is not quite correct for our use case, and I'll elaborate on it.

The core of our code is very much synchronous, so much that we are using the Transaction as a mutex, so no client can use the database at the same time.

This is working well for us, and the only missing requirement is to cap the time a client can keep a lock of the database. Having a separate thread to interrupt sqlite, as you imagined, is what we envision.

jaimegmx8 avatar Mar 05 '21 17:03 jaimegmx8

How about a timeout option to db.transaction(), or db.transaction()(), that internally uses sqlite3_interrupt and throws?

spazmodius avatar Apr 07 '21 15:04 spazmodius

@spazmodius I personally would prefer a more generic approach, so that use-cases such as hitting a button (think DB Browser for SQLite) can cancel a transaction as well

Edit: But yeah, there comes that synchronous API back into play and why I just abort the worker thread :smile: . How's that button event ever going to reach anything.

Prinzhorn avatar Apr 07 '21 15:04 Prinzhorn

Heh, right. Even my suggestion would require using a timer that doesn't depend on the event loop 😕

spazmodius avatar Apr 07 '21 15:04 spazmodius

You can't send a database connection from one thread to another using Node.js worker threads, so how would you invoke the interrupt function from another thread?

It might be better to use a single-threaded approach via the progress handler.

JoshuaWise avatar May 05 '21 01:05 JoshuaWise

I'm also looking for a way to interrupt long running select statements, same use case as OP. I've tried launching the queries in their own worker thread and terminating the thread, but the thread doesn't exit immediately and the query still runs to completion.

Progress handlers seem like a powerful solution - but for a simple solution, why not a query timeout parameter on the connection or on individual queries, similar to the options.timeout for waiting on a locked db?

FreeTrade avatar Oct 15 '21 04:10 FreeTrade

but the thread doesn't exit immediately and the query still runs to completion.

Are you calling worker.terminate() from the parent or are you sending a message via postMessage() for the thread to exit itself? The second one cannot work because SQLite synchronously keeps the thread busy, the message will only be processed after its done. The first option works for me (via piscina and AbortController).

but for a simple solution, why not a query timeout parameter on the connection or on individual queries

How would that be implemented? See the discussion above, the query synchronously blocks the thread, how would the timeout be triggered?

Prinzhorn avatar Oct 15 '21 11:10 Prinzhorn

Are you calling worker.terminate() from the parent

Yes, but in nodejs, it might be the case that nodejs is failing to terminate properly - ref https://github.com/nodejs/help/issues/3332

How would that be implemented?

I'm not sure on the details, but I'm not necessarily looking to interrupt the query arbitrarily, rather my requirement would be satisified by having the get/all return an error where the query was taking longer than X ms. Similar to how an error is returned if it takes longer than X ms to connect to the db. I guess this would be handled in the C/C++ rather than in the JS.

FreeTrade avatar Oct 17 '21 05:10 FreeTrade

Yes, but in nodejs, it might be the case that nodejs is failing to terminate properly - ref nodejs/help#3332

That sounds unrelated, it's about child processes spawned inside worker threads. If you run a query in a worker thread and nothing else, it should always terminate. There are no additional processes involved. Not sure about Windows, but the described "bug" is a feature in UNIX (related https://github.com/nodejs/node/issues/13538#issuecomment-307799666). If you terminate a process all child processes of that process are re-assigned to the next parent all the way up to init. They are not automatically terminated as well. That's something I've learned the hard way this year with a child process running into an endless loop when the parent terminated because of a broken pipe (it caught EPIPE and tried to write the error to stdout, causing EPIPE again and again).

Similar to how an error is returned if it takes longer than X ms to connect to the db

That's a feature of SQLite that better-sqlite3 exposes https://sqlite.org/c3ref/busy_timeout.html Such a feature does not exist for queries. That's why this issue exists, because the feature needs to be implemented in better-sqlite3.

Prinzhorn avatar Oct 18 '21 07:10 Prinzhorn

That sounds unrelated, it's about child processes spawned inside worker threads.

Thank you. I'm not sure of the reason, but the worker thread is only terminating when the query has completed.

That's a feature of SQLite that better-sqlite3 exposes

Agreed, sqlite provides a different way to terminate queries, but better-sqlite3 has options on how to provide this functionality. One way would be to use time-out parameters - it couldn't just hand them direct to sqlite, but I guess would need to run a watchdog thread alongside the query. It's one option I'm suggesting, but I'd be happy with any way to reliably terminate a long-running query.

FreeTrade avatar Oct 19 '21 00:10 FreeTrade

Here is a minimal working example that shows that you can terminate a worker thread that is stuck in an endless loop in a SQLite query (the same should be true for any long running query). You can increase/decrease the setTimeout and will see that it takes shorter/longer to terminate.

const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  const worker = new Worker(__filename);

  worker.on('message', (msg) => {
    console.log(msg);
  });

  worker.on('error', (err) => {
    console.error(err);
  });

  worker.on('exit', (code) => {
    console.log(`Worker exited with code ${code}`);
  });

  setTimeout(() => {
    worker.terminate();
  }, 1000);
} else {
  const Database = require('better-sqlite3');
  const db = new Database(':memory');

  db.function('loop', () => {
    while (1);
  });

  db.prepare('SELECT loop()').get();

  parentPort.postMessage('done');
}

If this does not work for you, then I'd remove SQLite and see if you can terminate an endless loop worker at all. If not, then this is a bug in Node.js on your specific platform. I'm on Ubuntu 20.04.

Prinzhorn avatar Oct 19 '21 07:10 Prinzhorn

Took me a little time to find time to try this out - but firstly thanks for your generous comment on this with the code sample.

I played around with it a little. I found I was able to get a loop worker to terminate in my code, indeed I was able to get

db.prepare('SELECT loop()').get();

to terminate - I suspect because it was hitting the js 'while(1)' statement, which it was able to terminate. However, when using the same code, but a regular real SQL statement that is taking a long time - no dice, it only terminates when it hits the next js statement.

FreeTrade avatar Oct 26 '21 05:10 FreeTrade

I suspect because it was hitting the js 'while(1)' statement, which it was able to terminate. However, when using the same code, but a regular real SQL statement that is taking a long time - no dice, it only terminates when it hits the next js statement.

Looks like that's how it works. The docs for terminate say:

Stop all JavaScript execution in the worker thread as soon as possible.

That means it cannot stop the native code while the query is running. For this to work the native code needs to be cooperative and periodically check if the thread should stop. At least from what I understand. The reason this is needed is to exit the thread in a clean state, it cannot simply be stopped.

The reason it works for me is because all my heavy queries are using user defined functions or virtual tables. That means it will jump back and forth between native code and the JavaScript vm regularly and the vm will cooperatively periodically check if the thread needs to stop and terminate itself.

I didn't research into if and how other native modules are doing that. It looks like there is not a lot of information about this.

Edit: However, exposing sqlite3_progress_handler to js would not solve my use-case. I am using worker threads because these are user-provided queries that the user can manually cancel via a button or that are implicitly cancelled when starting a new query (Electron app). It would be great if better-sqlite3 can be made cooperative via sqlite3_progress_handler in addition to exposing it. Assuming this is actually the way to go, but sqlite3_progress_handler on the native side should be the correct place to check if the thread needs to exit. But I can't find the API where Node.js exposes this. I'm just trying to put pieces together from my limited knowledge.

Edit2: I think bool Worker::is_stopped() (or bool Environment::is_stopping() ?) needs to be checked from looking at the Node.js source. But like I said it's hard to find information on this. But I don't know how to get a handle of the current thread in the native addon. That's about where my knowledge ends.

Edit3: For some reason I decided to look into that again and I think this is the play

// Do this periodically

Environment* env = Environment::GetCurrent(isolate);

if(!env->is_main_thread() && env->is_stopping()) {
    // Do something here to stop the thread / SQLite.
}

Prinzhorn avatar Oct 26 '21 10:10 Prinzhorn