Connections don't run queries concurrently
Not sure if this is working as designed (WAD), a bug, or whether there is some other workaround here for doing connection pooling with DuckDB in Node.js...
Using an open database, I want to run some queries concurrently across different connections. Is there a way to accomplish this?
This test script shows that queries will not run concurrently, even if run on different connections:
```js
var duckdb = require('duckdb');
var db = new duckdb.Database(':memory:');
const connA = db.connect();
const connB = db.connect();

async function fastQuery() {
  return new Promise((resolve, reject) => {
    const t = Date.now();
    connA.all(`select 1 from range(1,10)`, (err, res) => {
      if (err) reject(err);
      else resolve(Date.now() - t);
    });
  });
}

async function slowQuery() {
  return new Promise((resolve, reject) => {
    const t = Date.now();
    connB.all(`select max(i) from (select 1 as i from range(1,10000000000));`, (err, res) => {
      if (err) reject(err);
      else resolve(Date.now() - t);
    });
  });
}

async function test() {
  console.log("Run fast query");
  console.log("Fast query time (ms): ", await fastQuery());
  console.log("Run slow query");
  console.log("Slow query time (ms): ", await slowQuery());
  console.log("Run slow and fast query concurrently");
  slowQuery().then(s => console.log("Slow query time (ms): ", s));
  fastQuery().then(f => console.log("Fast query time (ms): ", f));
}

test();
```
The fast query should take a few milliseconds, while the slow query should take a few seconds. If the slow and fast queries are kicked off at the same time, even on different connections, the slow query blocks the fast query from executing. This is what I get when running the script above on an M2 MBP:
```
Run fast query
Fast query time (ms):  2
Run slow query
Slow query time (ms):  4386
Run slow and fast query concurrently
Slow query time (ms):  4701
Fast query time (ms):  4701
```
If I run the fast query on a completely different Database handle, then there is no problem running the fast query concurrently with the slow query:
```
Run fast query
Fast query time (ms):  3
Run slow query
Slow query time (ms):  4381
Run slow and fast query concurrently
Fast query on different DB handle (ms):  1
Slow query time (ms):  4700
```
While this is presently by design, we should implement the `parallelize` and `serialize` methods to allow this to be configurable
I would argue that parallel execution should be the default; otherwise, the entire premise of DuckDB being super fast becomes irrelevant for Node.
I searched for a way to execute multiple queries in parallel through DuckDB but couldn't find a solution. My scenario: I want to read Parquet files from S3. We have per-member files, and no two queries read the same file, so I want to run multiple member-wise queries in parallel. We have a capable machine with 26 GB of RAM and 20+ processors, but I can't find a way to do this.
If connections don't run queries concurrently, is there any way to improve performance when reading multiple files?
> While this is presently by design, we should implement the `parallelize` and `serialize` methods to allow this to be configurable
@Mause Could you check whether this enhancement is planned for the development pipeline? Supporting concurrent execution is crucial for performance.