Node monetdb too slow
Describe your problem
I am using node monetdb 2.1.0 and noticed that it expands the size of a buffer very often, which slows down query execution.
I didn't have this problem on version 1.3.3.
Could monetdb allocate a larger buffer up front to avoid the repeated expansion?
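For background on why repeated expansion is so costly, here is a minimal sketch (assumed behavior for illustration, not the actual monetdb-nodejs internals) comparing a small fixed-increment growth strategy with geometric doubling:

```javascript
// Minimal sketch (not the monetdb-nodejs internals): appending 1000 chunks of
// 1 KiB into a Buffer. A small fixed-step increment forces a reallocation on
// almost every append (quadratic total copying); doubling the capacity keeps
// the number of reallocations logarithmic.

function appendAll(chunks, grow) {
  let buf = Buffer.alloc(1024);
  let len = 0;
  let copies = 0; // number of reallocations (each one copies `len` bytes)
  for (const c of chunks) {
    while (len + c.length > buf.length) {
      const bigger = Buffer.alloc(grow(buf.length)); // new, larger buffer
      buf.copy(bigger, 0, 0, len);                   // copy everything so far
      buf = bigger;
      copies++;
    }
    c.copy(buf, len);
    len += c.length;
  }
  return copies;
}

const chunks = Array.from({ length: 1000 }, () => Buffer.alloc(1024, 0x41));

console.log('fixed-step copies:', appendAll(chunks, cap => cap + 1024)); // 999
console.log('doubling copies:', appendAll(chunks, cap => cap * 2));      // 10
```

With a fixed increment, the total bytes copied grow quadratically with the result size, which matches the kind of slowdown described above.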
Describe the solution you'd like to see implemented
Being able to execute queries faster.
Additional context
No response
Hello @Anthonytla, please provide a reproduction script. Also, what MonetDB server version are you running?
Hello,
It's hard to share the code with you because of my company policy, but what's in the code is simple. It creates a new Connection(options) and runs execute(query) on it. But with a large amount of data, the buffer keeps expanding and slows down the query. I executed the same query with monetdb 1.3.3 and it took 30 seconds instead of the 8 minutes it takes on version 2.1.0.
I can share my screenshot with you. I am using Node 20.16.0 to run monetdb.
Thanks for the report @Anthonytla
Can you please test this commit https://github.com/MonetDB/monetdb-nodejs/commit/b0bfb84ce8ab9d37c669fa38368fae0adde3f052 with your query? I've increased the buffer size increment to a multiple of the MAPI block size. If all is good I can make a new release.
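Roughly, the change can be pictured like this; `MAPI_BLOCK_SIZE = 8190` and `nextBufferSize` are illustrative assumptions here, not the library's actual names:

```javascript
// Illustrative sketch of the commit's idea (hypothetical names, not the
// actual monetdb-nodejs code): grow the receive buffer in whole multiples
// of the MAPI block size so fewer reallocations happen per result set.

// Assumed MAPI payload block size; treat this exact value as an assumption.
const MAPI_BLOCK_SIZE = 8190;

// Round `needed` bytes up to the next multiple of the block size.
function nextBufferSize(needed) {
  return Math.ceil(needed / MAPI_BLOCK_SIZE) * MAPI_BLOCK_SIZE;
}

console.log(nextBufferSize(1));    // 8190
console.log(nextBufferSize(8190)); // 8190
console.log(nextBufferSize(8191)); // 16380
```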
Hello,
It's still very slow: 16 minutes against 30 seconds with monetdb 1.3.3. I will try to make a script against a test database and give it to you.
Hello,
Here are two scripts, one for monetdb 1.3.3 and one for 2.1.0, plus an SQL script. I put dummy data in the db to simulate the amount of data in my company's database. The execution is faster since this is not my company's real data, but there is still a big difference between the two versions.
// monetdb 1.3.3
var MDB = require('monetdb')();

var dbConfig = {
    host: 'localhost',
    port: 50000,
    dbname: 'mydb',
    user: 'monetdb',
    password: 'monetdb',
    defaultSchema: 'test',
};

var conn = new MDB(dbConfig);
conn.connect()
    .then(() => {
        console.log('Connected to MonetDB');
        // Measure time for the SELECT query
        const start = Date.now();
        return conn.query("SELECT col1, col2 FROM exemple_table")
            .then(result => {
                const duration = Date.now() - start;
                console.log(`Query executed in ${duration} ms`);
                return conn.close();
            });
    })
    .catch(err => {
        console.error('Error:', err);
        conn.close();
    });
// monetdb 2.1.0
const { Connection } = require('monetdb');

const dbConfig = {
    host: 'localhost',
    port: 50000,
    database: 'mydb',
    username: 'monetdb',
    password: 'monetdb',
    defaultSchema: 'test'
};

async function main() {
    let conn;
    try {
        // Connect to the database
        conn = new Connection(dbConfig);
        await conn.connect();
        console.log("Connected to MonetDB");
        const start = Date.now();
        // Fetch the whole result set at once
        conn.setReplySize(-1);
        const result = await conn.execute("SELECT col1, col2 FROM test.exemple_table");
        const duration = Date.now() - start;
        console.log(`Query executed in ${duration} ms`);
    } catch (err) {
        console.error("Error:", err);
    } finally {
        if (conn) {
            await conn.close();
            console.log("Disconnected");
        }
    }
}

main();
-- fill_table.sql
CREATE TABLE IF NOT EXISTS test.exemple_table (
    col1 VARCHAR(175),
    col2 VARCHAR(255)
);

CREATE OR REPLACE PROCEDURE test.fill_table()
BEGIN
    DECLARE i INT;
    SET i = 1;
    WHILE i <= 300000 DO
        INSERT INTO test.exemple_table VALUES (
            REPEAT('A', 75),
            REPEAT('B', 100)
        );
        SET i = i + 1;
    END WHILE;
END;

CALL test.fill_table();
Thank you for providing a test script. We will be looking into providing a patch soon.
@Anthonytla
I've made a small refactor to handle large results better. Can you please test on your end as well? Here are the timings on my machine running the example you provided:
Connected to MonetDB
Query executed in 3430 ms
Server has ended the connection
Disconnected
real 0m3.667s
user 0m4.491s
sys 0m0.856s
If all is good I will make another release
Hello ,
Thanks for the fix. I tested your commit on 3 machines: one with 32 GB of RAM and two with 16 GB. I got the same duration for the query (3260 ms) on the 32 GB machine and the first 16 GB machine. But on the second 16 GB machine, which is also my work laptop, it runs out of memory. Here are my work laptop's specs:
HP, CPU: 2 × AMD Athlon Silver 3050U with Radeon Graphics, GPU: AMD Radeon Graphics
The problem comes from this laptop, but I don't know what it is. If you have any ideas, please tell me.
Thanks
My guess is that on your work laptop you are running out of the default heap size for V8. You can adjust the default heap size with the --max-old-space-size option, like so:
node --max-old-space-size=4096 app.js
I ran your script while gathering some memory statistics and got the following:
{
  rss: '52.55 MB',
  heapUsed: '5.34 MB',
  heapTotal: '8.27 MB',
  external: '1.86 MB',
  arrayBuffers: '0.02 MB'
}
Heap size limit: 4.046875 GB
Connected to MonetDB
Query executed in 2783 ms
{
  rss: '1681.57 MB',
  heapUsed: '1517.82 MB',
  heapTotal: '1565.06 MB',
  external: '55.05 MB',
  arrayBuffers: '53.24 MB'
}
Heap size limit: 4.046875 GB
Server has ended the connection
Disconnected
I've used this function to collect the memory statistics:

import v8 from 'node:v8';

function memStat() {
    const toMB = bytes => (bytes / 1024 / 1024).toFixed(2) + ' MB';
    const mem = process.memoryUsage();
    console.log({
        rss: toMB(mem.rss),
        heapUsed: toMB(mem.heapUsed),
        heapTotal: toMB(mem.heapTotal),
        external: toMB(mem.external),
        arrayBuffers: toMB(mem.arrayBuffers),
    });
    const { heap_size_limit } = v8.getHeapStatistics();
    const heapSizeInGB = heap_size_limit / (1024 * 1024 * 1024);
    console.log(`Heap size limit: ${heapSizeInGB} GB`);
}
You are right, that was the problem. But I don't understand why there is such a big difference in memory usage between the two laptops while running the same code and the same Node version 16.
Here are the mem stats.
Home laptop:
{
  rss: '1667.81 MB',
  heapUsed: '1517.38 MB',
  heapTotal: '1571.15 MB',
  external: '53.72 MB',
  arrayBuffers: '53.25 MB'
}
Heap size limit: 2.046875 GB
Work laptop
{
  rss: '4872.26 MB',
  heapUsed: '4544.76 MB',
  heapTotal: '4644.93 MB',
  external: '160.17 MB',
  arrayBuffers: '159.71 MB'
}
Heap size limit: 9.812500 GB
I'm not sure why you are getting that much difference between the laptops either. I do assume you are fetching the same result size. Correct?
No, I realised that I had executed the SQL script more than once, which put more data in the db. Thanks, I get the same stats on both laptops now. I am going to test on the real data and let you know.
@Anthonytla I hope the numbers work out better for you now and you are able to run your queries faster. I've published the new 2.2.0 version on npm. If you don't have any more remarks, I believe we can close this issue.
Yes, thank you very much. I'll close the issue.