monetdb-nodejs icon indicating copy to clipboard operation
monetdb-nodejs copied to clipboard

Node monetdb too slow

Open Anthonytla opened this issue 9 months ago • 6 comments

Describe your problem

I am using node monetdb 2.1.0 and noticed that monetdb expands the size of a buffer very often which slow the execution of the query.

I didn't have this problem on version 1.3.3

Can monetdb allocate a larger size for the buffer to avoid the expand ?

Describe the solution you'd like to see implemented

Able to execute query faster

Additional context

No response

Anthonytla avatar May 16 '25 09:05 Anthonytla

Hello @Anthonytla , Please provide a reproduction script. Also, what MonetDB server version are you running ?

lrpereira avatar May 16 '25 11:05 lrpereira

Hello,

It's hard to share the code with you because of my company policy, but what's in the code is simple. It creates a new Connection(options) and run execute(query) on it. But there are a large amount of data, the buffer keep expanding and it slow down the query. I executed the same query with monetdb 1.3.3 and it tooks 30 sec instead of 8 minutes in the 2.1.0 version

I can share with you my screenshot . I am using Node 20.16.0 server to run monetdb.

Image

Anthonytla avatar May 16 '25 12:05 Anthonytla

Thanks for the report @Anthonytla

Can you please test this commit https://github.com/MonetDB/monetdb-nodejs/commit/b0bfb84ce8ab9d37c669fa38368fae0adde3f052 with your query. I've increased the buffer size increment to multiple of mapi blk size. If all is good I can make new release.

sstalin avatar May 18 '25 07:05 sstalin

Hello,

It's still very slow, 16 min againts 30 s with monetdb 1.3.3. I will try to make a script with the test monetdb and give it to you.

Anthonytla avatar May 19 '25 15:05 Anthonytla

Hello,

Here are 2 scripts run monetdb 1.3.3 and 2.1.0 and a sql script. I put dummies data on the db to simulate the same amount of data on my company database. The execution time is faster since they are not my company real data but there is still a big difference between the two versions.

//monetdb 1.3.3

var MDB = require('monetdb')();

var dbConfig = {
    host: 'localhost',
    port: 50000,
    dbname: 'mydb',
    user: 'monetdb',
    password: 'monetdb',
    defaultSchema: 'test',
};

var conn = new MDB(dbConfig);
conn.connect()
    .then(() => {
        console.log('Connected to MonetDB');

        // Measure time for SELECT query
        const start = Date.now();

        return conn.query("SELECT col1, col2 FROM exemple_table")
            .then(result => {
                const duration = Date.now() - start;
                console.log(`Query executed in ${duration} ms`);

                return conn.close();
            });
    })
    .catch(err => {
        console.error('Error:', err);
        conn.close();
    });

// monetdb 2.1.0

const { Connection } = require('monetdb');

const dbConfig = {
    host: 'localhost',
    port: 50000,
    database: 'mydb',
    username: 'monetdb',
    password: 'monetdb',
    defaultSchema: 'test' 
};

async function main() {
    let conn;

    try {
        // Connect to the database
        conn = new Connection(dbConfig);
        console.log("Connected to MonetDB");

        await conn.connect();
        const start = Date.now();
       
        conn.setReplySize(-1)
        const result = await conn.execute("SELECT col1, col2 FROM test.exemple_table");
        const duration = Date.now() - start;
        console.log(`Query executed in ${duration} ms`);


    } catch (err) {
        console.error("Error:", err);
    } finally {
        if (conn) {
            await conn.close();
            console.log("Disconnected");
        }
    }
}

main();

-- -- fill_table.sql

CREATE TABLE IF NOT EXISTS test.exemple_table (
    col1 VARCHAR(175),
    col2 VARCHAR(255)
);


CREATE OR REPLACE PROCEDURE test.fill_table()
BEGIN
  DECLARE i INT;
  SET i = 1;
  WHILE i <= 300000 DO
    INSERT INTO test.exemple_table VALUES (
      REPEAT('A', 75),
      REPEAT('B', 100)
    );
    SET i = i + 1;
  END WHILE;
END;

CALL test.fill_table();

Anthonytla avatar May 20 '25 12:05 Anthonytla

Thank you for providing a test script. We will be looking into providing a patch soon.

sstalin avatar May 20 '25 16:05 sstalin

@Anthonytla

I've made small refactor to handle large results better. Can you please test on your end as well? Here is the timing on my machine running the example you've provided

Connected to MonetDB
Query executed in 3430 ms
Server has ended the connection
Disconnected

real	0m3.667s
user	0m4.491s
sys	0m0.856s

If all is good I will make another release

sstalin avatar May 28 '25 14:05 sstalin

Hello ,

Thanks for the correction, I tested your commit in 3 machines,one with 32 Gb RAM and the others two 16Gb RAM. I managed to get the same duration for the query (3260ms) for the 32 Gb machine and the first 16Gb machine. But on the second 16Gb machine which is also my work laptop, it runs out of memory. Here is my work laptop spec:

HP CPU:2 × AMD Athlon Silver 3050U with Radeon Graphics GPU: AMD Radeon Graphics

The problem comes from this laptop, but I don't know what the problem is. If you have any ideas, please tell me.

Thanks

Anthonytla avatar Jun 02 '25 10:06 Anthonytla

My guess is that on your work laptop you are running out of the default heap size for V8 . You can adjust the default heap size with --max-old-space-size option like so

node --max-old-space-size=4096 app.js

I ran you script while gathering some memory statistics and got the following

{
  rss: '52.55 MB',
  heapUsed: '5.34 MB',
  heapTotal: '8.27 MB',
  external: '1.86 MB',
  arrayBuffers: '0.02 MB'
}
Heap size limit: 4.046875 GB
Connected to MonetDB
Query executed in 2783 ms
{
  rss: '1681.57 MB',
  heapUsed: '1517.82 MB',
  heapTotal: '1565.06 MB',
  external: '55.05 MB',
  arrayBuffers: '53.24 MB'
}
Heap size limit: 4.046875 GB
Server has ended the connection
Disconnected

I've used this function to collect the memory statistics

import v8 from 'node:v8';

function memStat () {
  const toMB = bytes => (bytes / 1024 / 1024).toFixed(2) + ' MB';
  const mem = process.memoryUsage();
  console.log({
      rss: toMB(mem.rss),
      heapUsed: toMB(mem.heapUsed),
      heapTotal: toMB(mem.heapTotal),
      external: toMB(mem.external),
      arrayBuffers: toMB(mem.arrayBuffers),
  });
  const { heap_size_limit } = v8.getHeapStatistics();
  const heapSizeInGB = heap_size_limit / (1024 * 1024 * 1024);
  console.log(`Heap size: ${heapSizeInGB} GB`);
}

sstalin avatar Jun 02 '25 14:06 sstalin

You are right, that was the problem. But I don't understand why there are that much differences in memory usage on the two laptops while running the same code and a same node version 16.

Here are the mem stat

Home laptop:


{
  rss: '1667.81 MB',
  heapUsed: '1517.38 MB',
  heapTotal: '1571.15 MB',
  external: '53.72 MB',
  arrayBuffers: '53.25 MB'
}
Heap size limit:2.046875 GB

Work laptop

{
  rss: '4872.26 MB',
  heapUsed: '4544.76 MB',
  heapTotal: '4644.93 MB',
  external: '160.17 MB',
  arrayBuffers: '159.71 MB'
}
Heap size limit: 9.812500 GB

Anthonytla avatar Jun 02 '25 20:06 Anthonytla

I'm not sure why you are getting that much difference between the laptops either. I do assume you are fetching the same result size. Correct ?

sstalin avatar Jun 04 '25 13:06 sstalin

No, I realise that I had executed the sql script more than once, which makes more data on the db. Thanks, I got the same stats on both laptops now. I am going to test on the real data and let you know.

Anthonytla avatar Jun 05 '25 09:06 Anthonytla

@Anthonytla I hope numbers work out better for you now and you are able to run your queries faster. I've published new 2.2.0 version on npm. If you don't have any more remarks I believe we can close this issue.

sstalin avatar Jun 16 '25 09:06 sstalin

Yes thank you very much, I close the issue.

Anthonytla avatar Jun 16 '25 20:06 Anthonytla