snowflake-connector-nodejs

SNOW-1858341: limit size of data responses

Open mx2323 opened this issue 3 years ago • 8 comments

Currently, with the Node.js snowflake-sdk, a query such as SELECT * FROM LARGE_TABLE can crash the entire Node.js process if even one row in the result set is too large.

We would like an option that limits the amount of data held in memory for each request; the SDK could either throw an exception or gracefully return some subset of the received data.

Without this, we are UNABLE to prevent a user from overloading the calling Node process and causing OOM errors.

Note: this is still an issue even with streaming rows, because a single row may itself be too large.

mx2323 avatar Oct 28 '21 21:10 mx2323

+1

ghenkhaus avatar Oct 28 '21 21:10 ghenkhaus

Hi, thank you for submitting this issue. We'll take a look at how this could be handled.

sfc-gh-dszmolka avatar Jan 23 '23 13:01 sfc-gh-dszmolka

In the meantime, as a possible workaround, setting max_old_space_size, e.g. NODE_OPTIONS=--max_old_space_size=4096 (https://stackoverflow.com/questions/38558989/node-js-heap-out-of-memory), could help mitigate the issue.

You are probably already aware of and using something similar as a workaround, but I'm still leaving it here in case anyone new stumbles onto this issue.

sfc-gh-dszmolka avatar Jan 18 '24 08:01 sfc-gh-dszmolka

Seeking help @mx2323 @ghenkhaus @sfc-gh-dszmolka @sfc-gh-jfan and others. I tried out this sample code from https://docs.snowflake.com/en/developer-guide/sql-api/submitting-requests, but because my data set is 6-7 MB in size, it fails with the message `Request to S3/Blob failed`.

```javascript
// Load the Snowflake Node.js driver.
var snowflake = require('snowflake-sdk');

// Create a Connection object that we can use later to connect.
var connection = snowflake.createConnection({
    account: "MY_SF_ACCOUNT",
    database: "MY_DB",
    schema: "MY_SCHEMA",
    warehouse: "MY_WH",
    username: "MY_USER",
    password: "MY_PWD"
});

// Try to connect to Snowflake, and check whether the connection was successful.
connection.connect(
    function(err, conn) {
        if (err) {
            console.error('Unable to connect: ' + err.message);
        } else {
            console.log('Successfully connected to Snowflake.');
            // Optional: store the connection ID.
            connection_ID = conn.getId();
        }
    }
);

var statement = connection.execute({
  sqlText: "Select * from LargeDataSet limit 100",
  // sqlText: "Select * from LargeDataSet", // fails with "Request to S3/Blob failed"
  complete: function(err, stmt, rows) {
    if (err) {
      console.error('Failed to execute statement due to the following error: ' + err.message);
    } else {
      console.log('Successfully executed statement: ' + stmt.getSqlText());
    }
  }
});
```

We are observing this while upgrading the snowflake-sdk from 1.6.23 to ^1.9.0. Things work fine with versions 1.6.*, 1.7.0, and 1.8.0.

Is there a resolution for fetching large data sets?

bhaskarbanerjee avatar Feb 08 '24 14:02 bhaskarbanerjee

Hi @bhaskarbanerjee, the issue you're seeing is not related to the original one, which is a feature/improvement request for something that doesn't exist yet. Let's keep this issue for what it was originally intended for: tracking the original improvement request.

Since small result sets work for you and only bigger ones have problems fetching, I would suspect that the host you're running snowflake-sdk on cannot reach the Snowflake internal stage (= S3 bucket) on which the query results are temporarily stored.

To fix this, I recommend running select system$allowlist() in your Snowflake account (perhaps in the GUI) and double-confirming that all of the endpoints listed as STAGE are in fact reachable from the host on which you have this problem. You can even use SnowCD to perform an automated test.
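For reference, the allowlist function returns a JSON array of endpoint entries. A small sketch of filtering it down to the STAGE hosts worth testing (the payload below is a made-up example, not real account output; the exact fields are an assumption based on the documented `{type, host, port}` shape):

```javascript
// Filter a SYSTEM$ALLOWLIST-style JSON payload down to the STAGE
// endpoints whose reachability should be verified from the client host.
// The sample payload here is fabricated for illustration only.
const allowlist = JSON.parse(`[
  {"type": "SNOWFLAKE_DEPLOYMENT", "host": "myaccount.snowflakecomputing.com", "port": 443},
  {"type": "STAGE", "host": "sfc-example-stage.s3.amazonaws.com", "port": 443}
]`);

const stageHosts = allowlist
  .filter((entry) => entry.type === 'STAGE')
  .map((entry) => `${entry.host}:${entry.port}`);

console.log('STAGE endpoints to verify:', stageHosts);
```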

If you have confirmed that nothing blocks connectivity to the stage and it still doesn't work, kindly open a new issue here or open a Snowflake Support case and we can help further.

sfc-gh-dszmolka avatar Feb 08 '24 14:02 sfc-gh-dszmolka

Thanks @sfc-gh-dszmolka, let me try that. But if it is a server-side problem, then why do v1.6.*-1.8.0 work like a charm for a large data set of 6-7 MB? The EXACT same query is not working with ^1.9.0 of the SDK.

bhaskarbanerjee avatar Feb 08 '24 15:02 bhaskarbanerjee

Verified. We have 2 VPCs listed there and both are set to type= 'STAGE'. @sfc-gh-dszmolka

bhaskarbanerjee avatar Feb 08 '24 15:02 bhaskarbanerjee

Ran the SnowCD tool and got msg="Check clear. No error presented." for both of the VPCs listed there.

bhaskarbanerjee avatar Feb 08 '24 15:02 bhaskarbanerjee