snowflake-connector-nodejs
Out of memory issue
I am running into an out-of-memory issue when trying to stream rows from a table with 3 million rows. Does the Snowflake stream support the highWaterMark and backpressure functionality of Node streams? It seems like it should take very little memory to stream 10 rows at a time from the db.
const snowflake = require('snowflake-sdk');
const config = require('config');
const through2 = require('through2');
async function test() {
var connection = snowflake.createConnection({
account: config.snowflake.account,
username: config.snowflake.username,
password: config.snowflake.password
});
const conn = await new Promise((resolve, reject) => {
connection.connect((err, c) => (err ? reject(err) : resolve(c)));
});
var statement = conn.execute({
sqlText: 'SELECT * FROM TABLE'
});
var stream = statement.streamRows();
stream.pipe(through2.obj({highWaterMark: 5}, function (chunk, enc, next) {
// console.log(chunk);
next();
}));
}
test();
Output
<--- Last few GCs --->
[5282:0x103801600] 119235 ms: Scavenge 1382.8 (1419.3) -> 1382.1 (1419.8) MB, 4.6 / 0.0 ms (average mu = 0.167, current mu = 0.068) allocation failure
[5282:0x103801600] 119244 ms: Scavenge 1383.2 (1419.8) -> 1382.5 (1420.3) MB, 4.3 / 0.0 ms (average mu = 0.167, current mu = 0.068) allocation failure
[5282:0x103801600] 120061 ms: Mark-sweep 1383.6 (1420.3) -> 1382.9 (1420.3) MB, 812.5 / 0.0 ms (average mu = 0.100, current mu = 0.030) allocation failure scavenge might not succeed
<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 0x23988115be3d]
1: StubFrame [pc: 0x2398811134b0]
Security context: 0x3459e929e6e1 <JSObject>
2: extractFromRow [0x345902014769] [/Users/s/git/riversjs/node_modules/snowflake-sdk/lib/connection/result/column.js:~709] [pc=0x23988116e1f3](this=0x345945be4eb9 <Column map = 0x34590b7387b1>,row=0x3459467bd621 <Object map = 0x3459b262f599>,context=0x3459f8ee0899 <Object map = 0x34590b74b3c9>,asString=0x3...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0x10003ae75 node::Abort() [/Users/s/.nvm/versions/node/v10.12.0/bin/node]
2: 0x10003b07f node::OnFatalError(char const*, char const*) [/Users/s/.nvm/versions/node/v10.12.0/bin/node]
3: 0x1001a6a85 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/Users/s/.nvm/versions/node/v10.12.0/bin/node]
4: 0x100574422 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/Users/s/.nvm/versions/node/v10.12.0/bin/node]
Needs investigation.
We actually encountered something similar several months ago. I did some fairly extensive testing at the time to figure out what the problem was, and it appears it was the size of the chunks being sent from Snowflake. I don't remember all of the details, but basically Snowflake returns a list of all of the available chunks to download, and as you stream the rows, the client downloads each sequential chunk. The problem is that the chunk size (which is determined by Snowflake, as far as I can tell) seems to more or less double until it hits some threshold. So what would happen is we'd end up with something like:
Chunk 1: 1mb
Chunk 2: 2mb
Chunk 3: 4mb
Chunk 4: 8mb
... and when the Snowflake client parses these larger chunks, the Node client eventually exhausts all of its memory. The way we got around it was to use result_scan and iterate over the original query with a smaller pagination size (in our case, batching by 1,000,000 rows didn't generate chunks large enough to cause any out-of-memory errors).
With that said, I don't know for certain if that's what's happening here, but the symptoms do seem to be very much related, so I wanted to post my observations here.
Hello @thegrumpysnail, it seems I have the same problem in my project. I would really appreciate a more verbose example with result_scan. Thanks in advance!
I am also facing similar issues. Is there any workaround for it? I would rather use streaming to access all the data instead of having to manually batch up the calls.
I'm also noticing unreasonably big memory usage when streaming.
I've tried to work around this using result_scan, but it didn't help, because to use result_scan you still need to execute the query first, and it seems like there is just no way to execute a query without fetching the results.
Even if you just do connection.execute({ sqlText: "SELECT * FROM LARGE_TABLE" }) without providing any callback, it still seems to fetch results under the hood (judging by the resource usage and OOM kills).
Am I missing something? How can I work around that?
@TimShilov - it's been ages, so the details are very fuzzy in my memory, but basically we had to initially generate a result using result_scan, and instead of streaming the results of the result_scan, we'd select an offset / limit from the result_scan using the query ID returned. So you're effectively doing what Snowflake is trying to do behind the scenes, except in a way that doesn't kill your service.
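For anyone looking for a more concrete shape of this workaround, here is a rough sketch. It is an illustration only, not code from this thread: the helper names, the page size, and the use of getQueryId() are assumptions (older connector versions expose getStatementId() instead).

// Rough sketch of the RESULT_SCAN pagination workaround described above.
// Assumes `connection` is an already-connected snowflake-sdk Connection.
function executeAsync(connection: any, sqlText: string): Promise<{ stmt: any; rows: any[] }> {
  return new Promise((resolve, reject) => {
    connection.execute({
      sqlText,
      complete: (err: any, stmt: any, rows: any[]) =>
        (err ? reject(err) : resolve({ stmt, rows })),
    });
  });
}

async function pageThroughResult(connection: any, sqlText: string, pageSize = 1000000) {
  // Run the original query once; its result stays addressable via RESULT_SCAN.
  const { stmt } = await executeAsync(connection, sqlText);
  const queryId = stmt.getQueryId(); // assumption: getStatementId() on older versions

  // Page through the cached result with LIMIT/OFFSET instead of streaming it all at once.
  for (let offset = 0; ; offset += pageSize) {
    const { rows } = await executeAsync(
      connection,
      `SELECT * FROM TABLE(RESULT_SCAN('${queryId}')) LIMIT ${pageSize} OFFSET ${offset}`
    );
    if (!rows || rows.length === 0) break;
    // ...process this batch of rows here...
    if (rows.length < pageSize) break;
  }
}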
@thegrumpysnail Thanks for the response. That's basically what I'm trying to achieve. What I don't understand is how to get the queryId in the first place?
I'm trying to:
1. Execute a heavy query and get its queryId (without fetching the result).
2. Iterate over the results of that query using RESULT_SCAN(<queryId>).
The problem is that step 1 fails due to high memory usage, because the results of the heavy query are still fetched (even though I don't need them).
We have also encountered this issue. I do not at the moment know very much about the conditions, except that it happened when we attempted to upgrade the version of snowflake-sdk
from 1.6.14 to 1.6.21. Since this is only known to have happened when running against a customer warehouse, it's not entirely simple to bisect (if I had a repro case against our own warehouse I'd bisect and identify which specific version upgrade introduced the problem).
<--- Last few GCs --->
[14:0x4f816d0] 1126668 ms: Mark-sweep (reduce) 4945.1 (5472.8) -> 4944.9 (5472.8) MB, 5738.1 / 0.5 ms (average mu = 0.160, current mu = 0.017) allocation failure GC in old space requested
[14:0x4f816d0] 1132076 ms: Mark-sweep (reduce) 4945.5 (5472.8) -> 4945.1 (5473.1) MB, 5343.8 / 0.5 ms (average mu = 0.094, current mu = 0.012) allocation failure GC in old space requested
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
1: 0xb06730 node::Abort() [/usr/local/bin/node]
2: 0xa1b6d0 [/usr/local/bin/node]
3: 0xce1e60 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
4: 0xce2207 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
5: 0xe99875 [/usr/local/bin/node]
6: 0xea953d v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
7: 0xeac23e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
8: 0x121bd3d v8::internal::Deserializer<v8::internal::Isolate>::Allocate(v8::internal::SnapshotSpace, int, v8::internal::AllocationAlignment) [/usr/local/bin/node]
9: 0x121ffa3 v8::internal::Deserializer<v8::internal::Isolate>::ReadObject(v8::internal::SnapshotSpace) [/usr/local/bin/node]
10: 0x121f58c int v8::internal::Deserializer<v8::internal::Isolate>::ReadSingleBytecodeData<v8::internal::SlotAccessorForHeapObject>(unsigned char, v8::internal::SlotAccessorForHeapObject) [/usr/local/bin/node]
11: 0x121fed1 v8::internal::Deserializer<v8::internal::Isolate>::ReadData(v8::internal::Handle<v8::internal::HeapObject>, int, int) [/usr/local/bin/node]
12: 0x122002e v8::internal::Deserializer<v8::internal::Isolate>::ReadObject(v8::internal::SnapshotSpace) [/usr/local/bin/node]
13: 0x121e960 v8::internal::Deserializer<v8::internal::Isolate>::ReadObject() [/usr/local/bin/node]
14: 0x1219218 v8::internal::ContextDeserializer::Deserialize(v8::internal::Isolate*, v8::internal::Handle<v8::internal::JSGlobalProxy>, v8::DeserializeInternalFieldsCallback) [/usr/local/bin/node]
15: 0x1219396 v8::internal::ContextDeserializer::DeserializeContext(v8::internal::Isolate*, v8::internal::SnapshotData const*, bool, v8::internal::Handle<v8::internal::JSGlobalProxy>, v8::DeserializeInternalFieldsCallback) [/usr/local/bin/node]
16: 0x123ea08 v8::internal::Snapshot::NewContextFromSnapshot(v8::internal::Isolate*, v8::internal::Handle<v8::internal::JSGlobalProxy>, unsigned long, v8::DeserializeInternalFieldsCallback) [/usr/local/bin/node]
17: 0xf4d705 v8::internal::Genesis::Genesis(v8::internal::Isolate*, v8::internal::MaybeHandle<v8::internal::JSGlobalProxy>, v8::Local<v8::ObjectTemplate>, unsigned long, v8::DeserializeInternalFieldsCallback, v8::MicrotaskQueue*) [/usr/local/bin/node]
18: 0xf4e538 v8::internal::Bootstrapper::CreateEnvironment(v8::internal::MaybeHandle<v8::internal::JSGlobalProxy>, v8::Local<v8::ObjectTemplate>, v8::ExtensionConfiguration*, unsigned long, v8::DeserializeInternalFieldsCallback, v8::MicrotaskQueue*) [/usr/local/bin/node]
19: 0xcef696 v8::NewContext(v8::Isolate*, v8::ExtensionConfiguration*, v8::MaybeLocal<v8::ObjectTemplate>, v8::MaybeLocal<v8::Value>, unsigned long, v8::DeserializeInternalFieldsCallback, v8::MicrotaskQueue*) [/usr/local/bin/node]
20: 0xcefb64 v8::Context::New(v8::Isolate*, v8::ExtensionConfiguration*, v8::MaybeLocal<v8::ObjectTemplate>, v8::MaybeLocal<v8::Value>, v8::DeserializeInternalFieldsCallback, v8::MicrotaskQueue*) [/usr/local/bin/node]
21: 0xaf68f1 node::contextify::ContextifyContext::CreateV8Context(node::Environment*, v8::Local<v8::Object>, node::contextify::ContextOptions const&) [/usr/local/bin/node]
22: 0xaf76d7 node::contextify::ContextifyContext::ContextifyContext(node::Environment*, v8::Local<v8::Object>, node::contextify::ContextOptions const&) [/usr/local/bin/node]
23: 0xaf7cd9 node::contextify::ContextifyContext::MakeContext(v8::FunctionCallbackInfo<v8::Value> const&) [/usr/local/bin/node]
24: 0x156ed0c [/usr/local/bin/node]
Child process exited with code null
Quick update: the team is looking into implementing the highWaterMark / backpressure functionality. Will link the PR once available.
It may be a separate issue, but I decided to leave it here because the symptoms are similar.
Since the update to version 1.6.21, the OOM problems got worse. In our case, processes started crashing more frequently even on relatively small responses. Just downgrading to 1.6.20 with no other changes to the code fixed the crashes.
I assume that the latest patch introduced some memory leak or something. 😩
cc @sfc-gh-dszmolka
There haven't been any changes implemented (yet) in the context of the request detailed in this issue #43 (highWaterMark / adding backpressure capability for result set streams).
Please open a new Issue for what you're experiencing. If you can, please add more details on the reproduction; that would surely help troubleshooting faster. Thank you in advance!
Quick update: the team is looking into implementing the highWaterMark / backpressure functionality. Will link the PR once available.
Is there any progress on this front?
There is. Work already started but got reprioritized due to other, more critical bugs, which are now fixed, so we are resuming work on this one. Hope to be able to provide an update by mid-June.
#505
We're also running into this issue. It would be nice to have some control over it.
The PR is now merged into main and will be part of the next release, 1.6.23. Will comment again once the release is available; expected towards the end of June.
Thank you all for bearing with us - the long-awaited improvement, the connector supporting backpressure functionality, is now out with release 1.6.23! Some important notes to add though; we'll amend the official documentation too, but until then, here they are:
- Most importantly, it can be toggled with streamResult in Connection, which is false by default. For any result set that could exceed node's default memory, it is highly recommended to set streamResult: true, especially if you already know you'll be streaming the results anyway. With the default setting false, the connector will attempt to store all the rows in an array before streaming the results. With smaller result sets this has never been an issue, but with larger result sets, storing all the results in memory can contribute to an OOM error.
- The backpressure mechanism is automatic and only takes effect when the stream has reached the highWaterMark threshold.
- Example code snippet of usage; please note the streamResult in Connection:
var connection = snowflake.createConnection({
account: process.env.SFACCOUNT,
username: process.env.SFUSER,
..
streamResult: true
});
[..rest of the code..]
connection.execute({
sqlText: " select L_COMMENT from SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.LINEITEM limit 100000000;",
streamResult: true,
complete: function (err, stmt)
{
var stream = stmt.streamRows();
stream.on('readable', function () // Read data from the stream when it is available
{
let row;
// Before the change, the amount of data in the stream is possibly greater than the highWaterMark threshold
// After the change, the amount of data in the stream will be lesser or equal than the threshold
while ((row = this.read()) !== null)
{
console.log(row);
}
}).on('end', function ()
{
console.log('done');
}).on('error', function (err)
{
console.log(err);
});
}
});
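To make the automatic backpressure visible, here is an additional sketch (not from the official docs; it reuses the streamResult-enabled connection from the snippet above, with a placeholder query and an arbitrary highWaterMark) that pipes the row stream into a deliberately slow Writable:

import { pipeline, Writable } from 'stream';

// A deliberately slow consumer: an objectMode Writable with a small buffer,
// so the connector's Readable is paused whenever the sink's highWaterMark is hit.
const slowSink = new Writable({
  objectMode: true,
  highWaterMark: 16, // keep at most ~16 rows buffered in the sink
  write(row, _encoding, callback) {
    setTimeout(() => callback(), 10); // simulate slow per-row processing
  },
});

connection.execute({
  sqlText: 'SELECT * FROM SOME_LARGE_TABLE', // placeholder query
  streamResult: true,
  complete: function (err: any, stmt: any) {
    if (err) throw err;
    pipeline(stmt.streamRows(), slowSink, (pipelineErr) => {
      console.log(pipelineErr ? pipelineErr : 'done');
    });
  },
});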
I have attempted to use this as per below. Unfortunately, around the 49,000th record I still get JavaScript heap out of memory.
snowflakeInteractionsStream
.on(
'readable',
function ()
{
let row;
while ((row = this.read()) !== null) {
count++;
if (count % 500 === 0) {
console.log(`Rows ${count}`);
}
}
},
)
.on('end', function () {
console.log(`The End`);
resolve();
})
.on('error', function (err) {
console.log(err);
});
Can you please include:
- a full runnable script which leads to the issue, with all the configuration you're using
- the table definition
- the actual dataset, or a mock dataset which represents the data in the table
Asking this to confirm whether you're still hitting the same issue which was fixed by introducing backpressure and streamResult: true, or whether you're perhaps hitting https://github.com/snowflakedb/snowflake-connector-nodejs/issues/220
If you don't wish to share it here, please open a case with Snowflake Support and mention this Issue; you can then work 1:1 with a support engineer rather than doing it in public.
@sfc-gh-dszmolka
Here is the full runnable script. First there is a function which returns the readable stream as the result of a promise; a query is then made. The tabular structure is limited in terms of the number of columns, so I do not think it's linked to #220.
I cannot supply the full results, but I can supply a sample; the total number of records is around 500,000.
USER_ID,EVENT_TYPE,EVENT_ID,EVENT_DATE,USERNAME,METADATA
4b116319-efcf-4674-93c6-4ed16524b331,New account,new-account-876014695332-823034494,2019-04-29,user001,"{
""account_no"": ""01220304055"",
""ccy"": ""EUR"",
""scheme_code"": ""CRA123""
}"
02786cf3-0d4e-43b1-b202-ba85c1bb9291,New account,new-account-876014695433-823034495,2019-04-29,user002,"{
""account_no"": ""998989898989"",
""ccy"": ""EUR"",
""scheme_code"": ""CRA123""
}"
bbf13761-8fa5-47e4-98b6-6f2b436f84f6,New account,new-account-876014695534-823034496,2019-04-28,user003,"{
""account_no"": ""8786868686868"",
""ccy"": ""EUR"",
""scheme_code"": ""CRA123""
}"
const getStatementStream = function (query): Promise<internal.Readable> {
return new Promise((resolve, reject) => {
const statement = this.connection.execute({
sqlText: query,
streamResult: true,
complete: function (err, statement: Statement) {
if (err) {
reject(err);
} else {
const stream = statement.streamRows();
resolve(stream);
}
},
});
});
}
const snowflakeInteractionsStream = getStatementStream(`SELECT
user_id,
event_type,
event_id,
event_date,
username,
OBJECT_CONSTRUCT(
'scheme_code', code,
'account_no', account_id,
'ccy', currency_code
) as metadata
FROM accounts_table`);
snowflakeInteractionsStream
.on(
'readable',
function () // Read data from the stream when it is available
{
let row;
while ((row = this.read()) !== null) {
count++;
if (count % 500 === 0) {
console.log(`new ${interaction} interactions: ${count}`);
}
}
},
)
.on('end', function () {
console.log(`finished ${interaction} interactions: ${count}`);
resolve();
})
.on('error', function (err) {
console.log(err);
reject();
});
Thank you for providing the snippet and the rest of the details! I managed to reproduce the issue and realized that:
- in my above comment I forgot to include the necessary configuration on Connection; this is now included in the example.
- even with streamResult set to true on the Connection, the execution still breaks with JavaScript heap out of memory, until I reverted #465:
....
//ret = betterEval("(" + rawColumnValue + ")");
ret = eval("(" + rawColumnValue + ")");
So using the exact same setup for this reproduction and test data (around 9 million rows generated from the 3 rows in your example), with or without streamResult set to true on the Connection, it now works and doesn't break.
So I believe what you're seeing now is more likely connected to issues brought in by #465, for which we have multiple Issues open (#528, #539).
Thanks for the quick update. Something small I noticed is that @types/snowflake-sdk has not been updated as well to reflect the connection settings change.
I can confirm that with betterEval reverted it does not go out of memory for the above example. I have also tried a more complex setup in which it sinks into a local DB, and it works flawlessly.
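Until the typings catch up, one possible local stopgap (just a sketch; it assumes @types/snowflake-sdk exports a ConnectionOptions interface, so check the actual exported name in your installed typings before relying on this) is a module augmentation in your own .d.ts file:

import 'snowflake-sdk';

// Assumption: the installed @types/snowflake-sdk exposes a ConnectionOptions
// interface used by snowflake.createConnection(); merge the missing flag into it.
declare module 'snowflake-sdk' {
  interface ConnectionOptions {
    streamResult?: boolean;
  }
}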
Can confirm, @types/snowflake-sdk is not managed by Snowflake at this moment.
For others in the same boat as me, it's worth noting that before you can rely on the streaming OOM fixes in 1.6.23, you may want to wait on:
- https://github.com/snowflakedb/snowflake-connector-nodejs/issues/528