Bug: excessive memory usage when logging a large volume of logs
Expected Behavior
logger.info (or any other logger call) should not cause excessive memory usage such as:
Max Memory Used: 3339 MB
This is in contrast to plain console.log, where we get:
Max Memory Used: 83 MB
Current Behavior
See the execution logs below.
When we run the simple code snippet below using the Powertools logger's logger.info, high memory usage is recorded:
Max Memory Used: 3339 MB
Code snippet
import { Logger } from '@aws-lambda-powertools/logger';

const serviceName = 'exampleService';
const logger = new Logger({ serviceName });

export const handler = async (event, context) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  logger.info(JSON.stringify(event));
  logger.info(JSON.stringify(context));
  logger.info('hello world');
  for (let i = 0; i < 100000; i++) {
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
  }
  return 'done';
};
Steps to Reproduce
- Create a Lambda function with the nodejs18.x runtime
- Install the latest @aws-lambda-powertools/logger and import it
- Run the entire handler code posted in the Code snippet section above in the Lambda function
Possible Solution
Could this be related to the internal buffer the Powertools logger sets up? There is a buffer before the logger instance is initialized, but I am unsure what happens during initialization.
Powertools for AWS Lambda (TypeScript) version
2.7.0
AWS Lambda function runtime
18.x
Packaging format used
npm
Execution logs
with powertools logger:
REPORT RequestId: <redacted> Duration: 29762.11 ms Billed Duration: 29763 ms Memory Size: 10240 MB Max Memory Used: 3339 MB
with console.log:
REPORT RequestId: <redacted> Duration: 21149.55 ms Billed Duration: 21150 ms Memory Size: 10240 MB Max Memory Used: 83 MB Init Duration: 200.98 ms
Hi @Dingway98, thank you for opening the issue.
Please allow us some time to triage the issue and try reproducing the behavior.
I'll get back to you here as soon as I have news to share.
To help narrow things down for you, I went back to version 0.2.0 (way back), and for the same function it produced a normal amount of memory usage: Max Memory Used: 125 MB.
But this happens in both the latest v1 and the latest v2, so something added in v1/pre-v1 is causing this.
Edit: it is somewhere between 0.7.2 (good) and 0.8.1 (bad).
Edit: From my testing, 0.8.1 is the culprit, introduced by PR https://github.com/aws-powertools/powertools-lambda-typescript/pull/748/files
This does not look like an issue in Powertools, but rather in the AWS image for Node.js; if we replace the logger with:
import { Console } from 'console';

const logger = new Console({
  stdout: process.stdout,
  stderr: process.stderr,
});
We still have the memory issue.
A quick fix for Powertools is to remove the reference to Console and use the global console object directly.
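As a sketch of that idea (not the actual Powertools internals), the change would be along these lines:
// Instead of creating a dedicated Console instance bound to process.stdout/stderr,
// reuse the global console object that the Lambda runtime has already patched:
const logger = console;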
Edit: I tried running that sample code on Node.js locally and didn't see any memory issue. I also took a look at the Node.js Lambda runtime; it only applies a simple log patch and doesn't modify or use the Console object.
Edit 2: On Node.js 20.x we also see the memory issue.
I did some investigation and it looks like the issue is related to stdout and stderr not being synchronous, and the code not giving the event loop the time it needs to flush the logs.
A heap snapshot showed that WritableState was holding all the memory.
This data builds up because writes via Console and stdout are non-blocking: the logs are buffered until the next tick, when they are flushed.
If you add something like await new Promise(resolve => setTimeout(resolve, 2)); just to give Node.js some time to flush the logs, you will not see the memory issue.
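For illustration, here is a minimal sketch of that workaround applied to the loop in the original handler (the yield interval and the 2 ms delay are arbitrary assumptions, not a recommendation):
for (let i = 0; i < 100000; i++) {
  logger.info('hello world', { largeObject });
  // Periodically yield back to the event loop so the buffered stdout writes can drain
  if (i % 1000 === 0) {
    await new Promise((resolve) => setTimeout(resolve, 2));
  }
}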
Also, you can make stdout and stderr synchronous by adding the following code:
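// Note: _handle is an internal, undocumented Node.js property; setBlocking(true) makes the underlying writes synchronous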
if (process.stdout._handle) process.stdout._handle.setBlocking(true);
if (process.stderr._handle) process.stderr._handle.setBlocking(true);
Anyway, feel free to choose one of these solutions; I can open a PR once you decide how you want to proceed.
Hi @H4ad, thank you SO much for this investigation and for also providing recommendations!
I've been trying to understand the implications of setting process.stdout._handle to blocking. From the little info I could find in other discussions, the consensus seems fairly split, and in many cases leans towards it having a detrimental impact on performance:
- https://github.com/nodejs/node/issues/11568
- https://github.com/nodejs/node/issues/6379
While I can verify that adding these two lines drops the memory usage to levels similar to regular console.log, I'd like to spend a bit more time investigating to understand the root cause and solution better.
To try to isolate the issue, I created two new test cases, neither of which uses the Powertools for AWS Lambda Logger, and both of which still reproduce the issue. The main difference in these tests is that we stringify the log entry similarly to how Logger would.
In sample 1, I am using the patched console.log as it appears in the Node.js RIC here:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  console.log(JSON.stringify(event));
  console.log(JSON.stringify(context));
  console.log('hello world');
  for (let i = 0; i < 100000; i++) {
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level: 'INFO',
      service: serviceName,
      message: 'hello world',
      ...largeObject,
    });
    console.log(line);
  }
  return 'done';
};
In sample 2, I am instead printing directly to process.stdout:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  for (let i = 0; i < 100000; i++) {
    let line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level: 'INFO',
      service: serviceName,
      message: 'hello world',
      ...largeObject,
    });
    line = line.replace(/\n/g, '\r');
    process.stdout.write(`${line}\n`);
  }
  return 'done';
};
I then tested both:
- sample 1 (console.log) - Duration: 2308.49 ms Billed Duration: 2309 ms Memory Size: 10240 MB Max Memory Used: 77 MB Init Duration: 203.04 ms
- sample 2 (process.stdout.write) - Duration: 1585.19 ms Billed Duration: 1586 ms Memory Size: 10240 MB Max Memory Used: 374 MB Init Duration: 213.40 ms
Both these tests show lower memory usage and shorter durations than the ones in the OP because I reduced the number of logs printed to 1/7th; however, we can still see that when using process.stdout.write the memory usage is roughly five times higher than when using console.log.
Setting process.stdout._handle to blocking in the second sample removes the excess memory usage and brings it down to a level similar to the console.log sample.
> This data builds up because writes via Console and stdout are non-blocking: the logs are buffered until the next tick, when they are flushed.
This makes sense, but then why is this line not affected, since the patched console.log also writes to stdout?
I'm going to reach out to the Lambda team to see if they can look into this and explain why we're seeing this behavior.
> This makes sense, but then why is this line not affected, since the patched console.log also writes to stdout?
Good question. I read the Node.js source code and didn't find a difference between using Console, process.stdout.write, or console.log.
This difference seems to be something particular to AWS; maybe the implementation is patched again in another part of the code.
Hi @H4ad! Thanks a lot for looking into this issue and giving us some direction to fix it. We are working with Lambda engineers to understand the root cause, because, as you mentioned, it seems like something very specific to the Node.js build embedded in the Lambda Node image.
You're the man! ❤️
Hello,
We're still waiting on the internal team from AWS Lambda to provide a path forward within the runtime. They are aware of the issue and have already started looking into it.
We are in close contact with them and will share more info here as soon as we can.
Any updates on this @dreamorosi?
Hello, no update yet; the team's been busy with new runtime launches and re:Invent.
I'll try following up with the engineers most likely after next week.
Makes sense, thanks!
Reached out to the Lambda team again, no significant update yet besides the fact that this is on their backlog.
I'm considering enabling the workaround/fix proposed above by default but I am not sure if it'll have a negative impact.
Basically in this method, add these two lines:
private setConsole(): void {
  if (!this.getEnvVarsService().isDevMode()) {
    this.console = new Console({
      stdout: process.stdout,
      stderr: process.stderr,
    });
+   if (process.stdout._handle) process.stdout._handle.setBlocking(true);
+   if (process.stderr._handle) process.stderr._handle.setBlocking(true);
  } else {
    this.console = console;
  }
  /**
   * Patch `console.trace` to avoid printing a stack trace and aligning with AWS Lambda behavior - see #2902
   */
  this.console.trace = (message: string, ...optionalParams: unknown[]) => {
    this.console.log(message, ...optionalParams);
  };
}
Based on my tests this drops the memory usage to the same amount of using the patched console.log.
Thanks for following up @dreamorosi! I'd be willing to test it or patch our packages with this change if you want some extra confidence.
I have been testing the mitigation above and while it works in my limited testing, I am not sure it's the right path forward.
Ultimately, setting the stdout operation to blocking is too risky to roll out to customers. I am not 100% sure of the implications of doing this, but based on the reading I've done, depending on the frequency and size of the logging it could cause backpressure in the event loop and even end up blocking/pausing the execution until the buffer drains, which is not good.
Additionally, the _handle property we'd be using appears to be an internal implementation detail (i.e. not part of the public API), which makes adopting it risky, since there's no guarantee that it'll behave the same or even still exist in future runtime updates.
I am still advocating for the issue to be fixed in the runtime or for a more stable alternative.
As a follow-up from the Lambda runtime team, I was informed that my initial assessment of how the Node.js managed runtime works was incorrect, and that logs are emitted using something like writeFileSync(1, line + '\n', { flag: 'a' }); when using the globally patched console.log.
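For illustration only, a console patch built on that idea might look roughly like this (this is an assumption about the shape of such a patch, not the actual runtime code; it ignores additional arguments for brevity):
import { writeFileSync } from 'node:fs';

// Hypothetical sketch: route console.info through a synchronous, appending write to fd 1 (stdout)
console.info = (message?: unknown): void => {
  writeFileSync(1, `${String(message)}\n`, { flag: 'a' });
};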
I did some tests and while this solves the issue, it seems to cause other kinds of back pressure errors when logging very frequently:
ERROR Invoke Error {"errorType":"Error","errorMessage":"EAGAIN: resource temporarily unavailable, write","code":"EAGAIN","errno":-11,"syscall":"write","stack":["Error: EAGAIN: resource temporarily unavailable, write"," at Object.writeSync (node:fs:924:3)"," at writeFileSync (node:fs:2446:26)"," at console.info (file:///var/task/index.mjs:4:23)"," at t.printLog (file:///var/task/index.mjs:9:6521)"," at t.processLogItem (file:///var/task/index.mjs:9:6975)"," at t.info (file:///var/task/index.mjs:9:2216)"," at Runtime.ii [as handler] (file:///var/task/index.mjs:9:11902)"," at Runtime.handleOnceNonStreaming (file:///var/runtime/index.mjs:1173:29)"]}
I'll need to investigate more, and will continue to follow up with the team.
Instead of setting the handle to blocking, you could implement a buffer and flush it synchronously every X bytes, while also scheduling a flush on the next tick (to ensure it won't pile up indefinitely).
This way you will not do a synchronous write on every log call, and you will also prevent running out of memory.
This is the strategy pino uses via https://github.com/pinojs/sonic-boom; a rough sketch of the idea follows below.
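For illustration, a minimal sketch of that strategy (the threshold, names, and writing to fd 1 are assumptions; this is not sonic-boom's actual implementation):
import { writeSync } from 'node:fs';

const MAX_BUFFER_BYTES = 16 * 1024; // illustrative flush threshold
let buffer = '';
let flushScheduled = false;

// Synchronously write whatever has accumulated to stdout (fd 1) and reset the buffer
const flush = (): void => {
  if (buffer.length === 0) return;
  writeSync(1, buffer);
  buffer = '';
};

const writeLog = (line: string): void => {
  buffer += `${line}\n`;
  if (buffer.length >= MAX_BUFFER_BYTES) {
    // Flush synchronously once the buffer grows past the threshold
    flush();
  } else if (!flushScheduled) {
    // Otherwise schedule a flush on the next tick so the buffer cannot pile up indefinitely
    flushScheduled = true;
    process.nextTick(() => {
      flushScheduled = false;
      flush();
    });
  }
};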