Bug: excessive memory usage when logging a large volume of logs
Expected Behavior
logger.info (or any other logger call) should not cause excessive memory usage such as:
Max Memory Used: 3339 MB
This is in contrast to plain console.log, where we get:
Max Memory Used: 83 MB
Current Behavior
See the execution logs below.
When we run the simple code snippet below using the Powertools logger's logger.info, high memory usage is recorded:
Max Memory Used: 3339 MB
Code snippet
import { Logger } from '@aws-lambda-powertools/logger';

const serviceName = 'exampleService';
const logger = new Logger({ serviceName });

export const handler = async (event, context) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  logger.info(JSON.stringify(event));
  logger.info(JSON.stringify(context));
  logger.info('hello world');
  for (let i = 0; i < 100000; i++) {
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    logger.info(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
    // console.log(`hello world`, { largeObject });
  }
  return 'done';
};
Steps to Reproduce
- Create a Lambda function with the nodejs18.x runtime
- Install the latest @aws-lambda-powertools/logger and import it
- Run the entire handler code posted in the Code snippet section above in the Lambda function
Possible Solution
Could this be related to the internal buffer the Powertools logger sets up? There is a buffer before the logger instance is initialized, but I am unsure what happens during initialization.
Powertools for AWS Lambda (TypeScript) version
2.7.0
AWS Lambda function runtime
18.x
Packaging format used
npm
Execution logs
with powertools logger:
REPORT RequestId: <redacted> Duration: 29762.11 ms Billed Duration: 29763 ms Memory Size: 10240 MB Max Memory Used: 3339 MB
with console.log:
REPORT RequestId: <redacted> Duration: 21149.55 ms Billed Duration: 21150 ms Memory Size: 10240 MB Max Memory Used: 83 MB Init Duration: 200.98 ms
Hi @Dingway98, thank you for opening the issue.
Please allow us some time to triage the issue and try reproducing the behavior.
I'll get back to you here as soon as I have news to share.
To help narrow things down for you, I went back to version 0.2.0 (way back), and for the same function it produced a normal amount of memory usage: Max Memory Used: 125 MB.
But this happens in both the latest v1 and the latest v2, so something added in v1/pre-v1 is causing this.
Edit: it is somewhere between 0.7.2 (good) and 0.8.1 (bad).
Edit: From my testing, 0.8.1 is the culprit, introduced by PR https://github.com/aws-powertools/powertools-lambda-typescript/pull/748/files
This does not look like an issue in Powertools, but rather in the AWS image for Node.js; if we replace the logger with:
import { Console } from 'console';

const logger = new Console({
  stdout: process.stdout,
  stderr: process.stderr,
});
We still have the memory issue.
A quick fix for Powertools is to remove the reference to Console and use the global console object directly.
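As a sketch of that idea (not the actual Powertools internals), the change would be along these lines:
// Instead of creating a dedicated Console instance bound to process.stdout/stderr,
// reuse the global console object that the Lambda runtime has already patched:
const logger = console;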
Edit: I tried running that sample code on Node.js locally and didn't see any memory issue. I also took a look at the Node.js Lambda runtime; it only applies a simple log patch and doesn't modify or use the Console object.
Edit 2: On Node.js 20.x we also see the memory issue.
I did some investigation and it looks like the issue is related to stdout and stderr not being synchronous, and the code not giving the event loop the time it needs to flush the logs.
A heap snapshot showed that WritableState was holding all the memory.
This data builds up because writes via Console and stdout are non-blocking: the logs are buffered until the next tick, when they are flushed.
If you add something like await new Promise(resolve => setTimeout(resolve, 2)); just to give Node.js some time to flush the logs, you will not see the memory issue.
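For illustration, here is a minimal sketch of that workaround applied to the loop in the original handler (the yield interval and the 2 ms delay are arbitrary assumptions, not a recommendation):
for (let i = 0; i < 100000; i++) {
  logger.info('hello world', { largeObject });
  // Periodically yield back to the event loop so the buffered stdout writes can drain
  if (i % 1000 === 0) {
    await new Promise((resolve) => setTimeout(resolve, 2));
  }
}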
Also, you can make stdout and stderr synchronous by adding the following code:
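// Note: _handle is an internal, undocumented Node.js property; setBlocking(true) makes the underlying writes synchronous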
if (process.stdout._handle) process.stdout._handle.setBlocking(true);
if (process.stderr._handle) process.stderr._handle.setBlocking(true);
Anyway, feel free to choose one of these solutions; I can open a PR once you decide how you want to proceed.
Hi @H4ad, thank you SO much for this investigation and for also providing recommendations!
I've been trying to understand the implications of setting process.stdout._handle to blocking. From the little info I could find in other discussions, the consensus seems fairly split, and in many cases leans towards it having a detrimental impact on performance:
- https://github.com/nodejs/node/issues/11568
- https://github.com/nodejs/node/issues/6379
While I can verify that adding these two lines drops the memory usage to levels similar to regular console.log, I'd like to spend a bit more time investigating to understand the root cause and solution better.
To try to isolate the issue, I created two new test cases, neither of which uses the Powertools for AWS Lambda Logger, and both of which still reproduce the issue. The main difference in these tests is that we stringify the log entry similarly to how Logger would.
In sample 1, I am using the patched console.log as it appears in the Node.js RIC here:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  console.log(JSON.stringify(event));
  console.log(JSON.stringify(context));
  console.log('hello world');
  for (let i = 0; i < 100000; i++) {
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level: 'INFO',
      service: serviceName,
      message: 'hello world',
      ...largeObject,
    });
    console.log(line);
  }
  return 'done';
};
In sample 2, I am instead printing directly to process.stdout:
const serviceName = 'exampleService';
export const handler = (event: unknown, context: unknown) => {
  const largeObject = {
    prop1: 'prop',
    prop2: 'prop',
    prop3: 'prop',
    prop4: 'prop',
    prop5: 'prop',
    prop6: 'prop',
    prop7: 'prop',
    prop8: 'prop',
    prop9: 'prop',
    prop10: 'prop',
    prop11: 'prop',
    prop12: 'prop',
    prop13: 'prop',
    prop14: 'prop',
    prop15: 'prop',
    prop16: 'prop',
    prop17: 'prop',
    prop18: 'prop',
    prop19: 'prop',
    prop20: 'prop',
    prop21: 'prop',
    prop22: 'prop',
    prop23: 'prop',
    prop24: 'prop',
    prop25: 'prop',
    prop26: 'prop',
    prop27: 'prop',
    prop28: 'prop',
  };
  for (let i = 0; i < 100000; i++) {
    let line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level: 'INFO',
      service: serviceName,
      message: 'hello world',
      ...largeObject,
    });
    line = line.replace(/\n/g, '\r');
    process.stdout.write(`${line}\n`);
  }
  return 'done';
};
I then tested both:
- sample 1 (console.log) - Duration: 2308.49 ms Billed Duration: 2309 ms Memory Size: 10240 MB Max Memory Used: 77 MB Init Duration: 203.04 ms
- sample 2 (process.stdout.write) - Duration: 1585.19 ms Billed Duration: 1586 ms Memory Size: 10240 MB Max Memory Used: 374 MB Init Duration: 213.40 ms
Both these tests show lower memory usage and shorter durations than the ones in the OP because I reduced the number of logs printed to 1/7th; however, we can still see that when using process.stdout.write the memory usage is roughly five times higher than when using console.log.
Setting process.stdout._handle to blocking in the second sample removes the excess memory usage and brings it down to a level similar to the console.log sample.
> This data builds up because writes via Console and stdout are non-blocking: the logs are buffered until the next tick, when they are flushed.
This makes sense, but then why is this line not affected, since the patched console.log also writes to stdout?
I'm going to reach out to the Lambda team to see if they can look into this and explain why we're seeing this behavior.
> This makes sense, but then why is this line not affected, since the patched console.log also writes to stdout?
Good question. I read the Node.js source code and didn't find a difference between using Console, process.stdout.write, or console.log.
This difference seems to be something particular to AWS; maybe the implementation is patched again in another part of the code.
Hi @H4ad! Thanks a lot for looking into this issue and giving us some direction to fix it. We are working with Lambda engineers to understand the root cause, because, as you mentioned, it seems like something very specific to the Node.js build embedded in the Lambda Node image.
You're the man! ❤️
Hello,
We're still waiting on the internal team from AWS Lambda to provide a path forward within the runtime. They are aware of the issue and have already started looking into it.
We are in close contact with them and will share more info here as soon as we can.
Any updates on this @dreamorosi?
Hello, no update yet; the team's been busy with new runtime launches and re:Invent.
I'll try following up with the engineers most likely after next week.
Makes sense, thanks!
Reached out to the Lambda team again, no significant update yet besides the fact that this is on their backlog.
I'm considering enabling the workaround/fix proposed above by default but I am not sure if it'll have a negative impact.
Basically in this method, add these two lines:
private setConsole(): void {
  if (!this.getEnvVarsService().isDevMode()) {
    this.console = new Console({
      stdout: process.stdout,
      stderr: process.stderr,
    });
+   if (process.stdout._handle) process.stdout._handle.setBlocking(true);
+   if (process.stderr._handle) process.stderr._handle.setBlocking(true);
  } else {
    this.console = console;
  }
  /**
   * Patch `console.trace` to avoid printing a stack trace and aligning with AWS Lambda behavior - see #2902
   */
  this.console.trace = (message: string, ...optionalParams: unknown[]) => {
    this.console.log(message, ...optionalParams);
  };
}
Based on my tests this drops the memory usage to the same amount of using the patched console.log.
Thanks for following up @dreamorosi! I'd be willing to test it or patch our packages with this change if you want some extra confidence.
I have been testing the mitigation above and while it works in my limited testing, I am not sure it's the right path forward.
Ultimately, setting the stdout operation to blocking is too risky to roll out to customers. I am not 100% sure of the implications of doing this, but based on the reading I've done, depending on the frequency and size of the logging it could cause backpressure in the event loop and even end up blocking/pausing the execution until the buffer drains, which is not good.
Additionally, the _handle property we'd be using appears to be an internal implementation detail (i.e. not part of the public API), which makes adopting it risky, since there's no guarantee that it'll behave the same or even still exist in future runtime updates.
I am still advocating for the issue to be fixed in the runtime or for a more stable alternative.
As a follow-up from the Lambda runtime team, I was informed that my initial assessment of how the Node.js managed runtime works was incorrect, and that logs are emitted using something like writeFileSync(1, line + '\n', { flag: 'a' }); when using the globally patched console.log.
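For illustration only, a console patch built on that idea might look roughly like this (this is an assumption about the shape of such a patch, not the actual runtime code; it ignores additional arguments for brevity):
import { writeFileSync } from 'node:fs';

// Hypothetical sketch: route console.info through a synchronous, appending write to fd 1 (stdout)
console.info = (message?: unknown): void => {
  writeFileSync(1, `${String(message)}\n`, { flag: 'a' });
};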
I did some tests and while this solves the issue, it seems to cause other kinds of back pressure errors when logging very frequently:
ERROR Invoke Error {"errorType":"Error","errorMessage":"EAGAIN: resource temporarily unavailable, write","code":"EAGAIN","errno":-11,"syscall":"write","stack":["Error: EAGAIN: resource temporarily unavailable, write"," at Object.writeSync (node:fs:924:3)"," at writeFileSync (node:fs:2446:26)"," at console.info (file:///var/task/index.mjs:4:23)"," at t.printLog (file:///var/task/index.mjs:9:6521)"," at t.processLogItem (file:///var/task/index.mjs:9:6975)"," at t.info (file:///var/task/index.mjs:9:2216)"," at Runtime.ii [as handler] (file:///var/task/index.mjs:9:11902)"," at Runtime.handleOnceNonStreaming (file:///var/runtime/index.mjs:1173:29)"]}
I'll need to investigate more, and will continue to follow up with the team.
Instead of setting the handle to blocking, you could implement a buffer and flush it synchronously every X bytes, while also scheduling a flush on the next tick (to ensure it won't pile up indefinitely).
This way you will not do a synchronous write on every log call, and you will also prevent running out of memory.
This is the strategy pino uses via https://github.com/pinojs/sonic-boom; a rough sketch of the idea follows below.
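For illustration, a minimal sketch of that strategy (the threshold, names, and writing to fd 1 are assumptions; this is not sonic-boom's actual implementation):
import { writeSync } from 'node:fs';

const MAX_BUFFER_BYTES = 16 * 1024; // illustrative flush threshold
let buffer = '';
let flushScheduled = false;

// Synchronously write whatever has accumulated to stdout (fd 1) and reset the buffer
const flush = (): void => {
  if (buffer.length === 0) return;
  writeSync(1, buffer);
  buffer = '';
};

const writeLog = (line: string): void => {
  buffer += `${line}\n`;
  if (buffer.length >= MAX_BUFFER_BYTES) {
    // Flush synchronously once the buffer grows past the threshold
    flush();
  } else if (!flushScheduled) {
    // Otherwise schedule a flush on the next tick so the buffer cannot pile up indefinitely
    flushScheduled = true;
    process.nextTick(() => {
      flushScheduled = false;
      flush();
    });
  }
};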