sentry-javascript
ref(node): refactor contextlines to use readline
Refactors the contextlines integration to use readline instead of reading the entire file into memory (we were also doubling the memory use, since we split the file via `.split('\n')`).
The current implementation does not yet leverage the cache to optimize file reads, so I need to reintroduce that. Something to note is that the caching behavior will become per frame location rather than per file, which I think makes more sense: we can only assume that this particular stack frame might throw again in the future (as opposed to anything else in the file it is in).
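A minimal sketch of the streaming approach (not the exact PR code; `getContextLines` and the constants are illustrative names):

```ts
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

const DEFAULT_LINES_OF_CONTEXT = 7;

/**
 * Streams `path` line by line and returns the lines surrounding `lineno`
 * (1-based), without ever holding the whole file in memory.
 */
function getContextLines(
  path: string,
  lineno: number,
  context: number = DEFAULT_LINES_OF_CONTEXT,
): Promise<string[]> {
  const start = Math.max(1, lineno - context);
  const end = lineno + context;

  return new Promise((resolve, reject) => {
    const stream = createReadStream(path);
    const rl = createInterface({ input: stream, crlfDelay: Infinity });
    const lines: string[] = [];
    let current = 0;

    stream.on('error', reject);
    rl.on('line', line => {
      current++;
      if (current < start) return; // skip lines before the window
      if (current <= end) lines.push(line);
      if (current >= end) rl.close(); // stop reading once we have the window
    });
    rl.on('close', () => {
      stream.destroy();
      resolve(lines);
    });
  });
}
```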
size-limit report 📦
| Path | Size |
|---|---|
| @sentry/browser | 22.03 KB (0%) |
| @sentry/browser (incl. Tracing) | 33.2 KB (0%) |
| @sentry/browser (incl. Tracing, Replay) | 68.94 KB (0%) |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 62.27 KB (0%) |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 73.01 KB (0%) |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 85.17 KB (0%) |
| @sentry/browser (incl. Tracing, Replay, Feedback, metrics) | 86.99 KB (0%) |
| @sentry/browser (incl. metrics) | 26.21 KB (0%) |
| @sentry/browser (incl. Feedback) | 38.22 KB (0%) |
| @sentry/browser (incl. sendFeedback) | 26.62 KB (0%) |
| @sentry/browser (incl. FeedbackAsync) | 31.17 KB (0%) |
| @sentry/react | 24.8 KB (0%) |
| @sentry/react (incl. Tracing) | 36.28 KB (0%) |
| @sentry/vue | 26.03 KB (0%) |
| @sentry/vue (incl. Tracing) | 35.06 KB (0%) |
| @sentry/svelte | 22.17 KB (0%) |
| CDN Bundle | 23.36 KB (0%) |
| CDN Bundle (incl. Tracing) | 34.89 KB (0%) |
| CDN Bundle (incl. Tracing, Replay) | 68.99 KB (0%) |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 74.16 KB (0%) |
| CDN Bundle - uncompressed | 68.68 KB (0%) |
| CDN Bundle (incl. Tracing) - uncompressed | 103.29 KB (0%) |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 213.74 KB (0%) |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 226.2 KB (0%) |
| @sentry/nextjs (client) | 35.61 KB (0%) |
| @sentry/sveltekit (client) | 33.83 KB (0%) |
| @sentry/node | 112.67 KB (+0.57% 🔺) |
| @sentry/node - without tracing | 90.08 KB (+0.64% 🔺) |
| @sentry/aws-serverless | 99.16 KB (+0.6% 🔺) |
It seems that `createReadStream` error handling was only backported to Node 16 and not Node 14, which would explain the unit test failure on 14:
https://github.com/nodejs/node/pull/31603
I am tempted to add an `fs.stat` call here to check if the file exists, but it feels like a very poor fix...
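A hedged sketch of the alternative to an up-front `fs.stat`: attach an `'error'` handler to the stream ourselves so a missing file resolves to "no context lines" instead of surfacing an unhandled `'error'` event (names here are illustrative, not PR code):

```ts
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

function readFileLines(path: string): Promise<string[]> {
  return new Promise(resolve => {
    const lines: string[] = [];
    const stream = createReadStream(path);
    // Any open/read failure (ENOENT, EACCES, ...) degrades gracefully
    // rather than crashing the process with an unhandled 'error' event.
    stream.on('error', () => resolve([]));
    const rl = createInterface({ input: stream, crlfDelay: Infinity });
    rl.on('line', line => lines.push(line));
    rl.on('close', () => resolve(lines));
  });
}
```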
If anyone wants to give this another look, feel free to. I still need to run benchmarks and confirm the tradeoff before merging (sometime this week).
It just occurred to me that we should store the result of `snip(line)` in our cache and avoid re-doing the trimming on every cache read.
I'll run benchmarks after that change
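Something like this, caching post-`snip` lines keyed by frame location (a sketch; `snip`, the key shape, and the length cap are hypothetical stand-ins, and a real implementation would bound the cache, e.g. with an LRU):

```ts
const MAX_SNIPPED_LENGTH = 140;

/** Trim overly long (likely minified) lines once, before caching. */
function snip(line: string): string {
  return line.length <= MAX_SNIPPED_LENGTH
    ? line
    : `${line.slice(0, MAX_SNIPPED_LENGTH)} {snip}`;
}

const cache = new Map<string, string[]>(); // key: `${path}:${lineno}`

function getCachedContext(path: string, lineno: number, rawLines: string[]): string[] {
  const key = `${path}:${lineno}`;
  let snipped = cache.get(key);
  if (!snipped) {
    snipped = rawLines.map(snip); // pay the trimming cost exactly once
    cache.set(key, snipped);
  }
  return snipped;
}
```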
Seems like develop is broken - the failing span tests are unrelated to this change.
@AbhiPrasad on a side note, I think we should add some simple heuristic where we short circuit adding context frames past a certain line or column number. For example, a colno past a few hundred very likely means minified code, and a lineno past a few thousand likely means a bundled file.
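A sketch of that heuristic (the thresholds are illustrative, not values from the PR):

```ts
const MAX_CONTEXT_LINENO = 10_000; // past this, the file is likely bundled
const MAX_CONTEXT_COLNO = 1_000;   // past this, the line is likely minified

function shouldAddContextLines(lineno?: number, colno?: number): boolean {
  if (lineno !== undefined && lineno > MAX_CONTEXT_LINENO) return false;
  if (colno !== undefined && colno > MAX_CONTEXT_COLNO) return false;
  return true;
}
```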
Yeah shoot - let's tackle this in the morning cc @mydea I think your changes in https://github.com/getsentry/sentry-javascript/pull/12477 are failing on develop.
> I think we should add some simple heuristic where we short circuit adding context frames past a certain line or column number
I wonder if we can get avg/p75/p95 of line/col length by analyzing Sentry errors, and then decide on a number based on that.
@AbhiPrasad I just ran some CPU benchmarks between 1201eb2d7 and 2bf1b24ce.
The large file was roughly 64KB (~4k lines), while the medium one was only 500 lines (the others were just simple init-the-SDK-and-throw cases). In the medium and large file benchmarks, the error was thrown from the last few lines to force the worst-case scenario where we need to scan the entire file. As is, there is no limit on the upper bound, but I think it'd be reasonable to set one (even if it's something like 10k lines).
The benchmarks were run using hyperfine with 3 warmup cycles and a minimum of 40 runs per benchmark.
$ hyperfine --warmup 3 -m 40 'node error.js' 'node multiple_errors.js' 'node medium_file.js' 'node medium_file_multiple.js' 'node large_file.js' 'node large_file_multiple.js'
develop (`1201eb2d7`)
Benchmark 1: node error.js
Time (mean ± σ): 235.8 ms ± 9.5 ms [User: 224.0 ms, System: 54.8 ms]
Range (min … max): 221.2 ms … 274.3 ms 40 runs
Benchmark 2: node multiple_errors.js
Time (mean ± σ): 250.4 ms ± 12.0 ms [User: 232.6 ms, System: 56.1 ms]
Range (min … max): 232.6 ms … 291.5 ms 40 runs
Benchmark 3: node medium_file.js
Time (mean ± σ): 235.9 ms ± 8.6 ms [User: 226.9 ms, System: 55.6 ms]
Range (min … max): 219.0 ms … 262.3 ms 40 runs
Benchmark 4: node medium_file_multiple.js
Time (mean ± σ): 250.2 ms ± 4.1 ms [User: 227.3 ms, System: 53.9 ms]
Range (min … max): 245.1 ms … 261.6 ms 40 runs
Benchmark 5: node large_file.js
Time (mean ± σ): 234.1 ms ± 9.3 ms [User: 221.6 ms, System: 53.5 ms]
Range (min … max): 227.1 ms … 286.6 ms 40 runs
Benchmark 6: node large_file_multiple.js
Time (mean ± σ): 248.9 ms ± 3.3 ms [User: 227.0 ms, System: 53.7 ms]
Range (min … max): 239.2 ms … 256.1 ms 40 runs
PR (`2bf1b24ce`)
Benchmark 1: node error.js
Time (mean ± σ): 235.5 ms ± 7.9 ms [User: 223.3 ms, System: 53.6 ms]
Range (min … max): 223.7 ms … 267.4 ms 40 runs
Benchmark 2: node multiple_errors.js
Time (mean ± σ): 254.8 ms ± 7.6 ms [User: 231.0 ms, System: 54.7 ms]
Range (min … max): 239.9 ms … 286.2 ms 40 runs
Benchmark 3: node medium_file.js
Time (mean ± σ): 241.2 ms ± 18.3 ms [User: 226.5 ms, System: 54.8 ms]
Range (min … max): 228.6 ms … 340.9 ms 40 runs
Benchmark 4: node medium_file_multiple.js
Time (mean ± σ): 254.7 ms ± 8.1 ms [User: 230.2 ms, System: 54.7 ms]
Range (min … max): 245.3 ms … 290.1 ms 40 runs
Benchmark 5: node large_file.js
Time (mean ± σ): 236.6 ms ± 7.7 ms [User: 225.3 ms, System: 53.8 ms]
Range (min … max): 221.9 ms … 271.2 ms 40 runs
Benchmark 6: node large_file_multiple.js
Time (mean ± σ): 251.9 ms ± 6.7 ms [User: 230.7 ms, System: 54.4 ms]
Range (min … max): 237.3 ms … 283.5 ms 40 runs
Something to note is that in a lot of the benchmarks (both develop and this PR) hyperfine warned about statistical outliers - you can see that the ranges for each benchmark vary quite a bit and seem to have no correlation with the size of the file we are reading. My takeaway is that either my benchmarks are wrong, or the readline approach is not significantly worse in CPU usage than mapping the entire file into memory, while giving us reduced memory usage.
I did try to measure memory usage, but it seemed to vary only within a few megabytes, which makes it similarly unreliable. Maybe one last thing to test is to throw errors from a few large files and see how that impacts memory usage, wdyt?
I also think that we should set some reasonably high max line number (maybe 10k) to avoid cases where we might be reading a very large bundled file.
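One way to make the memory comparison less noisy (a rough sketch; the interval and duration are arbitrary): sample RSS around the scenario and compare peaks rather than eyeballing totals.

```ts
// Sample resident set size every 10ms while the scenario runs.
const samples: number[] = [];
const timer = setInterval(() => {
  samples.push(process.memoryUsage().rss);
}, 10);

// ... run the throw-from-large-file scenario here ...

setTimeout(() => {
  clearInterval(timer);
  console.log('peak rss (MB):', Math.max(...samples) / 1024 / 1024);
}, 2_000);
```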
These benchmark results make sense to me, because I wouldn't have expected the readline approach to be considerably faster, especially when the errors are towards the end of the file (i.e. the worst case).
Do the v8 debugger "Record Allocation profile" results look considerably different between this branch and develop?
I've never looked into the Node readline code before. But now that I'm looking at the API, I'm doubtful that in the worst-case scenario it ends up allocating less than reading the entire file into a string and splitting it. Since there's no way to pass string slice references in JavaScript, every `line` event will require a new string allocation. Most of these will be wasted allocations, since we skip all lines up until the first line we're looking for.
To make it faster and allocate less, you'd need to skip to the first required line through the raw stream and only then slice the buffer and allocate strings for the required lines.
Yeah exactly, though the goal here is to reduce peak memory usage while not impacting CPU time. I would expect the fs approach to spike while allocating the entire file, whereas readline performs smaller allocations - somewhat similar to what is described in https://betterprogramming.pub/a-memory-friendly-way-of-reading-files-in-node-js-a45ad0cc7bb6, which also suggests that `fs.read` could be the next option if we want to optimize the individual string allocations. The worst case here is that we end up reading a large minified file; then none of this applies and we just map the entire string anyway (we could avoid that with some sensible max colno, past which we give up on reading such files).
> the goal here is to reduce peak memory usage while not impacting CPU time
This PR has already achieved that by using `createReadStream`, which reads in chunks rather than the entire file and re-uses the buffers.
Using `readline` like this does allocate new strings for every line in the file from 0 to `rangeEnd`, but this is still better than `string.split()` in everything other than the worst case.
I think that's enough to merge this PR!
To reduce the allocations in the worst cases, we'd want a `readline` implementation where the `line` event gets passed a `Buffer` rather than a string. The Buffer could be created via `Buffer.subarray` of the read stream's chunk, which is just a memory reference + length and has no allocation cost. At that point we could check the size of the line and do other sanity checks, and only then allocate into a string if we need that line and are happy with the "cost".
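A standalone sketch of that idea (not the `readline` API; names and chunk handling are illustrative): scan raw chunks for newlines, hand out zero-copy `subarray` views, and decode only the lines inside the requested range.

```ts
import { createReadStream } from 'node:fs';

const NEWLINE = 0x0a;

async function getLinesInRange(path: string, start: number, end: number): Promise<string[]> {
  const wanted: string[] = [];
  let leftover = Buffer.alloc(0);
  let lineno = 0;

  for await (const chunk of createReadStream(path)) {
    // Only partial lines spanning a chunk boundary pay for a concat.
    const buf: Buffer = leftover.length
      ? Buffer.concat([leftover, chunk as Buffer])
      : (chunk as Buffer);
    let offset = 0;
    let nl: number;
    while ((nl = buf.indexOf(NEWLINE, offset)) !== -1) {
      lineno++;
      if (lineno >= start && lineno <= end) {
        // subarray is a view (no allocation); toString allocates only here,
        // and only for lines inside the requested range.
        wanted.push(buf.subarray(offset, nl).toString('utf8'));
      }
      offset = nl + 1;
      if (lineno >= end) return wanted; // stop reading the rest of the file
    }
    leftover = buf.subarray(offset); // keep the partial trailing line
  }
  // File ended without a trailing newline.
  lineno++;
  if (lineno >= start && lineno <= end && leftover.length) {
    wanted.push(leftover.toString('utf8'));
  }
  return wanted;
}
```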
You nailed it :) It would be super interesting to see how well that performs (though I won't be tackling this anytime soon, I think). I'm going to try and see if we have any internal services that we can monitor once this PR goes out.
@JonasBa there is https://github.com/getsentry/chartcuterie/ and https://github.com/getsentry/sentry-docs but we don't have the best infra metrics monitoring for them re: CPU/memory