gatsby-plugin-s3
gatsby-plugin-s3 copied to clipboard
Improve memory performance of redirects
For sites that have a large number of redirects, the existing
implementation will throw a RangeError: Maximum call stack size exceeded
error, as all the redirect jobs are first built into memory, then spread
out into the uploadQueue. This commit tweaks the flow to push the
redirect jobs into the queue using a forEach loop instead, resulting in
less memory pressure.
@jariz just wanted to check in and see if this could be merged. Our project is blocked by this, and we're forced to fork the package until it's resolved. Thanks!
I can see how in sites with tens of thousands of redirects, this might halve memory usage. But can you provide some concrete numbers around this? What memory usage were you seeing prior to this change, and what memory usage do you see afterwards?
Maximum call stack size exceeded
doesn't usually indicate an OOM error. It usually indicates some kind of infinite recursion in the code.
Can you share some instructions on how to reproduce this or even a minimal reproducible example?
I don't really see any reason not to merge this, but first I'd like to have a bit more of an understanding of the problem and confidence that this properly fixes it.
Sure- the site I was trying to deploy had 74,900 redirects defined in s3.redirectObjects.json
. The plugin immediately crashes when attempting to deploy this using Node 16.x. I don't know what the exact threshold is for when Maximum call stack size exceeded
is triggered, however, I can confirm the redirects deploy without issue after this change.
Hey, just checking in. Did you have any more questions on this? I'm currently forced to manually patch my node_modules with this change any time our site needs a full rebuild. Getting this merged in would help our workflows a ton. Cheers!
@JoshuaWalsh I've updated the commit message to remove the comment about memory pressure. However, I can confirm this change fixes the crash that occurs with >70k+ redirects. Can I please get your review on this? Thanks!
Hi @mattrigg9, I'm sorry, I know you've been waiting patiently on this one.
I spent some time today trying to reproduce the issue. I worked with the examples/with-redirects
project and in gatsby-node.js
I added the following:
for(let i = 0; i < 1000000; i++) {
actions.createRedirect({
fromPath: `/bulk/${i}`,
toPath: '/blog/1',
isPermanent: true
});
}
With this, I was able to reproduce the RangeError: Maximum call stack size exceeded
error. I then tried your fix and the error disappeared. Curiously, I was not able to reproduce the issue with only 100,000 redirects, despite the fact that you encounter it at ~70,000. I wonder if it's somehow related to the length of the paths in the redirect.
I'm still extremely curious as to why the RangeError happens (since we're not recursing), and I don't understand why using forEach and push fixes it, but it does fix it and so I will merge the change.
I'm very sorry for the lengthy delay on this one. Since it is such a weird issue I wanted to give it proper attention, but life has been busy.
@jariz Can you please do an npm release when you have time?
Thanks, Josh
Ah, I no longer have permission to manually merge to master, and the e2e tests are not working. @jariz Can you please merge this? (Or fix the tests, or un-protect master
?)
Thanks very much for diving into it Josh! If you couldn't repro it at 100k, then I suspect the path length must have an influence.
I wonder if spreading out the array first would inflate the overall call stack size. Either way, thanks for your attention on this, it will help my workflow a ton.
Hi @jariz, can you please help get this merged? Thanks,
@mattrigg9 Sorry for the lengthy delay on this. It's included in v0.4.0.