package-feeds

Large NPM package response data causes timeout

Open maxfisher-g opened this issue 1 year ago • 5 comments

After the fixes for #139, the remaining packages that still cause timeouts all have very large response bodies (e.g. 6 MB).

maxfisher-g avatar May 10 '23 00:05 maxfisher-g

Go appears to enable compression (gzip) by default on requests.

calebbrown avatar May 10 '23 00:05 calebbrown

Looking into this further, NPM supports HTTP/2. I suspect that there is some weird behavior w.r.t. timeouts and HTTP/2 multiplexing.

calebbrown avatar May 10 '23 07:05 calebbrown

Inspecting the network traffic, there is only 1 connection being opened to https://registry.npmjs.org/, so yes, this is multiplexing queries over a single TCP+TLS connection to NPM.

This means that the large repos are congesting the single multiplexed HTTP/2 connection. While the aggregate of the responses will be received faster than over individual HTTP/1.1 connections, each individual response is slower than it would be if it were not multiplexed.

Some ideas for improving the performance:

  • remove the limited workers, and just fire all the requests concurrently.
  • add a small sleep (e.g. 0.1s) between each request to give the server some space.
  • increase the timeout significantly (e.g. 1m30s) to allow for the slower responses.

calebbrown avatar May 10 '23 21:05 calebbrown

Thanks for investigating this @calebbrown!

maxfisher-g avatar May 11 '23 05:05 maxfisher-g

We still have errors here. The next attempt to reduce them will be to add an LRU cache and ETag ("If-None-Match") checking on requests to NPM for a given package.
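
A minimal sketch of what that could look like, assuming the registry returns an `ETag` header and answers `304 Not Modified` to a matching `If-None-Match`. The types and helpers (`etagCache`, `entry`, `newStubRegistry`) are hypothetical, and the example runs against a local stand-in server. The cache is a plain LRU built on `container/list` and is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// entry is one cached response, keyed by request URL.
type entry struct {
	url, etag string
	body      []byte
}

// etagCache is a small LRU of response bodies and their ETags.
// Not goroutine-safe; a concurrent poller would need a mutex around it.
type etagCache struct {
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // URL -> element holding *entry
}

func newETagCache(max int) *etagCache {
	return &etagCache{max: max, order: list.New(), items: map[string]*list.Element{}}
}

// get fetches url, revalidating with If-None-Match when a cached copy exists.
// The bool is true when the body was served from the cache after a 304.
func (c *etagCache) get(client *http.Client, url string) ([]byte, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	el, cached := c.items[url]
	if cached {
		req.Header.Set("If-None-Match", el.Value.(*entry).etag)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()
	if cached && resp.StatusCode == http.StatusNotModified {
		c.order.MoveToFront(el)
		return el.Value.(*entry).body, true, nil
	}
	if resp.StatusCode != http.StatusOK {
		return nil, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, false, err
	}
	if etag := resp.Header.Get("ETag"); etag != "" {
		c.put(&entry{url: url, etag: etag, body: body})
	}
	return body, false, nil
}

// put stores e, evicting the least recently used entry when over capacity.
func (c *etagCache) put(e *entry) {
	if el, ok := c.items[e.url]; ok {
		el.Value = e
		c.order.MoveToFront(el)
		return
	}
	c.items[e.url] = c.order.PushFront(e)
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).url)
	}
}

// newStubRegistry is a local stand-in that serves one fixed ETag.
func newStubRegistry() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const etag = `"v1"`
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", etag)
		io.WriteString(w, `{"name":"example"}`)
	}))
}

func main() {
	srv := newStubRegistry()
	defer srv.Close()
	c := newETagCache(128)
	for i := 0; i < 2; i++ {
		body, fromCache, err := c.get(srv.Client(), srv.URL+"/example")
		fmt.Println(len(body), fromCache, err)
	}
}
```

The second call should be answered from the cache, so the large body is only transferred when the package metadata has actually changed.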

calebbrown avatar Jun 28 '23 06:06 calebbrown