datadog-ci
Socket hang up while uploading sourcemaps
Bug description
The sourcemaps upload fails with an error, even when the file size is < 500K
# List the source maps in the build directory
du -h build/static/js/*.map
56.0K build/static/js/1.70c9dc61.chunk.js.map
16.0K build/static/js/1026.d629a7dd.chunk.js.map
4.0K build/static/js/1039.2ff430f0.chunk.js.map
20.0K build/static/js/1062.5aea4764.chunk.js.map
376.0K build/static/js/1235.02d92438.chunk.js.map
4.0K build/static/js/1376.b953db64.chunk.js.map
76.0K build/static/js/1406.a7cf9fed.chunk.js.map
100.0K build/static/js/1410.6dd2b10b.chunk.js.map
4.0K build/static/js/1411.49d77a9e.chunk.js.map
804.0K build/static/js/1436.5b6a2334.chunk.js.map
108.0K build/static/js/1510.978b7d6a.chunk.js.map
12.0K build/static/js/1535.b4728b50.chunk.js.map
40.0K build/static/js/1631.de4a17e5.chunk.js.map
88.0K build/static/js/1690.05ce4a2e.chunk.js.map
56.0K build/static/js/1745.b9831f5b.chunk.js.map
8.0K build/static/js/1773.2079c7ce.chunk.js.map
32.0K build/static/js/1802.c7d66a91.chunk.js.map
4.0K build/static/js/1814.ac02fc4f.chunk.js.map
4.0K build/static/js/1817.76e878cd.chunk.js.map
284.0K build/static/js/1820.74ed522c.chunk.js.map
4.0K build/static/js/1851.24fd3196.chunk.js.map
76.0K build/static/js/1871.684c27d2.chunk.js.map
396.0K build/static/js/1908.cea29a08.chunk.js.map
108.0K build/static/js/1915.b5e8f2d3.chunk.js.map
40.0K build/static/js/193.d6ded0ee.chunk.js.map
64.0K build/static/js/1959.94f0d86b.chunk.js.map
28.0K build/static/js/2014.db213200.chunk.js.map
1.4M build/static/js/2076.332b43c7.chunk.js.map
4.0K build/static/js/212.c2f81b7f.chunk.js.map
160.0K build/static/js/2245.d3c6fb67.chunk.js.map
48.0K build/static/js/2258.6e69318b.chunk.js.map
8.0K build/static/js/2285.bb48db85.chunk.js.map
4.0K build/static/js/2403.d998609e.chunk.js.map
152.0K build/static/js/2463.baaa35db.chunk.js.map
8.0K build/static/js/2487.a2220685.chunk.js.map
40.0K build/static/js/2537.209c5a7d.chunk.js.map
12.0K build/static/js/2544.660baf40.chunk.js.map
124.0K build/static/js/2592.a4d6bff6.chunk.js.map
8.0K build/static/js/2593.1624ef93.chunk.js.map
24.0K build/static/js/2602.668b6959.chunk.js.map
4.0K build/static/js/2717.8441d59e.chunk.js.map
4.0K build/static/js/274.10993e31.chunk.js.map
4.0K build/static/js/2847.f891f160.chunk.js.map
60.0K build/static/js/2874.3b934815.chunk.js.map
32.0K build/static/js/2890.5c46f2ae.chunk.js.map
16.0K build/static/js/3135.7d0e2ae5.chunk.js.map
4.0K build/static/js/3145.04636504.chunk.js.map
4.0K build/static/js/3157.8301a3aa.chunk.js.map
16.0K build/static/js/3161.84581ad8.chunk.js.map
36.0K build/static/js/3202.e66f94ab.chunk.js.map
20.0K build/static/js/3207.2536ef96.chunk.js.map
44.0K build/static/js/3316.b3bfbb39.chunk.js.map
372.0K build/static/js/3442.96e9f1bf.chunk.js.map
12.0K build/static/js/3445.e25a1da9.chunk.js.map
4.0K build/static/js/3498.1ef6fa0a.chunk.js.map
32.0K build/static/js/3592.341a18bd.chunk.js.map
64.0K build/static/js/3594.f24f0c1c.chunk.js.map
4.0K build/static/js/3625.260409ab.chunk.js.map
124.0K build/static/js/3643.e526bf4a.chunk.js.map
52.0K build/static/js/3646.bb874758.chunk.js.map
4.0K build/static/js/3706.c381f4ed.chunk.js.map
16.0K build/static/js/3726.83a14433.chunk.js.map
4.0K build/static/js/3823.a140d01e.chunk.js.map
296.0K build/static/js/3935.fe843014.chunk.js.map
6.9M build/static/js/3943.5645ddf0.chunk.js.map
8.0K build/static/js/3974.d599deaa.chunk.js.map
208.0K build/static/js/400.6a080db6.chunk.js.map
28.0K build/static/js/4074.55b3a1d0.chunk.js.map
84.0K build/static/js/4088.421141e1.chunk.js.map
4.0K build/static/js/4183.fd0d401e.chunk.js.map
80.0K build/static/js/4190.c14909eb.chunk.js.map
12.0K build/static/js/4195.179f747c.chunk.js.map
12.0K build/static/js/4349.fe97b90b.chunk.js.map
108.0K build/static/js/4364.91479dba.chunk.js.map
8.0K build/static/js/4365.2968eeb4.chunk.js.map
20.0K build/static/js/4389.00801080.chunk.js.map
24.0K build/static/js/4569.51c4339e.chunk.js.map
8.0K build/static/js/4584.6a0b19e6.chunk.js.map
28.0K build/static/js/4610.89b0def6.chunk.js.map
36.0K build/static/js/4637.017314c8.chunk.js.map
88.0K build/static/js/464.53463562.chunk.js.map
4.0K build/static/js/4657.e82a1907.chunk.js.map
4.0K build/static/js/4733.0049af43.chunk.js.map
4.0K build/static/js/4748.ee91ff22.chunk.js.map
4.0K build/static/js/4756.05ae94ad.chunk.js.map
24.0K build/static/js/4768.b26b6ede.chunk.js.map
12.0K build/static/js/4807.8b596151.chunk.js.map
8.0K build/static/js/4828.605893b4.chunk.js.map
128.0K build/static/js/4846.13c82be7.chunk.js.map
8.0K build/static/js/4879.ce137c3b.chunk.js.map
60.0K build/static/js/4886.1deeb569.chunk.js.map
36.0K build/static/js/4890.acdbf454.chunk.js.map
60.0K build/static/js/4893.ab54c02f.chunk.js.map
124.0K build/static/js/4944.66aa7825.chunk.js.map
32.0K build/static/js/4960.2ce28a41.chunk.js.map
128.0K build/static/js/4969.eb879c81.chunk.js.map
180.0K build/static/js/4982.38528b8d.chunk.js.map
72.0K build/static/js/4987.b5b6cbff.chunk.js.map
8.0K build/static/js/5041.1e883c68.chunk.js.map
124.0K build/static/js/5049.30dc262a.chunk.js.map
20.0K build/static/js/5080.29bf85f5.chunk.js.map
88.0K build/static/js/5132.5ee55c8b.chunk.js.map
40.0K build/static/js/5168.f9435084.chunk.js.map
28.0K build/static/js/5323.3f863dfb.chunk.js.map
88.0K build/static/js/5332.83fe56e3.chunk.js.map
68.0K build/static/js/5339.446b0e09.chunk.js.map
80.0K build/static/js/5409.8d1ddee6.chunk.js.map
32.0K build/static/js/5462.2da902bf.chunk.js.map
4.0K build/static/js/5517.9418d24f.chunk.js.map
56.0K build/static/js/5554.a8d0ab12.chunk.js.map
48.0K build/static/js/5562.4ab399bf.chunk.js.map
220.0K build/static/js/5580.4d9ec847.chunk.js.map
48.0K build/static/js/5598.0b5d3e60.chunk.js.map
76.0K build/static/js/5651.fce74122.chunk.js.map
24.0K build/static/js/5695.450c4bce.chunk.js.map
160.0K build/static/js/5697.5f683745.chunk.js.map
4.0K build/static/js/573.19568373.chunk.js.map
28.0K build/static/js/5730.0514b984.chunk.js.map
4.0K build/static/js/5794.7acc9a64.chunk.js.map
40.0K build/static/js/5831.42f2cb1f.chunk.js.map
4.0K build/static/js/5849.8bc8c482.chunk.js.map
84.0K build/static/js/5887.f2ecf2e0.chunk.js.map
124.0K build/static/js/5935.a9cf4493.chunk.js.map
4.0K build/static/js/601.27a3723e.chunk.js.map
120.0K build/static/js/6024.f1372195.chunk.js.map
4.0K build/static/js/6046.47b5bcbb.chunk.js.map
224.0K build/static/js/6066.4d38f99c.chunk.js.map
4.0K build/static/js/6110.0aedd425.chunk.js.map
76.0K build/static/js/6145.00df0082.chunk.js.map
4.0K build/static/js/6147.b53c4f70.chunk.js.map
4.0K build/static/js/6156.c0539c86.chunk.js.map
52.0K build/static/js/616.027fc503.chunk.js.map
8.0K build/static/js/6182.8a8ad87c.chunk.js.map
36.0K build/static/js/629.a655c359.chunk.js.map
20.0K build/static/js/6299.813e4e3f.chunk.js.map
32.0K build/static/js/6309.ef4269f7.chunk.js.map
16.0K build/static/js/6324.17753013.chunk.js.map
84.0K build/static/js/6326.4855247a.chunk.js.map
4.0K build/static/js/6397.5de0e694.chunk.js.map
4.0K build/static/js/6413.d694ed6f.chunk.js.map
68.0K build/static/js/6443.563fcfc0.chunk.js.map
8.0K build/static/js/6557.af3d71b6.chunk.js.map
4.0K build/static/js/6581.7db204dd.chunk.js.map
4.0K build/static/js/6666.cda6ddca.chunk.js.map
268.0K build/static/js/6671.091abe0d.chunk.js.map
40.0K build/static/js/6674.358d40f3.chunk.js.map
84.0K build/static/js/6686.f3dab0b8.chunk.js.map
4.0K build/static/js/6742.5cf877a7.chunk.js.map
20.0K build/static/js/6769.8655980e.chunk.js.map
36.0K build/static/js/6810.add16b63.chunk.js.map
64.0K build/static/js/6816.c0328bb7.chunk.js.map
652.0K build/static/js/6831.09fd4e92.chunk.js.map
4.0K build/static/js/6847.2d563d94.chunk.js.map
8.0K build/static/js/6852.57099c68.chunk.js.map
152.0K build/static/js/6872.742e0a1d.chunk.js.map
60.0K build/static/js/6874.ff4a3923.chunk.js.map
8.0K build/static/js/6892.7e6bc03b.chunk.js.map
8.0K build/static/js/6972.3db02b1c.chunk.js.map
32.0K build/static/js/7042.b762dff5.chunk.js.map
20.0K build/static/js/7056.1aa38b30.chunk.js.map
28.0K build/static/js/7095.ac1f9df2.chunk.js.map
1.5M build/static/js/7221.607edef2.chunk.js.map
24.0K build/static/js/7294.33701be9.chunk.js.map
28.0K build/static/js/7389.f7821441.chunk.js.map
4.0K build/static/js/7399.0f69c786.chunk.js.map
28.0K build/static/js/7484.9e6b6274.chunk.js.map
68.0K build/static/js/7573.c1a2c6db.chunk.js.map
76.0K build/static/js/7576.8a59cdd2.chunk.js.map
296.0K build/static/js/7587.9f9a615f.chunk.js.map
88.0K build/static/js/7605.8d7bf747.chunk.js.map
40.0K build/static/js/7650.59d9dd58.chunk.js.map
32.0K build/static/js/766.f52089c7.chunk.js.map
28.0K build/static/js/7686.32190382.chunk.js.map
4.0K build/static/js/7695.90062550.chunk.js.map
8.0K build/static/js/77.97e35e8c.chunk.js.map
68.0K build/static/js/7703.23b296c1.chunk.js.map
44.0K build/static/js/7739.be985c1c.chunk.js.map
4.0K build/static/js/7760.da1f5dc3.chunk.js.map
36.0K build/static/js/778.3aa82d4a.chunk.js.map
24.0K build/static/js/7805.627d2f3c.chunk.js.map
144.0K build/static/js/7814.ab2bb93f.chunk.js.map
124.0K build/static/js/7825.23952a24.chunk.js.map
284.0K build/static/js/7854.c6382dc3.chunk.js.map
216.0K build/static/js/7972.535d04ee.chunk.js.map
128.0K build/static/js/8001.88a24752.chunk.js.map
52.0K build/static/js/8002.27f9b490.chunk.js.map
684.0K build/static/js/8020.8e09cdc9.chunk.js.map
4.0K build/static/js/8036.84a9c01b.chunk.js.map
4.0K build/static/js/8096.a4ef43a6.chunk.js.map
12.0K build/static/js/8110.e87da554.chunk.js.map
16.0K build/static/js/8122.ab499de3.chunk.js.map
4.0K build/static/js/8123.97497c17.chunk.js.map
12.0K build/static/js/8187.069fcddb.chunk.js.map
196.0K build/static/js/8188.a43323d3.chunk.js.map
32.0K build/static/js/8263.f917a27c.chunk.js.map
36.0K build/static/js/8278.547e26a8.chunk.js.map
4.0K build/static/js/837.bf4c6bce.chunk.js.map
48.0K build/static/js/8376.998ebc27.chunk.js.map
4.0K build/static/js/8411.eaf6e065.chunk.js.map
40.0K build/static/js/8436.33cd090f.chunk.js.map
32.0K build/static/js/8464.ddd63d19.chunk.js.map
260.0K build/static/js/8467.561b23bc.chunk.js.map
32.0K build/static/js/8480.4f2d7d1b.chunk.js.map
40.0K build/static/js/8524.8fe07903.chunk.js.map
72.0K build/static/js/8591.bd9af022.chunk.js.map
24.0K build/static/js/8595.23002a78.chunk.js.map
44.0K build/static/js/8607.ea49a0db.chunk.js.map
4.0K build/static/js/8729.f40d2329.chunk.js.map
4.0K build/static/js/8732.fd772b8e.chunk.js.map
4.0K build/static/js/8758.b5d6cd9a.chunk.js.map
180.0K build/static/js/8776.f742f32c.chunk.js.map
4.0K build/static/js/878.6ae8f5ae.chunk.js.map
20.0K build/static/js/8805.bcf4ab5f.chunk.js.map
16.0K build/static/js/8851.ac33ac5c.chunk.js.map
36.0K build/static/js/8854.2c621baa.chunk.js.map
44.0K build/static/js/8866.e00200c9.chunk.js.map
88.0K build/static/js/8937.1cfe8ae4.chunk.js.map
52.0K build/static/js/8976.8c601738.chunk.js.map
200.0K build/static/js/9016.35feab01.chunk.js.map
12.0K build/static/js/9020.e0d754b8.chunk.js.map
132.0K build/static/js/9057.fd0737c8.chunk.js.map
68.0K build/static/js/9079.7310dd7c.chunk.js.map
4.0K build/static/js/9155.7ad744fc.chunk.js.map
12.0K build/static/js/9179.c6b394f4.chunk.js.map
332.0K build/static/js/9214.e8d008c6.chunk.js.map
32.0K build/static/js/923.83ca250e.chunk.js.map
4.0K build/static/js/93.f2b31674.chunk.js.map
52.0K build/static/js/9317.23dcbdff.chunk.js.map
16.0K build/static/js/9365.c6a3962e.chunk.js.map
108.0K build/static/js/9466.c55e29d8.chunk.js.map
88.0K build/static/js/9482.c9df5356.chunk.js.map
28.0K build/static/js/9484.13a55282.chunk.js.map
20.0K build/static/js/9517.f6fa6612.chunk.js.map
148.0K build/static/js/9533.cd0e09a7.chunk.js.map
60.0K build/static/js/9585.48ff0fb2.chunk.js.map
24.0K build/static/js/9623.861fabe6.chunk.js.map
8.0K build/static/js/9671.cf382f45.chunk.js.map
4.9M build/static/js/9736.436e5c30.chunk.js.map
112.0K build/static/js/9745.c9706529.chunk.js.map
60.0K build/static/js/9801.32db900d.chunk.js.map
4.0K build/static/js/9832.a5fde5ec.chunk.js.map
72.0K build/static/js/9857.622d1b30.chunk.js.map
32.0K build/static/js/9880.3a615e39.chunk.js.map
104.0K build/static/js/9971.2e967695.chunk.js.map
256.0K build/static/js/9994.a4b0f00c.chunk.js.map
24.0K build/static/js/9999.b2d248fe.chunk.js.map
92.0K build/static/js/index.69e25fda.js.map
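The listing above includes a few maps well over 500K (for example the 6.9M and 4.9M entries). As a rough sketch for surfacing those, the helper below lists `.map` files above a size threshold. The function name `list_large_maps` and its arguments are hypothetical and not part of datadog-ci.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list source maps larger than a threshold in KiB.
# Usage: list_large_maps DIR [MIN_KIB]   (MIN_KIB defaults to 500)
list_large_maps() {
  local dir="$1" min_kib="${2:-500}"
  # find's "-size +Nk" matches files whose size, rounded up to whole KiB, exceeds N
  find "$dir" -type f -name '*.map' -size +"${min_kib}k"
}
```

For example, `list_large_maps build/static/js 500` would print only the maps above roughly 500K.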
Describe what you expected
All sourcemaps uploaded without any errors
Steps to reproduce the issue
./node_modules/.bin/datadog-ci sourcemaps upload ./build --service my-service --release-version $version --minified-path-prefix / --max-concurrency 15
Additional context
> node -v
v20.10.0
"@datadog/datadog-ci": "2.24.1"
Command
sourcemaps
Hi @mychaelgo. How often are you experiencing this issue? It looks like this happens because the upload takes too much time. It might be a temporary connectivity issue on your side, for example on a network with very low bandwidth.
@BenoitZugmeyer we're experiencing this issue as well. We can reproduce it consistently when trying to upload thousands of sourcemaps: it's fast for the first minute, and then everything starts failing and retrying. I suspect the Datadog backend is either rate limiting our CI or failing due to the volume of requests being made to upload all sourcemaps for our application. In the logs we see a combination of "socket hang up" and "Request failed with status code 408".
Thank you for your feedback. I investigated a bit and found something that might help improve the situation. Stay tuned
@mattlewis92 is this an issue you started experiencing recently, or have you always experienced it? Does your CI run on Azure/GCP/AWS?
It has only been happening to us since January 5th, but that build also increased the number of sourcemaps we upload, so it's possible the issue has always been there and we've just hit an upper limit where it starts to trigger. That would make sense, as the problem only starts to occur towards the last batch of sourcemaps uploaded.
Our CI runs on GitHub Actions, so it would be running from an Azure data center under the hood, as we are not using self-hosted runners.
Thank you for the information.
I couldn't reproduce the issue, even after uploading thousands of sourcemaps at once.
I released #1158 as part of v2.28.0, which might improve the situation. Could you give it a try?
Sure thing! Our next prod release isn't for another week, so will check in then and let you know if the issue is gone!
I took a look at our logs today after upgrading @datadog/datadog-ci to 2.28.0, but am still seeing the socket hang up message. It definitely only seems to happen right at the end of the upload, after we uploaded over 3000 sourcemaps.
Ok, thank you for trying it out.
Maybe with lower concurrency, requests will finish more quickly, preventing timeouts? Could you try running the upload command with the --max-concurrency 4 flag? (default is 20)
datadog-ci sourcemaps upload --max-concurrency 4 ...
Sure thing, will try that and let you know how it goes!
max-concurrency=4 had no effect. The upload step took about 9 minutes and there were still a bunch of socket hang up errors:
Then I tried deleting a few hundred sourcemaps to bring the total count under 3000, and everything got fast, with all of them uploaded in under a minute:
The problem only started after we crossed the 3000 count for sourcemaps, so I'm reasonably confident there is a hard limit somewhere on the Datadog side. Either connections are not being closed by the client after upload, or the backend API that accepts the uploads has a firewall rule being triggered that starts blocking the requests because it thinks GitHub Actions is trying to DDoS the endpoint.
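To check a build against the roughly 3000-file threshold described above, a minimal sketch for counting the source maps in a build directory could look like this. The function name `count_sourcemaps` is hypothetical, and the threshold itself is only the observation reported in this thread, not a documented limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the .map files under a build directory.
# Usage: count_sourcemaps [DIR]   (DIR defaults to ./build)
count_sourcemaps() {
  local dir="${1:-./build}"
  # wc -l pads its output on some platforms, so strip whitespace
  find "$dir" -type f -name '*.map' | wc -l | tr -d '[:space:]'
}
```

Running `count_sourcemaps ./build` before the upload step would show whether a given release is above or below that count.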
Let's keep this issue open until we have confirmation that it is indeed fixed.
Each request has a 1 minute timeout on our side, which is why we are seeing some HTTP requests failing with status 408. When the request was retried, an implementation bug caused the request to never end, which produced the socket hang up errors.
We released a fix for retrying uploads in v2.30.0. We won't increase the 1 minute timeout for now as it has security implications. We might revisit later.
As your source map files are relatively small, retrying the upload after a timeout should work now that it is fixed. If it's still unreliable and you are still seeing failing HTTP requests, using a lower concurrency might help upload individual source maps faster. Could you try it again and let me know if it improves the situation?
Sorry for the inconvenience and the back and forth!
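For CI jobs that still see an occasional failed run, one possible workaround is to retry the whole upload command from the outside. This is only a sketch: the `retry` function is hypothetical, it re-runs the entire command and is unrelated to the per-request retry inside datadog-ci, and it assumes the command exits non-zero when uploads fail, which this thread does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS attempts have been made.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

It could be used as `retry 3 10 ./node_modules/.bin/datadog-ci sourcemaps upload ./build --service my-service --release-version "$version" --minified-path-prefix / --max-concurrency 4`, reusing the flags from the reproduction steps.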
Thanks for digging into this!
After upgrading to 2.30.0 the situation is a bit better: all the socket hang up messages are gone and the upload time has dropped from ~9m to ~5m, although a handful of sourcemaps still fail to upload:
Compared to the run where I deleted a few hundred sourcemaps, it's still quite a bit slower:
Will try lowering max-concurrency and see if that makes any difference.
Edit: setting max-concurrency=4 resulted in no errors, but uploads took close to 9m this time. max-concurrency=15 resulted in some upload errors, with the upload taking about 5 minutes.