
lsquic client occasionally stops sending ACKs on larger downloads

Open koujaz opened this issue 1 year ago • 3 comments

Downloading larger files (100 MB) with the lsquic client occasionally fails because the client stops sending ACKs for too long.

Here is what we observe:

  • large portion of the download goes smoothly
  • sometimes there are a number of incoming packets in a row without any client ACKs (no issue so far, but let's call this a "smaller hole")
  • but eventually this "hole" without ACKs becomes too large (>2000 packets, >6 seconds), so the server closes the connection with "Network black hole detected"
  • the client responds to the CC (CONNECTION_CLOSE) immediately with an ACK containing the full range of packets
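For anyone who wants to look for such holes in their own capture, here is a minimal sketch in Python. It assumes the capture has already been reduced to (timestamp, direction) records, with "in" for server-to-client packets and "out" for client-to-server packets. That record format, the function name find_ack_holes and the min_incoming threshold are all hypothetical and not part of lsquic or tcpdump. QUIC packets are encrypted, so the sketch treats any outgoing packet as a possible ACK carrier rather than decoding frames.

```python
from typing import List, Tuple

# Hypothetical record format: (timestamp in seconds, "in" or "out").
Packet = Tuple[float, str]
# A hole is (first incoming timestamp, last incoming timestamp, packet count).
Hole = Tuple[float, float, int]


def find_ack_holes(packets: List[Packet], min_incoming: int = 1000) -> List[Hole]:
    """Find runs of incoming packets with no outgoing packet in between.

    Only runs of at least `min_incoming` packets are reported.
    """
    holes: List[Hole] = []
    run_start = 0.0
    run_end = 0.0
    count = 0
    for ts, direction in packets:
        if direction == "in":
            if count == 0:
                run_start = ts  # first incoming packet of a new run
            run_end = ts
            count += 1
        else:
            # An outgoing packet ends the current run of incoming packets.
            if count >= min_incoming:
                holes.append((run_start, run_end, count))
            count = 0
    # The trace may end in the middle of a run (the "lethal hole" case).
    if count >= min_incoming:
        holes.append((run_start, run_end, count))
    return holes
```

Each reported tuple gives the duration (end minus start) and the packet count of one hole, which are the two numbers quoted for the holes in this report.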

Notes:

  • using client built from latest master, commit b0bd690
  • server is Akamai quic server
  • there is GSO on the server side (should be unrelated; not sure if lsquic uses recvmmsg but there are individual incoming packets seen on the network device so no GRO on receiver side)
  • we have full tcpdump with the failed download
  • client ran w/ -o delayed_acks=0 on the command line

See screenshots for pieces of tcpdump on client side showing:

  • start and end of a random "smaller hole" in the middle of the download (2.6 seconds, 2442 packets): smaller_hole_begin, smaller_hole_end
  • start and end of the final "lethal hole", which ended with the CC from the server (6.6 seconds, 2445 packets): last_hole_begin, last_hole_end

Full client stderr during failed download: client.stderr.log

koujaz avatar Dec 16 '24 09:12 koujaz

You can try disabling the delayed ACK feature with settings->es_delayed_acks = 0. Also, 2.4.1 is too old; maybe you should try the latest 4.0.x?

litespeedtech avatar Dec 16 '24 15:12 litespeedtech

Will try to disable delayed ACK, thank you.

Regarding the version: the number 2.4.1 is misleading; sorry. I was tricked by git describe giving me "v2.4.1-338-gb0bd690". We regularly build from master, and the commit above is less than a month old.
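For readers unfamiliar with the format: git describe prints the most recent reachable tag, the number of commits made on top of it, and the abbreviated commit hash prefixed with "g". A small Python sketch that splits such a string is shown here; the helper name parse_git_describe is hypothetical and it does not handle suffixes such as "-dirty".

```python
import re
from typing import Optional, Tuple


def parse_git_describe(text: str) -> Optional[Tuple[str, int, str]]:
    """Split '<tag>-<commits>-g<hash>' into (tag, commits since tag, hash).

    Returns None when the string has no such suffix, for example when
    the checkout sits exactly on a tag.
    """
    match = re.fullmatch(
        r"(?P<tag>.+)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)", text.strip()
    )
    if match is None:
        return None
    return match.group("tag"), int(match.group("commits")), match.group("sha")


# The string quoted in this thread: tag v2.4.1, 338 commits later, commit b0bd690.
print(parse_git_describe("v2.4.1-338-gb0bd690"))
```

So the build in question is 338 commits ahead of the v2.4.1 tag, not the 2.4.1 release itself.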

koujaz avatar Dec 16 '24 16:12 koujaz

I see that the client in the failed case above was already called w/ -o delayed_acks=0 on the command line. So it is already disabled.

koujaz avatar Dec 17 '24 10:12 koujaz

Hi @koujaz -- thank you for the bug report.

Can you recommend an Akamai endpoint to fetch some large files from?

dtikhonov avatar Oct 08 '25 23:10 dtikhonov

@dtikhonov https://dlm.akamai.com/test/1GBfile.bin

vranem1 avatar Oct 09 '25 06:10 vranem1

I cannot reproduce this -- with or without the delayed_acks setting. The largest interval between sent ACKs I observe is 200 - 300 ms.

dtikhonov avatar Nov 01 '25 12:11 dtikhonov

I can confirm that we haven't seen the issue in the nightly testing during the last few months. Feel free to close this. We will reopen it if the issue happens again.

vranem1 avatar Nov 12 '25 14:11 vranem1