Improve scalability and handling of TCP socket closures
Hello, first of all, thank you for providing the qpep code. We encountered several problems, reported in the log, while testing qpep performance. Please take a look. Thank you!

We have three questions:

1. Why is there only "opened a new stream: 1800" and no corresponding close message? This causes the stream ID to keep growing.
2. After running qpep for more than ten hours, the log finally reports an error: "temporary error when accepting connection: accept tcp [::]:8080: accept4: too many open files". What is the problem?
3. When testing qpep performance, we see logs like: "Error on copy readfrom tcp 27.19.249.251:443->192.168.21.91:56573: write tcp 27.19.249.251:443->192.168.21.91:56573: use of closed network connection" followed by "Done sending data on 11984". What is the reason for this?
Thanks for the issue. We'll look into this. The quick answer is that QPEP as written is a proof of concept and hasn't been engineered for long run times or scalability, so I think that's where a lot of this friction is coming from, but these are obvious places to start on improving the project.
In terms of issue 1, we just don't recycle stream IDs right now. It shouldn't be an issue until you accumulate billions of streams, but a different UUID-based approach could be a good enhancement (a rough sketch of that idea follows below). Streams are closed, but there's no log output for it right now (or rather, the log output is the cryptic message you're getting in issue 3).
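As a very rough sketch (hypothetical, not how QPEP is implemented today), stream bookkeeping could key off a random UUID instead of an ever-growing counter, and log the close explicitly. The `streamRegistry` type and its methods below are invented for illustration and assume the `github.com/google/uuid` package:

```go
package main

import (
	"log"
	"sync"

	"github.com/google/uuid"
)

// streamRegistry is a hypothetical replacement for a monotonically
// increasing stream counter: each stream gets a random UUID and is
// removed from the map when it closes, so the IDs never accumulate.
type streamRegistry struct {
	mu      sync.Mutex
	streams map[string]struct{}
}

func newStreamRegistry() *streamRegistry {
	return &streamRegistry{streams: make(map[string]struct{})}
}

// open registers a new stream and returns its identifier.
func (r *streamRegistry) open() string {
	id := uuid.New().String()
	r.mu.Lock()
	r.streams[id] = struct{}{}
	r.mu.Unlock()
	log.Printf("opened a new stream: %s", id)
	return id
}

// close removes the stream and logs the closure explicitly, instead of
// relying on the cryptic copy-error message from issue 3.
func (r *streamRegistry) close(id string) {
	r.mu.Lock()
	delete(r.streams, id)
	r.mu.Unlock()
	log.Printf("closed stream: %s", id)
}

func main() {
	r := newStreamRegistry()
	id := r.open()
	r.close(id)
}
```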
For issue 2, the file descriptor limit on many Linux distros is really low. Since QPEP manages lots of concurrent connections, you may have a number of TCP streams which are just waiting for data. There may also be an issue with us leaving orphaned connections in TIME_WAIT for too long, since QPEP keeps streams alive for much longer than standard QUIC servers. The easiest workaround will be to raise your system limits for connections as in: https://stackoverflow.com/questions/410616/increasing-the-maximum-number-of-tcp-ip-connections-in-linux (see also the sketch below).
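Besides the sysctl/ulimit changes in that link, a long-running process can also raise its own soft `RLIMIT_NOFILE` up to the hard limit at startup. This is only a sketch of that general technique on Linux, not something QPEP currently does:

```go
package main

import (
	"fmt"
	"log"
	"syscall"
)

// raiseNoFileLimit bumps the soft open-file limit up to the hard limit,
// which pushes back the point where a process holding many concurrent
// sockets starts seeing "too many open files".
func raiseNoFileLimit() error {
	var rlim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rlim); err != nil {
		return fmt.Errorf("getrlimit: %w", err)
	}
	rlim.Cur = rlim.Max // raise the soft limit to the hard limit
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rlim); err != nil {
		return fmt.Errorf("setrlimit: %w", err)
	}
	return nil
}

func main() {
	if err := raiseNoFileLimit(); err != nil {
		log.Fatalf("could not raise open-file limit: %v", err)
	}
	var rlim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rlim); err == nil {
		log.Printf("open-file limit is now %d (hard limit %d)", rlim.Cur, rlim.Max)
	}
}
```

Note that the hard limit itself (and system-wide settings such as `fs.file-max`) can still only be raised with root privileges or the changes described in the linked answer.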
Issue 3 is just a log message left over from debugging during development. This is where QPEP decides to close sockets: we catch the TCP error and then use it to tell QPEP to close the related connection. You can safely ignore these messages, but we should probably add a way to suppress them or downgrade them to informational; a sketch of what that could look like follows.
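As a hedged example of what suppressing these as informational could look like (again, not QPEP's actual code): since Go 1.16 the "use of closed network connection" case can be recognised with `errors.Is(err, net.ErrClosed)` and logged at a lower severity than genuine copy failures. The `copyAndLog` helper below is made up for illustration:

```go
package main

import (
	"errors"
	"io"
	"log"
	"net"
)

// copyAndLog is a hypothetical helper: it copies bytes between two
// connections and treats "use of closed network connection" as the
// normal teardown signal rather than a real error.
func copyAndLog(dst, src net.Conn) {
	_, err := io.Copy(dst, src)
	switch {
	case err == nil:
		// Clean EOF from the source side.
	case errors.Is(err, net.ErrClosed):
		// The other direction already closed the socket; this is the
		// message from issue 3 and is expected during teardown.
		log.Printf("info: stream %v->%v finished (peer closed)",
			src.RemoteAddr(), dst.RemoteAddr())
	default:
		log.Printf("error on copy %v->%v: %v",
			src.RemoteAddr(), dst.RemoteAddr(), err)
	}
	// Closing both ends unblocks the copy running in the opposite
	// direction, which then hits the net.ErrClosed branch above.
	_ = dst.Close()
	_ = src.Close()
}

func main() {
	// Tiny demonstration on loopback: closing src before the copy
	// starts reproduces the "use of closed network connection" error.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go io.Copy(io.Discard, c) // the "server" just drains bytes
		}
	}()
	src, _ := net.Dial("tcp", ln.Addr().String())
	dst, _ := net.Dial("tcp", ln.Addr().String())
	src.Close() // simulate the peer tearing the stream down first
	copyAndLog(dst, src)
}
```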
Thank you for your reply!
That resolves some of our doubts. We'll configure the Linux environment according to your tips, test again, and follow up if we have any other questions.
Following the link you gave, we configured the relevant Linux parameters and ran the qpep client and server for about 20 hours. The second error mentioned in the previous question (too many open files) appeared again. I attached a picture of the server side to the mail; at that point we couldn't access the Internet, but iperf over plain TCP was still normal. Please help analyze where the problem may be.