scale the socket receive buffer on the fly #39092
As an experiment, I tried scaling the receive buffer on the fly depending on how full it was on the last iteration. It has the advantage of using less memory for slow streams and being more efficient for faster streams.
It does not allow for manual control (but this can probably be added), and it effectively short-circuits the back-pressure mechanism, as the first read always gets through, which is not very well suited for streams originating from the network.
Sorry, I meant to refer to #39092
Can't the `onread` feature of `net.Socket` already achieve the same goal?
Not if you are piping (unless this is implemented in the piping too).
I don't follow. If you had `socket.push(data)` inside the `onread` callback and had the `buffer` as a function returning `Buffer` instances of varying sizes to provide the scaling, would that not allow the socket to be piped like normal?
The way I understand this part of the code is that this buffer is static and it is reused for every write? Which is a very good idea performance-wise, but I don't think you can make it work with the current piping implementation?
> The way I understand this part of the code is that this buffer is static and it is reused for every write?
Not necessarily, the `buffer` property can also be a function that returns any `Buffer` that should be used. So it could return a new `Buffer` every time.
Yes, I agree; I had missed the part in `stream_base_commons.js` which reallocates the buffer on every read. Would this leave it to the user to scale the buffer? Or do you propose that the same mechanism be implemented in the JS part of the library?
> Would this leave it to the user to scale the buffer? Or do you propose that the same mechanism be implemented in the JS part of the library?
I would lean more towards the former, unless there is strong evidence, with data to back it up, that this scaling behavior should exist in Node core by default.
This is a subject of discussion, of course.
My rationale is that, currently, a 20 MB/s input stream (my example) requires about 320 event loop iterations per second to process. An application might have only 10% overall CPU load, but if that load comes in chunks of 5 ms of processing time each, it won't be able to keep up. And the end result won't be a 10 MB/s input stream; it will be a constantly stalling input stream, one that sets its TCP window to zero 10 times per second. In my case the 20 MB/s stream was going at about 10 KB/s.
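As a sanity check of those numbers (64 KB being the current fixed receive buffer size, so one read per buffer fill):

```javascript
// Back-of-the-envelope math from the example above.
const streamRate = 20 * 1024 * 1024; // 20 MB/s incoming
const readSize = 64 * 1024;          // current fixed 64 KB receive buffer
const readsPerSecond = streamRate / readSize;
console.log(readsPerSecond); // 320 event loop iterations per second just for reads
```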
It is an absolutely horrible behavior.
But in fact, you don't need to change anything in Node to avoid it: setting the `highWaterMark` of the socket solves the problem. It is just that it is not a very well documented parameter (in fact, if you are not aware of Node's internals, there is no way you can find it yourself).
- Making an explicit note in the documentation is one solution
- Exposing this parameter in `http.request` is another
- Asking the user to manually scale the buffer works too
- But this solution is automatic, and it also has better performance for all high-bandwidth incoming streams, even if the receiving application is not doing anything but waiting for the download to complete
- And there is the simple fact that the backpressure mechanics work very well for pipes where Node controls both ends, but are simply ill-suited for streams coming from the network where Node has no control over the sender
The `net` benchmarks look fine to me. I wonder if this will have a noticeable impact on memory usage one way or the other in practice?
It uses less memory for slow connections. And when it comes to fast connections, don't forget that the kernel also allocates up to 256 KB per successful TCP connection (i.e. after the SYN exchange), so normally, if it worked before, it should continue to work.
I will be on solar power for some time, so I can't build Node from here to test it.
I made a draft PR because there are a few points that need discussing:
- Is scaling back really needed?
- I had horrible yoyo effects when I tried scaling back at 1/2
- Should we break bad applications that are used to receiving 64 KB per read?
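The yoyo concern can be illustrated with a toy sizing policy (all names and thresholds here are hypothetical, not the PR's actual code): growing immediately on a full read, but shrinking only after several consecutive mostly-empty reads rather than halving right away.

```javascript
const MIN = 4 * 1024;   // hypothetical lower bound
const MAX = 64 * 1024;  // the current fixed buffer size

// Toy adaptive policy: grow eagerly, shrink reluctantly. Halving as soon
// as one read comes in small is what produces grow/shrink oscillation.
function nextSize(current, bytesRead, state) {
  if (bytesRead === current) {
    state.idleReads = 0;
    return Math.min(current * 2, MAX); // buffer was full: double
  }
  if (bytesRead < current / 4) {
    // shrink only after 8 consecutive mostly-empty reads
    if (++state.idleReads >= 8) {
      state.idleReads = 0;
      return Math.max(current / 2, MIN);
    }
  } else {
    state.idleReads = 0;
  }
  return current;
}
```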
Seeing as how we're dealing with streams here, nobody should be relying on chunk sizes.
What's the status here? If we still want to land this, it should be rebased on top of main.