erlang-stdinout-pool
erlang-stdinout-pool copied to clipboard
Head's up about large binary support implementation
Hi, just wanted to let you know that I've been using this in production for quite a while and made quite a few improvements:
- support for binary in and output
- avoid deadlocking on large in and outputs (some processes write to stdout before fully reading stdin, on large streams this can lead to deadlocks where parent and child are waiting for the other to read from their buffer).
- ignoring stderr when stdout data is present (there's data corruption going on in the current implementation when child is writing to both)
I've adressed these issue for my use case and the code is available here: https://github.com/hrobeers/erlang-stdinout-pool/tree/binary-in
Due to some potential incompatibilities with current other users I did not create any pull requests, but I wanted to make sure you're aware of this and that I can create some pull requests if some of these changes are of interest to you.
Looks like nice changes!
Probably worth adding all your improvements and fixes to this original repo for more people to find. We can also add notes to the README pointing to your more actively maintained repo too in case we get out of sync again (most my erlang work stopped almost 10 years ago, so I just maintain some current systems until they finally collapse for the last time — and this 10+ year old stdinout-pool code is still running on my services too!).
My original implementation certainly isn't optimal with the way it does lazy byte-by-byte reading and writing, but it works 🫠 (and my use cases are mostly for long-lived execute-once, cache-forever assets so the inefficiency never really had any problems).
I see your usleep(100) added to the stdin read loop to pace the read/write interface — is that your "deadlocking on large in and outputs" helper? Seems odd, but there could be an edge case somewhere.
Only style recommendation: run all the *.{c,h} files through this just to tidy them up (any recent clang-format should support these options):
clang-format -style="{BasedOnStyle: llvm, IndentWidth: 4, AllowShortFunctionsOnASingleLine: None, KeepEmptyLinesAtTheStartOfBlocks: false, InsertBraces: true}" c_src/*.c
The fix for the deadlocking is in the double fork in stdin_forcer. This will have one process write to stdin and another read from stdout and stderr. Large outputs on stderr interwoven with stdout might still deadlock the reader, but this is not how most processes behave so I ignored that.
The usleep I forgot about that (it's been a while since I added that), when a write to stdin returns incomplete the usleep gives the child some time to clear it's input buffer before trying again. I'm not sure this sleep is really required as the next write will probably block until there is some room. (Not sure about that).
The reason I chose your implementation in my project is that it's simple and just using plain C for the pipe stuff. And in case of issues I can easily modify it. It has been working really great for me so far, thanks for that!
Ok I'll make a pull request when I get around it! Might take some time, but thanks for the reply!