Keep request socket stream always in sync (avoids SIGABORT)
When the request socket stream is under high load a lot of write() calls are failing with EAGAIN. To avoid blocking of a real-time thread there is no chance to try to write again. Therefore the write of the message is aborted. With the current implementation write() will be called for each attribute of a request message. Therefore there is a high probability that a message will be written partly. But the reader expects an continuous stream and has no change to sync on an broken stream.
This PR tries to write one request message with one write() call. Therefore if a write() returns with EAGAIN the complete message will be ignored.
But in case of the write() returns with a partly write operation (e.g. only 8 of 12 bytes were written) there is still a chance to write partly messages and bring the reader out of sync. To overcome this issue I would like to retry to write as long as I am able to write some bytes and I did not get an EAGAIN (Therefore I would expect that the write() should not block a real time thread as long as we are in non-blocking mode).
But also with this implementation there is still a chance of partly written messages. Therefore the reader requires a possibility to detect partly written messages and find the start of the next message. I have added a start and stop integer to the request messages which contains "ReqS" and "ReqE". If the reader does not detect the stop integer at the end of the message. There is a corrupted message. Therefore the reader will ignore this message and will search for the next start integer.
I am ready with this PR. Please feel free to comment and give some feedback.
Only some cleanup of the commits.
@falkTX
Please dont ping developers like that. I will review this myself when I have the time. It is not a simple thing.
Sorry. I do not want to stress you. The guys I have pinged are developers of the same company I am working. We would like to use this PR also for our internal review (to reduce the double efforts of an additional review on our side). Would it be fine for you if we do our internal reviews also in this repository on the same PR or should I create a separate review for my colleges?
having a discussion here is all fine, I got confused by the pinging of people. next time, there is no need for such a comment, if you have internal systems to communicate with them, use them. but communication regarding the actual code, all fine (and good actually) to have it here.
thanks for understanding.
seems like the build failed because of travis issues, otherwise I have tested these changes with following use cases on aarch64 linux:
jackd -R -d alsa -C "hw:3" -P "hw:3" -n 3 -p 96 jack_lsp jack_simple_client connecting capture to playback ports + audio playback from line in to output device
this branch is rebased to latest develop and merge conflicts were resolved by me
There are a lot of changes here, but end result seems well worth it. Please rebase against latest develop branch again, CI has been fixed there since this PR was originally made and then updated.
I really dislike the use of POST_PACKED_STRUCTURE_FORCED_AARCH64.
Is this needed for aarch64 but not others like arm? Perhaps then this is something useful for 64bit systems only?
I really dislike the use of POST_PACKED_STRUCTURE_FORCED_AARCH64.
I have introduced I really dislike the use of POST_PACKED_STRUCTURE_ALWAYS now. This is used for all classes of JackRequest.h. Therefore those classes will always be packed independent of the architecture.