Stream concurrency for `strfry router`
I've been reading through the source of src/apps/mesh/cmd_router.cpp, and it appears to run entirely in a single thread with an event loop. I believe this could become an issue when running hundreds of streams, and I would like to add some concurrency to the router.
Other sources, such as src/apps/relay/cmd_relay.cpp for strfry relay, use multiple ThreadPool instances to support concurrency.
Is this a good idea?
Good observation! Yes, you are correct that the router's websocket communication is done in a single thread. It isn't quite as sophisticated as the normal relay. However, note that it does offload some especially CPU-intensive tasks, like signature verification, to other threads such as the WriterPipeline's validator thread.
TBH I would not attempt to modify this architecture unless it becomes obviously a bottleneck. Note that another workaround is to run multiple router instances. You can have 2+ router processes running at the same time, each with their own set of streaming URLs, but both pointed to the same underlying DB. This is a manual way that you can do horizontal scaling if you have a very large number of streaming sources.
I'm not entirely convinced it's a bottleneck yet. However, it appears that the need for the increased timeout may be caused by it, since all the streams need to be added from the config before any can connect. Perhaps a break in the loop that adds streams from the config, so that it's non-blocking, could give other async events an opportunity to be processed?
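To make the idea concrete, here is a rough sketch of adding streams in batches and yielding back to the loop between batches. Everything in it is hypothetical: `ToyLoop` is a stand-in FIFO of callbacks, not strfry's actual event loop, and `addStreamsInBatches` and the simulated "connected" callback are invented purely for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a single-threaded event loop: a FIFO of callbacks.
// This is NOT strfry's loop, only an illustration of the batching idea.
struct ToyLoop {
    std::deque<std::function<void()>> pending;

    void post(std::function<void()> fn) { pending.push_back(std::move(fn)); }

    // Runs callbacks in FIFO order until none are left.
    void run() {
        while (!pending.empty()) {
            auto fn = std::move(pending.front());
            pending.pop_front();
            fn();
        }
    }
};

// Adds up to `batchSize` streams, then re-posts itself for the remainder.
// Callbacks queued in the meantime (here, simulated "connected" events)
// therefore run before the next batch of streams is added.
void addStreamsInBatches(ToyLoop &loop,
                         const std::vector<std::string> &urls,
                         std::size_t start,
                         std::size_t batchSize,
                         std::vector<std::string> &log) {
    if (batchSize == 0) batchSize = 1;  // avoid re-posting forever
    std::size_t end = std::min(urls.size(), start + batchSize);

    for (std::size_t i = start; i < end; i++) {
        log.push_back("add " + urls[i]);
        // Simulated async event that adding a stream would eventually trigger.
        std::string url = urls[i];
        loop.post([&log, url] { log.push_back("connected " + url); });
    }

    if (end < urls.size()) {
        // Yield to the loop instead of continuing in this call.
        loop.post([&loop, &urls, end, batchSize, &log] {
            addStreamsInBatches(loop, urls, end, batchSize, log);
        });
    }
}
```

With three URLs and a batch size of 2, the first two streams get their "connected" callbacks processed before the third stream is even added, which is the behaviour I'm hoping would shorten the required timeout.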
I haven't closely monitored a long-lived process yet, but it may not be a bottleneck once the streams are connected, since events are not frequent enough to block much. The very long timeout, though, might mean that a stream could go down for many minutes without reconnecting quickly and miss some events. This isn't an issue, as more relays support syncing with negentropy and strfry sync can catch those events.
I've attempted a workaround using many separate strfry router processes, but I ran into a "too many open files" error. I'm assuming this comes from LMDB. Perhaps this could be worked around by raising that maximum in the OS? I think I was trying to start about 190 routers, one for each pubkey that I follow, each with multiple relays in one stream.
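For reference, a minimal sketch of checking and raising the open-file limit from inside a process, using the POSIX `getrlimit`/`setrlimit` calls. It assumes the per-process descriptor limit (`RLIMIT_NOFILE`) is the one being hit rather than a system-wide limit, which I haven't confirmed. `raiseOpenFileLimit` is a hypothetical helper, not something that exists in strfry.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_NOFILE (POSIX)

// Hypothetical helper: raises the soft open-file limit to `wanted`,
// capped at the hard limit. Never lowers the current soft limit.
// Returns the soft limit in effect afterwards, or 0 on failure.
rlim_t raiseOpenFileLimit(rlim_t wanted) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return 0;

    // An unprivileged process cannot exceed the hard limit.
    rlim_t target = wanted;
    if (lim.rlim_max != RLIM_INFINITY && target > lim.rlim_max) {
        target = lim.rlim_max;
    }

    if (target > lim.rlim_cur) {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0) return 0;
    }
    return lim.rlim_cur;
}
```

Since the limit is inherited by child processes, running `ulimit -n` in the shell that launches the routers should show, and can raise, the same soft limit without touching any code.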