Too many threads
I see the following architecture problem in Zeno: Each component runs on a different thread and launches several sub-threads which also launch several sub-threads.
I know that golang threads have very little performance impact but does it make sense to have such a complicated design? Maybe we could simplify a bit without losing performance to help maintainability.
For example, let's look at HQ:
- HQ is one of the components running on a separate thread.
- Two of its functions, `producer()` and `producerReceiver()`, run on their own sub-threads. They have their own channels and their own batching, with a configurable number of workers via `getMaxProducerSenders()` and `getProducerBatchSize()`. https://github.com/internetarchive/Zeno/blob/main/internal/pkg/source/hq/producer.go
- Moreover, they run a separate thread for each batch. https://github.com/internetarchive/Zeno/blob/main/internal/pkg/source/hq/producer.go#L153
Why don't we run the batches on the same thread? Since it's a 3rd-level thread, it wouldn't block anything. Also, there is buffering in the channels between the components.
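A minimal sketch of the idea, not Zeno's actual code: one goroutine reads from a buffered channel, groups items into batches, and sends each batch inline instead of starting a new goroutine per batch. The names `runBatches` and `sendBatch` are hypothetical, and the sketch assumes that sending a batch does not need to overlap with collecting the next one.

```go
package main

import "fmt"

// sendBatch is a hypothetical stand-in for the call that ships a batch to HQ.
// It must not keep a reference to the slice, because the caller reuses it.
func sendBatch(batch []string) int {
	return len(batch)
}

// runBatches reads items from in, groups them into batches of batchSize and
// sends every batch on the calling goroutine. It returns how many batches
// and how many items were sent once in is closed.
func runBatches(in <-chan string, batchSize int) (batches, items int) {
	batch := make([]string, 0, batchSize)
	for item := range in {
		batch = append(batch, item)
		if len(batch) == batchSize {
			items += sendBatch(batch) // inline, no extra goroutine
			batches++
			batch = batch[:0] // reuse the backing array
		}
	}
	// Flush the last, possibly partial, batch.
	if len(batch) > 0 {
		items += sendBatch(batch)
		batches++
	}
	return batches, items
}

func main() {
	// The buffered channel absorbs bursts while a batch is being sent.
	in := make(chan string, 4)
	go func() {
		defer close(in)
		for i := 0; i < 10; i++ {
			in <- fmt.Sprintf("url-%d", i)
		}
	}()

	batches, items := runBatches(in, 4)
	fmt.Println(batches, items) // prints "3 10"
}
```

If sending a batch is slow enough to fill the channel buffer, a per-batch goroutine (or a small fixed pool) may still be justified; the sketch only shows what the single-goroutine version would look like.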
Another example is the Finisher:
- Finisher runs on a separate thread.
- Two of its functions, `finisherReceiver` and `finisherDispatcher`, run on their own sub-threads. `finisherDispatcher` runs `finisherSender` in separate sub-threads.
Do we need the 3rd level of sub-threads?
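One possible way to drop the third level, sketched under the assumption that the dispatcher only forwards items to senders: a fixed number of worker goroutines read directly from the input channel, so no dispatcher goroutine sits in between. The names `runFinishers` and `finish` are hypothetical and do not come from Zeno.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// finish is a hypothetical stand-in for the per-item finishing work.
func finish(item string) {}

// runFinishers starts a fixed number of workers that all read from in.
// It blocks until in is closed and drained, then returns the number of
// items handled.
func runFinishers(in <-chan string, workers int) int64 {
	var handled int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls items itself; the channel does the dispatching.
			for item := range in {
				finish(item)
				atomic.AddInt64(&handled, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&handled)
}

func main() {
	in := make(chan string, 8)
	for i := 0; i < 5; i++ {
		in <- fmt.Sprintf("item-%d", i)
	}
	close(in)

	fmt.Println(runFinishers(in, 3)) // prints "5"
}
```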
Thank you.
I get what you're saying. Maybe there are some places where we overuse goroutines, but in general, as you said, they are so cheap that unless a particular use is clearly unreasonable, it's not a big deal. I'm always fully open to changes though, especially when it comes to simplifying the code.
@NGTmeaty thoughts? (just trying to know if we should keep this discussion going or close this issue, or create more specific/detailed issues)
Sorry, we talked about this one off GitHub. I concur that goroutines are very lightweight and likely do not add any performance penalty. In general, we should try to simplify the code where possible, but I'm not sure a specific audit of goroutines is necessary. That being said, if there are specific cases (HQ, Finisher) where an excessive number of goroutines are being created, I think it would be a good idea to audit those and fix them where necessary.