
Too many threads

Open vbanos opened this issue 3 months ago • 3 comments

I see the following architecture problem in Zeno: Each component runs on a different thread and launches several sub-threads which also launch several sub-threads.

I know that golang threads have very little performance impact, but does it make sense to have such a complicated design? Maybe we could simplify a bit, without losing performance, to help maintainability.

For example, let's look at HQ:

  1. HQ is one of the components running on a separate thread.
  2. Two of its functions, producer() and producerReceiver(), run on their own sub-threads. They have their own channels and their own batching, with a configurable number of workers via getMaxProducerSenders() and getProducerBatchSize(). https://github.com/internetarchive/Zeno/blob/main/internal/pkg/source/hq/producer.go
  3. Moreover, they run a separate thread for each batch. https://github.com/internetarchive/Zeno/blob/main/internal/pkg/source/hq/producer.go#L153

Why don't we run batches on the same thread? Since it's a 3rd-level thread, it wouldn't block anything. Also, there is buffering in the channels between the components.
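To make the question concrete, here is a minimal sketch of the two options, not Zeno's actual code. The names `sendBatch`, `perBatchGoroutines`, `inline` and `feed` are hypothetical and exist only for this illustration; the real producer in `producer.go` may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// sendBatch is a hypothetical stand-in for whatever sends one batch to HQ.
// It returns the number of items it handled.
func sendBatch(batch []string) int {
	return len(batch)
}

// perBatchGoroutines starts one goroutine per batch (the 3rd level
// questioned above) and needs a WaitGroup and a mutex to collect results.
func perBatchGoroutines(batches <-chan []string) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for batch := range batches {
		wg.Add(1)
		go func(b []string) {
			defer wg.Done()
			n := sendBatch(b)
			mu.Lock()
			total += n
			mu.Unlock()
		}(batch)
	}
	wg.Wait()
	return total
}

// inline handles every batch on the calling goroutine. The buffered
// channel feeding it absorbs bursts while a batch is being sent.
func inline(batches <-chan []string) int {
	total := 0
	for batch := range batches {
		total += sendBatch(batch)
	}
	return total
}

// feed loads the batches into a buffered channel and closes it.
func feed(items [][]string) <-chan []string {
	ch := make(chan []string, len(items))
	for _, b := range items {
		ch <- b
	}
	close(ch)
	return ch
}

func main() {
	items := [][]string{{"a", "b"}, {"c"}, {"d", "e", "f"}}
	fmt.Println(perBatchGoroutines(feed(items)), inline(feed(items)))
}
```

The inline version drops the synchronization entirely, at the cost that one slow send delays the next batch handled by that worker. Whether that matters depends on how many workers already run in parallel.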

Another example is the Finisher:

  1. Finisher runs on a separate thread.
  2. Two of its functions finisherReceiver and finisherDispatcher run on their own sub-threads.
  3. finisherDispatcher runs finisherSender in separate sub-threads. Do we need the 3rd level of sub-threads?

Thank you.

vbanos avatar Sep 10 '25 08:09 vbanos

I get what you're saying, maybe there are some places where we overuse goroutines, but in general, as you said, they are so inexpensive that unless it's really stupid to do so, it's not a big deal. I'm always fully open to changes though, especially when it comes to simplifying the code.

CorentinB avatar Sep 10 '25 11:09 CorentinB

@NGTmeaty thoughts? (just trying to know if we should keep this discussion going or close this issue, or create more specific/detailed issues)

CorentinB avatar Sep 23 '25 08:09 CorentinB

Sorry, we talked off GitHub for this one. I concur that goroutines are very lightweight and likely do not incur any performance penalty. In general, we should try to simplify the code where possible, but I'm not sure a specific audit on goroutines is necessary. That being said, if there were specific cases (HQ, Finisher) where an excessive number of goroutines were being created, I think it would be a good idea to audit those and fix where necessary.

NGTmeaty avatar Sep 23 '25 23:09 NGTmeaty
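For anyone picking up the audit suggested above, one possible starting point is to measure how many goroutines a component adds while it runs, using `runtime.NumGoroutine` from the standard library. This is only a sketch: `spawnWorkers` and `goroutineDelta` are hypothetical helpers standing in for starting a real component such as HQ or the Finisher, and nothing here comes from the Zeno codebase.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// spawnWorkers is a hypothetical stand-in for starting a component.
// It launches n goroutines that block until stop is closed, and
// returns only once all of them are running.
func spawnWorkers(n int, stop <-chan struct{}, done *sync.WaitGroup) {
	var started sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-stop
		}()
	}
	started.Wait()
}

// goroutineDelta reports how many goroutines exist while the workers
// run, relative to the count taken just before they were started.
func goroutineDelta(n int) int {
	before := runtime.NumGoroutine()
	stop := make(chan struct{})
	var done sync.WaitGroup
	spawnWorkers(n, stop, &done)
	during := runtime.NumGoroutine()

	close(stop)
	done.Wait()
	// Give exiting goroutines a moment to disappear from the count,
	// so a later measurement starts from a clean baseline.
	for i := 0; i < 1000 && runtime.NumGoroutine() > before; i++ {
		time.Sleep(time.Millisecond)
	}
	return during - before
}

func main() {
	fmt.Println("goroutines added:", goroutineDelta(5))
}
```

The number is only meaningful if nothing else starts or stops goroutines during the measurement, so it is best taken around one component at a time.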