opentelemetry-collector
[scraperhelper] Can't run scrapers in parallel
Component(s)
scraper/scraperhelper
Describe the issue you're reporting
As reported in this issue for the sqlqueryreceiver, the scraperhelper's controller always runs scrapers in series. In the case of the sqlqueryreceiver this means that each query is run sequentially rather than leveraging the connection pool options and running in parallel.
I think there are a few options for how to improve the behavior.
- No change to scraperhelper. To get the benefits of a connection pool, for example, that logic would need to be embedded inside a single scraper instead of the current sqlqueryreceiver pattern of one scraper per query.
- Always parallelize scrapers. This could cause problems where scrapers conflict with each other. However, it is the behavior I would expect as a user, especially if the scraperhelper package continues to evolve (some examples in https://github.com/open-telemetry/opentelemetry-collector/issues/11238).
- Configurable parameter in the scraper controller to run scrapers in parallel. This gives the benefits of parallelism without potentially breaking existing uses of the package. I think parallel should be the default, but if we don't want to change existing behavior it could be opt-in.
- Configurable parameter(s) in the scraper definition to define whether an individual scraper should be run in parallel. This feels excessive to me, but it allows for the case where some scrapers must run exclusively of each other. We could get deep in the weeds here with marking dependencies/conflicts, dividing scrapers into sets that can run in parallel, etc., if that's something we want to support.
I'm in favor of changing the behavior to always parallelize (the second option above); however, existing uses of the scraper packages will need to be evaluated to be sure this is safe.