seatunnel
[Feature][zeta][monitoring] Support exposing monitoring metrics by prometheus exporter protocol (#5070)
PR for #5070. The final effect: https://github.com/kim-up/incubator-seatunnel/blob/telemetry/docs/en/seatunnel-engine/telemetry.md
Overall Design
Node Hazelcast Metrics
Getting metrics from ManagedExecutorServiceMBean
- systemExecutorMBean
- asyncExecutorMBean
- scheduledExecutorMBean
- clientExecutorMBean
- clientQueryExecutorMBean
- clientBlockingExecutorMBean
- queryExecutorMBean
- ioExecutorMBean
- offloadableExecutorMBean
- partitionServiceMBean
Including: executor_executedCount, executor_isShutdown, executor_maxPoolSize, executor_poolSize, executor_queueRemainingCapacity, executor_queueSize
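As a rough illustration of turning MBean attributes into gauge values, here is a minimal sketch that reads attributes through the standard JMX `MBeanServer` API. This is an assumption made for the example: the PR may read the Hazelcast `ManagedExecutorServiceMBean` objects directly instead. The class name `MBeanAttributeSketch` and the method `read` are hypothetical. Boolean attributes such as `isShutdown` are mapped to 0/1, since Prometheus samples are numeric.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical helper, not a class from this PR. */
final class MBeanAttributeSketch {

    /**
     * Reads the named attributes of one MBean and converts them to doubles.
     * Numbers are kept as-is, booleans become 1.0 / 0.0, anything else is skipped.
     */
    static Map<String, Double> read(MBeanServer server, String objectName, String... attributes) {
        Map<String, Double> values = new LinkedHashMap<>();
        try {
            ObjectName name = new ObjectName(objectName);
            for (String attribute : attributes) {
                Object raw = server.getAttribute(name, attribute);
                if (raw instanceof Number) {
                    values.put(attribute, ((Number) raw).doubleValue());
                } else if (raw instanceof Boolean) {
                    values.put(attribute, ((Boolean) raw) ? 1.0 : 0.0);
                }
            }
        } catch (JMException e) {
            // Covers malformed names, missing MBeans and missing attributes.
            throw new IllegalStateException("Cannot read MBean " + objectName, e);
        }
        return values;
    }
}
```

For a quick local check without Hazelcast, the same method can be pointed at a JDK MBean, for example `read(ManagementFactory.getPlatformMBeanServer(), "java.lang:type=Threading", "ThreadCount")`.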
Zeta Thread Pool Metrics
Getting metrics from CoordinatorService#(ThreadPoolExecutor) executorService
Including: pool_activeCount, pool_corePoolSize, pool_maximumPoolSize, pool_poolSize, pool_completedTask_total, pool_task_total
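The pool metrics listed above map naturally onto the getters of `java.util.concurrent.ThreadPoolExecutor`. The sketch below shows one possible mapping and renders the result as plain `name value` lines in the Prometheus text format. The class `ThreadPoolMetricsSketch` is hypothetical, and the pairing of each metric name with a getter is an assumption, not taken from the PR code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadPoolExecutor;

/** Hypothetical helper, not the collector class from this PR. */
final class ThreadPoolMetricsSketch {

    /** Takes a point-in-time snapshot of the pool counters. */
    static Map<String, Long> snapshot(ThreadPoolExecutor pool) {
        Map<String, Long> metrics = new LinkedHashMap<>();
        metrics.put("pool_activeCount", (long) pool.getActiveCount());
        metrics.put("pool_corePoolSize", (long) pool.getCorePoolSize());
        metrics.put("pool_maximumPoolSize", (long) pool.getMaximumPoolSize());
        metrics.put("pool_poolSize", (long) pool.getPoolSize());
        // The two getters below return approximate totals over the pool's lifetime.
        metrics.put("pool_completedTask_total", pool.getCompletedTaskCount());
        metrics.put("pool_task_total", pool.getTaskCount());
        return metrics;
    }

    /** Renders each entry as one "name value" line. */
    static String render(Map<String, Long> metrics) {
        StringBuilder sb = new StringBuilder();
        metrics.forEach((name, value) -> sb.append(name).append(' ').append(value).append('\n'));
        return sb.toString();
    }
}
```

A real exporter would also attach labels (for example the node address) and `# TYPE` lines; they are left out here to keep the sketch short.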
Zeta Job Metrics
Getting metrics from CoordinatorService#(IMap<Object, Object>)runningJobInfoIMap
Including, job count in various states
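Counting jobs per state amounts to a group-by over the values of the running-job map. The sketch below uses a plain `java.util.Map` and a small enum as stand-ins. The enum `JobState`, the metric name `job_count`, and the label `type` are all hypothetical. The real `runningJobInfoIMap` is an `IMap<Object, Object>` whose value type is not shown in this description.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not a class from this PR. */
final class JobStateCountSketch {

    /** Illustrative subset of job states; the real enum lives in SeaTunnel. */
    enum JobState { RUNNING, FINISHED, FAILED, CANCELED }

    /** Counts jobs per state, reporting 0 for states with no jobs. */
    static Map<JobState, Integer> countByState(Map<Long, JobState> jobs) {
        Map<JobState, Integer> counts = new EnumMap<>(JobState.class);
        for (JobState state : JobState.values()) {
            counts.put(state, 0);
        }
        for (JobState state : jobs.values()) {
            counts.merge(state, 1, Integer::sum);
        }
        return counts;
    }

    /** Renders one labelled sample per state in the Prometheus text format. */
    static String render(Map<JobState, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        counts.forEach((state, n) -> sb.append("job_count{type=\"")
                .append(state.name().toLowerCase(Locale.ROOT))
                .append("\"} ")
                .append(n)
                .append('\n'));
        return sb.toString();
    }
}
```

Emitting explicit zeros keeps each series present between scrapes, which tends to make dashboards and alert rules simpler.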
Node Jvm Metrics
Getting metrics from io.prometheus.client.hotspot.DefaultExports
Including: Memory, BufferPools, GarbageCollector, Thread, ClassLoading, VersionInfo
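With the Prometheus Java simpleclient, these collectors are normally registered by a single call to `DefaultExports.initialize()`, which needs the `simpleclient_hotspot` dependency. To show what kind of data sits behind them, here is a JDK-only sketch that reads the same platform MXBeans directly. The class `JvmMetricsSketch` and the map keys are hypothetical and are not the metric names the Prometheus library emits.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the PR itself relies on DefaultExports instead. */
final class JvmMetricsSketch {

    /** Reads a few JVM figures from the platform MXBeans. */
    static Map<String, Long> snapshot() {
        Map<String, Long> metrics = new LinkedHashMap<>();
        metrics.put("heap_used_bytes",
                ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed());
        metrics.put("threads_current",
                (long) ManagementFactory.getThreadMXBean().getThreadCount());
        metrics.put("classes_loaded",
                (long) ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());

        long collections = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                collections += count;
            }
        }
        metrics.put("gc_collections_total", collections);
        return metrics;
    }
}
```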
Cluster metrics
Getting metrics from ClusterService
Including: cluster_info, cluster_time
Purpose of this pull request
Check list
- [x] Changed code is covered with tests, or it does not need tests for reason:
- [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the New License Guide
- [x] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
- [ ] If you are contributing connector code, please check that the following files are updated:
  - Update the change log in the connector document. For more details, refer to connector-v2
  - Update plugin-mapping.properties and add the new connector information to it
  - Update the pom file of seatunnel-dist
- [x] Update the release-note.
Good job.
The document has some dead links; please check.
@ic4y @hailin0 @Hisoka-X @EricJoy2048
This feature is not small; we may need to discuss its further development.
@TyrantLucifer
Maybe we can mark it as unstable, so we can change it after release. This is just an option I am offering, not a final decision.
+1 Good for quick experiments to improve it
@liugddx @TyrantLucifer @Hisoka-X @hailin0 @EricJoy2048 PTAL
Can we add the slowoperation metric? It's quite valuable for troubleshooting. You can refer to the Hazelcast Management Center for guidance.
Good idea! I will research it later. After the current PR is merged, create a new PR to improve it, because the current PR is relatively large.
@TyrantLucifer @kim-up
We need this feature. Is it planned to be merged?
Thanks
@kim-up Good PR, please fix the conflict.
Please merge the dev branch and wait for CI to run successfully @kim-up
@TyrantLucifer @kim-up @hailin0 We need this feature too. Is it planned to be merged? Thanks
@kim-up we’re excited to see this merged. Thank you for your contribution.
We should continue pushing this PR. @kim-up Could you fix the conflict? cc @EricJoy2048 @hailin0
@kim-up I intend to continue this feature of yours, as it has been too long since the last update. I will create a new branch and port your code to implement the necessary changes. Please be aware of this; I am looking forward to cooperating with you. Thanks! :>