flink-cdc icon indicating copy to clipboard operation
flink-cdc copied to clipboard

[metric] fix flink cdc metric conflict with flink built in metric

Open zhmin opened this issue 3 years ago • 7 comments

CurrentEmitEventTimeLag and currentFetchEventTimeLag metrics are both registered in flink cdc and flink framework.Flink MetricRegistryImpl is responsible for metric registering, when the metric with same name is registered twice, only the first will be successfully.

Flink cdc metric register after flink framework, so flink cdc metric will be ignored. The following log proves it: 2022-07-29 11:41:58.031 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'currentEmitEventTimeLag'.Metric will not be reported.

To deal with the conflict, i suggest add cdc prefix to metric name.

zhmin avatar Jul 29 '22 03:07 zhmin

Thanks for your work , @zhmin . I run the test MySqlConnectorITCase#testConsumingAllEvents in debug mode. It seems that both sourceIdleTime and currentEmitEventTimeLag will be ignored, which are registered in the parent metric group. But the currentFetchEventTimeLag metric is not ignored when I run the test. The code in Flink lies in the InternalSourceReaderMetricGroup.java. 截屏2022-07-29 下午5 50 17

I don't think renaming is the best way to fix this. Maybe we need a new own metric group for our metrics. Or we should use the metrics in the parent metric group. @leonardBang What do you think ?

ruanhang1993 avatar Jul 29 '22 09:07 ruanhang1993

yes, you are right! Only sourceIdleTime and currentEmitEventTimeLag metric will be ignored. currentFetchEventTimeLag metric is not ignored , the comment what i copy is mistake. I have correct it ! If we don't rename metric, creating sub metric group for cdc metric, is better idea. The code we modifide is less, just call addGroup method to create new metric group.

MetricGroup taskManagerMetricGroup= (MetricGroup) metricGroupMethod.invoke(readerContext);
final MetricGroup metricGroup = taskManagerMetricGroup.addGroup("cdc")

zhmin avatar Jul 29 '22 11:07 zhmin

CurrentEmitEventTimeLag and currentFetchEventTimeLag metrics are both registered in flink cdc and flink framework.Flink MetricRegistryImpl is responsible for metric registering, when the metric with same name is registered twice, only the first will be successfully.

Flink cdc metric register after flink framework, so flink cdc metric will be ignored. The following log proves it: 2022-07-29 11:41:58.031 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'currentEmitEventTimeLag'.Metric will not be reported.

To deal with the conflict, i suggest add cdc prefix to metric name.

I encountered a similar problem, I don't know if it is yours, my error is Name collision: Group already contains a Metric with the name 'currentEmitEventTimeLag'. Metric will not be reported。 I found in debugging that the problem is on the mysql connector,,Adjust mysql-connect to version before 8.0.26 and no longer report this error。 Uploading 97333CCC-8720-4D69-83FC-4E431511CEE1.png…

ai-smalleryu avatar Oct 10 '22 10:10 ai-smalleryu

Meet the same situation, Is this PR still in progress? willing to complete the subsequent development work If need.

lvyanquan avatar Jul 16 '23 15:07 lvyanquan

@ruanhang1993 @lvyanquan can anyone help with code review?

zhmin avatar Jul 22 '23 08:07 zhmin

It works for me.

lvyanquan avatar Jul 25 '23 01:07 lvyanquan

Hi @zhmin, thanks for your contribution! Seems this issue has been fixed in #2415, so I'm going to close this PR. Feel free to reopen it if you have other concerns.

yuxiqian avatar Apr 25 '24 07:04 yuxiqian