hudi
hudi copied to clipboard
[HUDI-7395] Fix computation for metrics in HoodieMetadataMetrics
Change Logs
For some of the metrics type like duration we are using incrementMetric instead of setMetric. Also some of the redundant metrics are removed. For example a count type metric has both count and duration metric getting pushed even though duration is not calculated. File lookup count metric is added for bloom filter and column stat
Impact
NA
Risk level (write none, low medium or high below)
low
Documentation Update
NA
Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
The PR is rebased on top of HUDI-7391
I guess this is stacked ontop of 10635. Can you add a link to PR description to the actual diff to review for this patch(ignoring the stacked PR changed). If I am not wrong, https://github.com/apache/hudi/pull/10641/files/adc183a351b8f15d671c0c6eefd1f999bed54774..fc072259ead8a9870a1b26b5aceb7882aabebb32 is the right link (last 2 commits).
reviewed last 2 commits. LGTM
hey @prashantwason : lets de-couple the fixes. a. Fixing MDT to emit writer side metrics(commit duration, compaction duration etc) b. Fixing MDT to emit reader side metrics (col stats look up duration etc) during distributed registry.
I feel we should focus on (a) in this patch and get it landed. and you can put out a patch (I assume you folks already have a fix) for distributed registry based metrics from the executor.
If you are aligned on that, let us know if you have any feedback on this patch. or if we are good to go ahead.
Sure. I can put out a patch for b. later. If you have tested the writer side metrics emitted as per this change for MDT then this is good to go.
On Mon, Apr 8, 2024 at 8:50 PM Sivabalan Narayanan @.***> wrote:
hey @prashantwason https://github.com/prashantwason : lets de-couple the fixes. a. Fixing MDT to emit writer side metrics(commit duration, compaction duration etc) b. Fixing MDT to emit reader side metrics (col stats look up duration etc) during distributed registry.
I feel we should focus on (a) in this patch and get it landed. and you can put out a patch (I assume you folks already have a fix) for distributed registry based metrics from the executor.
If you are aligned on that, let us know if you have any feedback on this patch. or if we are good to go ahead.
— Reply to this email directly, view it on GitHub https://github.com/apache/hudi/pull/10641#issuecomment-2044100016, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN55SS3G7RSP3O7GKEUIZEDY4NQPJAVCNFSM6AAAAABC7SK2ZOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBUGEYDAMBRGY . You are receiving this because you were mentioned.Message ID: @.***>
CI report:
- b86a5261475a9a24449bab2be28ced124587e242 Azure: SUCCESS
Bot commands
@hudi-bot supports the following commands:@hudi-bot run azurere-run the last Azure build
@yihua can we merge this PR? The CI is green now.