hudi icon indicating copy to clipboard operation
hudi copied to clipboard

[HUDI-7395] Fix computation for metrics in HoodieMetadataMetrics

Open lokeshj1703 opened this issue 1 year ago • 6 comments

Change Logs

For some of the metrics type like duration we are using incrementMetric instead of setMetric. Also some of the redundant metrics are removed. For example a count type metric has both count and duration metric getting pushed even though duration is not calculated. File lookup count metric is added for bloom filter and column stat

Impact

NA

Risk level (write none, low medium or high below)

low

Documentation Update

NA

Contributor's checklist

  • [ ] Read through contributor's guide
  • [ ] Change Logs and Impact were stated clearly
  • [ ] Adequate tests were added if applicable
  • [ ] CI passed

lokeshj1703 avatar Feb 08 '24 11:02 lokeshj1703

The PR is rebased on top of HUDI-7391

lokeshj1703 avatar Feb 08 '24 12:02 lokeshj1703

I guess this is stacked ontop of 10635. Can you add a link to PR description to the actual diff to review for this patch(ignoring the stacked PR changed). If I am not wrong, https://github.com/apache/hudi/pull/10641/files/adc183a351b8f15d671c0c6eefd1f999bed54774..fc072259ead8a9870a1b26b5aceb7882aabebb32 is the right link (last 2 commits).

nsivabalan avatar Apr 07 '24 14:04 nsivabalan

reviewed last 2 commits. LGTM

nsivabalan avatar Apr 07 '24 14:04 nsivabalan

hey @prashantwason : lets de-couple the fixes. a. Fixing MDT to emit writer side metrics(commit duration, compaction duration etc) b. Fixing MDT to emit reader side metrics (col stats look up duration etc) during distributed registry.

I feel we should focus on (a) in this patch and get it landed. and you can put out a patch (I assume you folks already have a fix) for distributed registry based metrics from the executor.

If you are aligned on that, let us know if you have any feedback on this patch. or if we are good to go ahead.

nsivabalan avatar Apr 09 '24 03:04 nsivabalan

Sure. I can put out a patch for b. later. If you have tested the writer side metrics emitted as per this change for MDT then this is good to go.

On Mon, Apr 8, 2024 at 8:50 PM Sivabalan Narayanan @.***> wrote:

hey @prashantwason https://github.com/prashantwason : lets de-couple the fixes. a. Fixing MDT to emit writer side metrics(commit duration, compaction duration etc) b. Fixing MDT to emit reader side metrics (col stats look up duration etc) during distributed registry.

I feel we should focus on (a) in this patch and get it landed. and you can put out a patch (I assume you folks already have a fix) for distributed registry based metrics from the executor.

If you are aligned on that, let us know if you have any feedback on this patch. or if we are good to go ahead.

— Reply to this email directly, view it on GitHub https://github.com/apache/hudi/pull/10641#issuecomment-2044100016, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN55SS3G7RSP3O7GKEUIZEDY4NQPJAVCNFSM6AAAAABC7SK2ZOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBUGEYDAMBRGY . You are receiving this because you were mentioned.Message ID: @.***>

prashantwason avatar Apr 09 '24 20:04 prashantwason

CI report:

  • b86a5261475a9a24449bab2be28ced124587e242 Azure: SUCCESS
Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

hudi-bot avatar May 28 '24 20:05 hudi-bot

@yihua can we merge this PR? The CI is green now.

lokeshj1703 avatar May 29 '24 05:05 lokeshj1703