iceberg icon indicating copy to clipboard operation
iceberg copied to clipboard

Core: Add metrics reporter for serializable table

Open wangyinsheng opened this issue 2 years ago • 18 comments

The metrics reporter will be lost when a table be sent to other nodes in a cluster through serializable table, This mr rebuild the metrics reporter when the table is deserialized

wangyinsheng avatar Mar 19 '23 11:03 wangyinsheng

PTAL @nastra

wangyinsheng avatar Mar 20 '23 09:03 wangyinsheng

PTAL @nastra

wangyinsheng avatar Mar 28 '23 00:03 wangyinsheng

I did take another look and I don't think we need the changes in TestTables, because the tests will pass without those.

you're right, it not necessary. I used to compare the metadata between origin table and serializable table, then it will be needed. I have removed the changes, PTAL @nastra

wangyinsheng avatar Mar 28 '23 14:03 wangyinsheng

PTAL @aokolnychyi

wangyinsheng avatar Mar 30 '23 00:03 wangyinsheng

PTAL @aokolnychyi

wangyinsheng avatar Mar 31 '23 00:03 wangyinsheng

could you please take a look this pr, it has last for weeks @aokolnychyi

wangyinsheng avatar Apr 03 '23 13:04 wangyinsheng

@nastra Could you please approval the CI ? And is there someone else can review this pr?

wangyinsheng avatar Apr 04 '23 10:04 wangyinsheng

@aokolnychyi @danielcweeks could you take a look please ? thanks

wangyinsheng avatar Apr 10 '23 08:04 wangyinsheng

@wangyinsheng to fix CI failures I think you need to add the below code to TransactionTable:

    @Override
    public MetricsReporter metricsReporter() {
      return BaseTransaction.this.reporter;
    }

and then also this to BaseMetadataTable:

  @Override
  public MetricsReporter metricsReporter() {
    return table().metricsReporter();
  }

nastra avatar Apr 12 '23 09:04 nastra

@wangyinsheng I talked to @rdblue yesterday and we concluded that we don't want to carry over the catalog properties just for the metrics reporter. The better approach would be to make MetricsReporter extend Serializable. I opened https://github.com/wangyinsheng/iceberg/pull/1 against your branch to do that. Feel free to squash it into your changes

nastra avatar Apr 13 '23 11:04 nastra

@nastra I am confused why should't carry over the catalog properties for metrics reporter? We already support to config metrics report through catalog properites. And metrics reporter may contains non-serializable object, KafakProducer for example. We need properties to recreate non-serializable object after table been deserialized

wangyinsheng avatar Apr 14 '23 01:04 wangyinsheng

please take a look @nastra @rdblue @aokolnychyi @danielcweeks

wangyinsheng avatar Apr 21 '23 02:04 wangyinsheng

please take a look @nastra @rdblue @aokolnychyi @danielcweeks

wangyinsheng avatar Apr 24 '23 09:04 wangyinsheng

@wangyinsheng as mentioned previously, we still would like to make the MetricsReporter Serializable. For things that aren't Serializable in a metrics reporter, we would make them transient and then lazily initialize them

nastra avatar Apr 24 '23 10:04 nastra

@nastra, where are we with the fix for serialization here?

rdblue avatar Jun 04 '23 18:06 rdblue

@nastra, where are we with the fix for serialization here?

@rdblue I haven't heard back from @wangyinsheng after my comment above , so I've opened https://github.com/apache/iceberg/pull/7370 a while ago for initial experimentation and then updated it to fix the serialization issue.

nastra avatar Jun 06 '23 07:06 nastra

@nastra, where are we with the fix for serialization here?

@rdblue I haven't heard back from @wangyinsheng after my comment above , so I've opened #7370 a while ago for initial experimentation and then updated it to fix the serialization issue.

@nastra @rdblue Sorry,I am working on other things for a while. But I am still keep my opinion, I think the SerializableTable is the best place to handle the lazily initialization stuff for MetricsReporter, we should not just make MetricsReporter serializable, which will throw the lazily initialization stuff to any implemention class of MetricsReporter

wangyinsheng avatar Jun 12 '23 07:06 wangyinsheng

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Aug 28 '24 00:08 github-actions[bot]

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions[bot] avatar Sep 05 '24 00:09 github-actions[bot]