[Improvement]: Avoid calling getMixedTablePartitionSpecById in the loop

Open 7hong opened this issue 1 year ago • 0 comments

Search before asking

  • [X] I have searched in the issues and found no similar issues.

What would you like to be improved?

In the Optimizing Plan phase, the PartitionSpec of each file has to be obtained. However, calling getMixedTablePartitionSpecById inside the loop is an expensive and unnecessary operation.

https://github.com/apache/amoro/blob/678b43c85347eb69d61e4f7ca016cb63d2ae56e4/amoro-ams/src/main/java/org/apache/amoro/server/optimizing/plan/OptimizingEvaluator.java#L119-L124

Especially in environments where Kerberos is enabled, repeatedly calling org.apache.iceberg.BaseTable#specs can incur lock overhead.

How should we improve?

I would like to move the code that obtains the PartitionSpec into the TableFileScanHelper interface.

...
  PartitionSpec partitionSpec = tableFileScanHelper.getSpec(fileScanResult.file().specId());
...
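
One way such a getSpec lookup could avoid the repeated call is to load the spec map once and answer every later request from memory. The following is only a minimal sketch under assumptions: CachedSpecLookup, its loader parameter, and the generic type S (standing in for PartitionSpec) are hypothetical names for illustration and are not part of Amoro or Iceberg. In a real implementation the loader would presumably wrap the table's spec map.

```java
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not an Amoro class): loads the specId -> spec map
 * once, then serves every lookup from the in-memory copy.
 * S stands in for the partition spec type.
 */
final class CachedSpecLookup<S> {
  private final Supplier<Map<Integer, S>> loader; // assumed to be the expensive call
  private Map<Integer, S> specs;                  // populated on first use

  CachedSpecLookup(Supplier<Map<Integer, S>> loader) {
    this.loader = loader;
  }

  /** Returns the spec for the given id; the loader runs at most once. */
  synchronized S getSpec(int specId) {
    if (specs == null) {
      specs = Map.copyOf(loader.get()); // immutable snapshot of the loaded map
    }
    S spec = specs.get(specId);
    if (spec == null) {
      throw new IllegalArgumentException("Unknown partition spec id: " + specId);
    }
    return spec;
  }
}
```

With this shape, the loop in the plan phase pays the loading cost once per scan rather than once per file. The snapshot would go stale if the table's specs change during a scan, so the helper is assumed to live only as long as a single planning pass.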

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

7hong avatar Oct 22 '24 07:10 7hong