amoro
[Improvement]: Avoid calling getMixedTablePartitionSpecById in the loop
Search before asking
- [X] I have searched in the issues and found no similar issues.
What would you like to be improved?
In the Optimizing Plan phase, the PartitionSpec of each file has to be obtained. However, calling getMixedTablePartitionSpecById inside the loop is expensive and unnecessary.
https://github.com/apache/amoro/blob/678b43c85347eb69d61e4f7ca016cb63d2ae56e4/amoro-ams/src/main/java/org/apache/amoro/server/optimizing/plan/OptimizingEvaluator.java#L119-L124
Especially in environments where Kerberos is enabled, repeatedly calling org.apache.iceberg.BaseTable#specs can incur lock overhead.
How should we improve?
I want to place the code for obtaining the PartitionSpec inside the TableFileScanHelper interface.
...
PartitionSpec partitionSpec = tableFileScanHelper.getSpec(fileScanResult.file().specId());
...
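One possible way to back a getSpec(specId) call is to memoize lookups by spec ID, so the expensive lookup runs once per distinct spec rather than once per file. The following is only a sketch under assumptions: Spec is a hypothetical stand-in for Iceberg's PartitionSpec, CachingSpecHelper is a hypothetical stand-in for a TableFileScanHelper implementation, and the loader function stands for whatever the real code uses to resolve a spec. It is not the actual Amoro implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

public class SpecLookupSketch {

  /** Hypothetical stand-in for org.apache.iceberg.PartitionSpec. */
  static final class Spec {
    final int specId;

    Spec(int specId) {
      this.specId = specId;
    }
  }

  /** Hypothetical helper that caches spec lookups by spec ID (not thread-safe). */
  static final class CachingSpecHelper {
    private final IntFunction<Spec> loader; // assumed expensive lookup
    private final Map<Integer, Spec> cache = new HashMap<>();
    private int loadCount = 0;

    CachingSpecHelper(IntFunction<Spec> loader) {
      this.loader = loader;
    }

    /** Returns the cached spec, invoking the loader only on the first request per ID. */
    Spec getSpec(int specId) {
      return cache.computeIfAbsent(specId, id -> {
        loadCount++;
        return loader.apply(id);
      });
    }

    int loadCount() {
      return loadCount;
    }
  }

  /** Simulates the planning loop and returns how many times the loader ran. */
  static int planAndCountLoads(List<Integer> fileSpecIds) {
    CachingSpecHelper helper = new CachingSpecHelper(Spec::new);
    for (int specId : fileSpecIds) {
      Spec spec = helper.getSpec(specId);
      if (spec.specId != specId) {
        throw new IllegalStateException("unexpected spec for id " + specId);
      }
    }
    return helper.loadCount();
  }

  public static void main(String[] args) {
    // Five files that use two distinct spec IDs: prints 2.
    System.out.println(planAndCountLoads(List.of(0, 0, 1, 0, 1)));
  }
}
```

With this shape, the loop body stays as in the snippet above, and the cost of the underlying lookup scales with the number of distinct spec IDs instead of the number of files.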
Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
Subtasks
No response
Code of Conduct
- [X] I agree to follow this project's Code of Conduct