HIVE-27751 Log Query Compilation summary in an accumulated way
What changes were proposed in this pull request?
This PR measures the time spent in the various stages of query compilation and accumulates the measurements into a single summary.
Why are the changes needed?
These changes make it possible to read and collect all compile-time measurements in a single place. This helps when debugging a performance issue in the query compilation phase, and when reporting and comparing results across runs.
Does this PR introduce any user-facing change?
No
Is the change a dependency upgrade?
No
How was this patch tested?
Tested locally by running q files with the config enabled.
It could be a good idea to merge this and https://github.com/apache/hive/pull/4749 .
To test this, set the config hive.compile.print.summary to true in any q file and run the test; the Query Compilation Summary will appear in the logs. One example of the output is below. The order of operations is preserved when the summary is printed:
Query Compilation Summary
----------------------------------------------------------------------------------------------
waitCompile 0 ms
parse 4 ms
getTableConstraints - HS2-cache 69 ms
optimizer - Calcite: Plan generation 257 ms
optimizer - Calcite: Prejoin ordering transformation 20 ms
optimizer - Calcite: Postjoin ordering transformation 24 ms
optimizer 705 ms
optimizer - HiveOpConverterPostProc 0 ms
optimizer - Generator 24 ms
optimizer - PartitionColumnsSeparator 1 ms
optimizer - SyntheticJoinPredicate 2 ms
optimizer - SimplePredicatePushDown 8 ms
optimizer - RedundantDynamicPruningConditionsRemoval 0 ms
optimizer - SortedDynPartitionTimeGranularityOptimizer 2 ms
optimizer - PartitionPruner 3 ms
optimizer - PartitionConditionRemover 2 ms
optimizer - GroupByOptimizer 2 ms
optimizer - ColumnPruner 10 ms
optimizer - CountDistinctRewriteProc 1 ms
optimizer - SamplePruner 1 ms
optimizer - MapJoinProcessor 2 ms
optimizer - BucketingSortingReduceSinkOptimizer 2 ms
optimizer - UnionProcessor 2 ms
optimizer - JoinReorder 0 ms
optimizer - FixedBucketPruningOptimizer 2 ms
optimizer - BucketVersionPopulator 2 ms
optimizer - NonBlockingOpDeDupProc 1 ms
optimizer - IdentityProjectRemover 0 ms
optimizer - LimitPushdownOptimizer 2 ms
optimizer - OrderlessLimitPushDownOptimizer 1 ms
optimizer - StatsOptimizer 0 ms
optimizer - SimpleFetchOptimizer 0 ms
TezCompiler - Run top n key optimization 2 ms
TezCompiler - Setup dynamic partition pruning 3 ms
optimizer - Merge single column semi-join reducers to composite 0 ms
partition-retrieving 1 ms
TezCompiler - Setup stats in the operator plan 78 ms
TezCompiler - Sorted dynamic partition optimization 3 ms
TezCompiler - Reduce Sink de-duplication 4 ms
TezCompiler - Run the optimizations that use stats for optimization 5 ms
TezCompiler - Run reduce sink after join algorithm selection 2 ms
TezCompiler - Run remove dynamic pruning by size 1 ms
TezCompiler - Run cycle analysis for partition pruning 0 ms
TezCompiler - Remove redundant semijoin reduction 1 ms
TezCompiler - Shared scans optimization 13 ms
TezCompiler - markOperatorsWithUnstableRuntimeStats 1 ms
TezCompiler - generateTaskTree 31 ms
TezCompiler - optimizeTaskPlan 156 ms
TezCompiler 323 ms
semanticAnalyze 2628 ms
compile 2633 ms
----------------------------------------------------------------------------------------------
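For readers unfamiliar with the idea, here is a minimal sketch of accumulating per-stage timings and printing them in the order they were first recorded. It is only an illustration, not the code in this patch: the class `CompileSummarySketch` and its methods are hypothetical names, and the use of a `LinkedHashMap` is an assumption about one simple way to preserve order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is NOT the class added by the PR. */
public class CompileSummarySketch {
    // LinkedHashMap iterates in insertion order, so stages are printed
    // in the order they were first recorded.
    private final Map<String, Long> elapsedMs = new LinkedHashMap<>();

    /** Adds the given milliseconds to the running total of a stage. */
    public void record(String stage, long millis) {
        elapsedMs.merge(stage, millis, Long::sum);
    }

    /** Times a task with System.nanoTime() and records it under the stage name. */
    public void time(String stage, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            record(stage, (System.nanoTime() - start) / 1_000_000L);
        }
    }

    /** Returns the accumulated time of a stage, or 0 if it was never recorded. */
    public long total(String stage) {
        return elapsedMs.getOrDefault(stage, 0L);
    }

    /** Renders one line per stage, in recording order. */
    public String format() {
        StringBuilder sb = new StringBuilder("Query Compilation Summary\n");
        for (Map.Entry<String, Long> e : elapsedMs.entrySet()) {
            sb.append(String.format("%-40s %6d ms\n", e.getKey(), e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CompileSummarySketch summary = new CompileSummarySketch();
        summary.record("parse", 4);
        summary.record("optimizer", 700);
        summary.record("optimizer", 5);       // added to the existing "optimizer" total
        summary.time("compile", () -> { });   // measured with the wall clock
        System.out.print(summary.format());
    }
}
```

In this sketch a stage that is recorded more than once keeps its original position and only its total grows, which is one way to read "accumulated" in the PR title.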
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
13 Code Smells
No Coverage information
No Duplication information
Hi @zabetak, the PR has been modified to produce a summary that illustrates the sub-steps of the compilation phase, without trying to report individual method calls, since the current design of PerfLogger is not intended to measure functions that are called multiple times or recursively. Can you please review this new version?
@zabetak Can you look at the latest patch?
Thanks @zabetak for the review