HIVE-27751 Log Query Compilation summary in an accumulated way

Open ramesh0201 opened this issue 2 years ago • 5 comments

What changes were proposed in this pull request?

This PR measures and accumulates the time spent in the various stages of query compilation.

Why are the changes needed?

These changes collect all compile-time measurements in a single place, which makes them easier to read. The summary also helps when debugging a performance issue in the query compilation phase, and when reporting and comparing timings across runs.

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

Tested locally by running q files with the config enabled.

ramesh0201 avatar Sep 28 '23 09:09 ramesh0201

It could be a good idea to merge this and https://github.com/apache/hive/pull/4749 .

henrib avatar Sep 28 '23 16:09 henrib

To test this, set the config hive.compile.print.summary to true in any q file and run the test; the Query Compilation Summary then appears in the logs. One example of the output is below. The order of operations is preserved when the summary is printed:

Query Compilation Summary
----------------------------------------------------------------------------------------------
waitCompile                                                                                              0 ms
parse                                                                                                    4 ms
getTableConstraints - HS2-cache                                                                         69 ms
optimizer - Calcite: Plan generation                                                                   257 ms
optimizer - Calcite: Prejoin ordering transformation                                                    20 ms
optimizer - Calcite: Postjoin ordering transformation                                                   24 ms
optimizer                                                                                              705 ms
optimizer - HiveOpConverterPostProc                                                                      0 ms
optimizer - Generator                                                                                   24 ms
optimizer - PartitionColumnsSeparator                                                                    1 ms
optimizer - SyntheticJoinPredicate                                                                       2 ms
optimizer - SimplePredicatePushDown                                                                      8 ms
optimizer - RedundantDynamicPruningConditionsRemoval                                                     0 ms
optimizer - SortedDynPartitionTimeGranularityOptimizer                                                   2 ms
optimizer - PartitionPruner                                                                              3 ms
optimizer - PartitionConditionRemover                                                                    2 ms
optimizer - GroupByOptimizer                                                                             2 ms
optimizer - ColumnPruner                                                                                10 ms
optimizer - CountDistinctRewriteProc                                                                     1 ms
optimizer - SamplePruner                                                                                 1 ms
optimizer - MapJoinProcessor                                                                             2 ms
optimizer - BucketingSortingReduceSinkOptimizer                                                          2 ms
optimizer - UnionProcessor                                                                               2 ms
optimizer - JoinReorder                                                                                  0 ms
optimizer - FixedBucketPruningOptimizer                                                                  2 ms
optimizer - BucketVersionPopulator                                                                       2 ms
optimizer - NonBlockingOpDeDupProc                                                                       1 ms
optimizer - IdentityProjectRemover                                                                       0 ms
optimizer - LimitPushdownOptimizer                                                                       2 ms
optimizer - OrderlessLimitPushDownOptimizer                                                              1 ms
optimizer - StatsOptimizer                                                                               0 ms
optimizer - SimpleFetchOptimizer                                                                         0 ms
TezCompiler - Run top n key optimization                                                                 2 ms
TezCompiler - Setup dynamic partition pruning                                                            3 ms
optimizer - Merge single column semi-join reducers to composite                                          0 ms
partition-retrieving                                                                                     1 ms
TezCompiler - Setup stats in the operator plan                                                          78 ms
TezCompiler - Sorted dynamic partition optimization                                                      3 ms
TezCompiler - Reduce Sink de-duplication                                                                 4 ms
TezCompiler - Run the optimizations that use stats for optimization                                      5 ms
TezCompiler - Run reduce sink after join algorithm selection                                             2 ms
TezCompiler - Run remove dynamic pruning by size                                                         1 ms
TezCompiler - Run cycle analysis for partition pruning                                                   0 ms
TezCompiler - Remove redundant semijoin reduction                                                        1 ms
TezCompiler - Shared scans optimization                                                                 13 ms
TezCompiler - markOperatorsWithUnstableRuntimeStats                                                      1 ms
TezCompiler - generateTaskTree                                                                          31 ms
TezCompiler - optimizeTaskPlan                                                                         156 ms
TezCompiler                                                                                            323 ms
semanticAnalyze                                                                                       2628 ms
compile                                                                                               2633 ms
----------------------------------------------------------------------------------------------
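The accumulate-then-print idea behind a summary like the one above can be sketched as follows. This is only an illustration under assumptions, not the code in this PR and not Hive's PerfLogger: the class `CompileSummary` and its methods `record`, `total`, and `format` are hypothetical names. It relies on `LinkedHashMap` keeping first-insertion order, so stages print in the order they were first recorded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical accumulator for stage timings; NOT Hive's PerfLogger. */
class CompileSummary {
    // LinkedHashMap iterates in first-insertion order, so the summary
    // lists stages in the order they were first recorded.
    private final Map<String, Long> totalsMs = new LinkedHashMap<>();

    /** Adds elapsedMs to the running total of the given stage. */
    void record(String stage, long elapsedMs) {
        totalsMs.merge(stage, elapsedMs, Long::sum);
    }

    /** Returns the accumulated time of a stage, or 0 if it was never recorded. */
    long total(String stage) {
        return totalsMs.getOrDefault(stage, 0L);
    }

    /** Renders all accumulated stages as a fixed-width text table. */
    String format() {
        String rule = "-".repeat(60);
        StringBuilder sb = new StringBuilder("Query Compilation Summary\n");
        sb.append(rule).append('\n');
        for (Map.Entry<String, Long> e : totalsMs.entrySet()) {
            // Left-align the stage name, right-align the duration.
            sb.append(String.format("%-50s %6d ms\n", e.getKey(), e.getValue()));
        }
        return sb.append(rule).append('\n').toString();
    }
}
```

In this sketch, recording the same stage twice (for example `record("parse", 4)` followed later by `record("parse", 1)`) adds to the existing entry instead of creating a second row, which is what "accumulated" means here.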

ramesh0201 avatar Oct 27 '23 23:10 ramesh0201

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
13 Code Smells (rating A)

No Coverage information
No Duplication information

Warning: The version of Java (11.0.8) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.

sonarqubecloud[bot] avatar Oct 27 '23 23:10 sonarqubecloud[bot]

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

16 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Jan 10 '24 06:01 sonarqubecloud[bot]

Hi @zabetak, the PR has been modified to support a summary that illustrates the sub-steps of the compilation phase, without trying to report individual method calls in the summary, since the current design of PerfLogger is not intended to measure functions that are called multiple times or recursively. Can you please review this new version?

ramesh0201 avatar Feb 27 '24 16:02 ramesh0201

@zabetak Can you look at the latest patch?

ramesh0201 avatar Mar 22 '24 04:03 ramesh0201

Quality Gate passed

Issues
21 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Mar 23 '24 01:03 sonarqubecloud[bot]

Thanks @zabetak for the review

ramesh0201 avatar Mar 23 '24 05:03 ramesh0201