hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-28062: Optimize get_partitions_by_names in direct sql

Open wecharyu opened this issue 1 year ago • 3 comments

What changes were proposed in this pull request?

  1. Add partition number check in get_partitions_by_names
  2. Skipping get partition ids in MetaStoreDirectSql#getPartitionsViaPartNames to reduce the Database calls.

Why are the changes needed?

  1. get_partitions_by_names also need check partition number in case the request number is too large
  2. remove unnecessary db calls can improve the performance.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

  1. Passing all existing tests.
  2. Run the getPartitionsByNames benchmark tests:
java -jar ./hmsbench-jar-with-dependencies.jar -H localhost --savedata /tmp/benchdata --sanitize -N 100 -N 10000 -o bench_results_direct.csv -C -d testbench_http --params=100  -E 'drop.*' -E 'renameTable.*' -E 'getTableObjectsByName.*' -E 'listTables.*' -E 'listPartitions.*' -E 'getPartitionNames.*' -E 'listPartition' -E 'getPartition' -E 'getPartitions' -E 'getPartitions.100' -E 'getPartitions.10000' -E 'getPartitionsByFilter.*' -E 'addPartition.*' -E 'addPartitions.*' -E 'alterPartitions.*' -E 'getNid' -E 'listDatabases' -E 'getTable' -E 'createTable' -E 'openTxn.*'
  • Before this patch:
Operation       Mean    Med     Min     Max     Err%
getPartitionsByNames    7.47374 7.31857 6.96766 10.2396 7.39022
getPartitionsByNames.100        14.4165 14.2468 13.6072 17.9526 4.67421
getPartitionsByNames.10000      524.812 524.127 504.923 555.049 2.02143
  • After this patch:
Operation       Mean    Med     Min     Max     Err%
getPartitionsByNames    7.11728 7.02464 6.78734 8.59082 4.32347
getPartitionsByNames.100        13.4596 13.3253 12.8197 16.3828 3.77006
getPartitionsByNames.10000      460.498 459.805 438.384 500.921 2.57908

wecharyu avatar Feb 06 '24 07:02 wecharyu

The failed tests seem to be related. http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/1/tests

zhangbutao avatar Feb 19 '24 15:02 zhangbutao

The failed tests seem to be related. http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/1/tests

@zhangbutao The issue arises due to the introduction of a partition number check in the getPartitionsByNames() API, fixed it in the client side.

wecharyu avatar Feb 21 '24 02:02 wecharyu

@zhangbutao could you take a look again?

wecharyu avatar Mar 11 '24 10:03 wecharyu

  • @dengzhhu653 @deniskuzZ @ayushtkn @saihemanth-cloudera may be interested in this improvement. :)

zhangbutao avatar Mar 12 '24 04:03 zhangbutao

Does this PR introduce any user-facing change? No

We didn't have a limit before, didn't we? Right now user might get an exception when the limit is exceeded

deniskuzZ avatar Mar 14 '24 10:03 deniskuzZ

http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/13/tests/ org.apache.iceberg.mr.hive.TestConflictingDataFiles::testMultiFiltersUpdate I think this is a flaky test. I have seen this faliure many times. Run a flaky check to see results. http://ci.hive.apache.org/job/hive-flaky-check/831/

zhangbutao avatar Mar 27 '24 02:03 zhangbutao

http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/13/tests/ org.apache.iceberg.mr.hive.TestConflictingDataFiles::testMultiFiltersUpdate I think this is a flaky test. I have seen this faliure many times. Run a flaky check to see results. http://ci.hive.apache.org/job/hive-flaky-check/831/

http://ci.hive.apache.org/job/hive-flaky-check/831/testReport/ test report shows that this is a flay test. Hi @simhadri-g , do you have any suggestion about this flaky test, as i see you wrote this code not long ago. Should we disable it now? I filed a ticket about the flay test: https://issues.apache.org/jira/browse/HIVE-28153

Thanks.

zhangbutao avatar Mar 27 '24 02:03 zhangbutao

Quality Gate Passed Quality Gate passed

Issues
14 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Mar 27 '24 03:03 sonarqubecloud[bot]

Hi @zhangbutao , TestConflictingDataFiles is a concurrency test. It seems to be failing only with format version 1. I am working on checking why its flaky. Thanks!

simhadri-g avatar Mar 27 '24 09:03 simhadri-g