hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-24167: TPC-DS query 14 fails while generating plan for the filter

Open okumin opened this issue 2 years ago • 10 comments

What changes were proposed in this pull request?

Link Calcite's RelNodes with the same RelTreeSignature. https://issues.apache.org/jira/browse/HIVE-24167

Why are the changes needed?

It is known that the current unification flow could fail when CTEs are materialized. I summarized the problem in this gist.

This PR implements the option "Combine RelNodes with RelTreeSignature" in the gist.

See also the PR where I implemented another option, "Don't pull groups of Operators by signatures".

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

I added a new test case to reproduce the issue and restore query14 of TPC-DS which fails on the current master.

okumin avatar Feb 09 '24 10:02 okumin

Quality Gate Passed Quality Gate passed

Issues
5 New issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Feb 11 '24 08:02 sonarqubecloud[bot]

Sorry, I think I canceled the Draft label by mistake when I was browsing the web...

zhangbutao avatar Feb 15 '24 00:02 zhangbutao

I applied two changes. I'm waiting for CI to finish.

  1. Rebased this branch based on the latest master
  2. Added https://github.com/apache/hive/pull/5077/commits/b11e3f88fef2e27934dc8389b8d1e80b6da5fc98
    • Now, we can populate correct stats of materialized tables thanks to #5089

okumin avatar Apr 08 '24 12:04 okumin

Quality Gate Passed Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Apr 19 '24 16:04 sonarqubecloud[bot]

@zabetak

For instance running cbo_query23.q with the above properties set raises an exception.

Thanks. Predicate Push-Down and Partition Condition Remover cause the problem. I modified it on my local machine, and I will explain the details later.

Also it would be nice to do a sanity run by setting the above properties temporarily for all qtests to see if the problem disappears completely or if there are still cases that need to be handled.

Thanks for the great suggestion! I definitely will do it. Now, I have run positive LLAP tests and TPC-DS, and I found that No. 64 of TPD-DS fails. I created another ticket because I found the problem can happen regardless of materialization or PlamMapper while I was minimizing the test query. I would be glad if the community could take a look. https://github.com/apache/hive/pull/5207

I'm still testing the remaining qtests where CTEs are involved.

okumin avatar Apr 22 '24 00:04 okumin

I also think the current one could fail somewhere especially if we follow minor options. Now, I have achieved some additional experience and knowledge, and may come up with the to-be approach. I will work on it tomorrow(I am unavailable today because we host a tech event including a Hive 4 session in my city!) if I don't see very viable options here.

P.S. checking the intention and purposes of hive.query.planmapper.link.relnodes

okumin avatar Apr 24 '24 01:04 okumin

While running negative tests with hive.optimize.cte.materialize.threshold=1 and hive.optimize.cte.materialize.full.aggregate.only=false, I found another error. I created another ticket because it is also unrelated to the PlanMapper. https://github.com/apache/hive/pull/5232

okumin avatar May 03 '24 05:05 okumin

What's the current status of this PR? it looks like we have some problems with the testing.

dengzhhu653 avatar Jun 22 '24 07:06 dengzhhu653

@dengzhhu653 @zabetak Thanks. I understand we need some more consensus on the expectations of HIVE-24167.

We know the current PR doesn't work with some sets of Hive parameters. I have a question here, "Should HIVE-24167 let any SQLs with any set of parameters successfully run?". If the answer is yes, I recommend avoiding the current approach "let the entire query crush" and fixing known problems incrementally in the following tickets because it sounds hard to immediately find the perfect solution. If the answer is no, we may need a consensus on what queries/parameters we should support in HIVE-24167, e.g. we prioritize a set of default parameters.

okumin avatar Jun 24 '24 02:06 okumin