
HIVE-28548: Subdivide memory size allocated to parallel operators

Open · okumin opened this pull request 1 year ago • 5 comments

What changes were proposed in this pull request?

Let each operator know the maximum amount of memory it is allowed to use.

Why are the changes needed?

We observed OOMs when SharedWorkOptimizer merges heavy operators, such as GroupByOperators with map-side hash aggregation. I guess it is hard for a GroupByOperator to control its memory usage correctly in that case.

I confirmed that Tez has a similar feature that reallocates memory when a task is connected to multiple edges: https://github.com/apache/tez/blob/rel/release-0.10.4/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/resources/WeightedScalingMemoryDistributor.java
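As a rough illustration of that Tez mechanism, a weighted distributor hands each requester a share of the task budget proportional to its weight. The class and method names below are made up for this sketch and are not the actual Tez API:

```java
import java.util.Arrays;

// Toy version of weighted memory distribution: each requester receives a
// share of the total task memory proportional to its weight, so heavier
// inputs/outputs get a correspondingly larger budget.
public class WeightedDistributorSketch {
    static long[] distribute(long totalBytes, int[] weights) {
        long weightSum = Arrays.stream(weights).asLongStream().sum();
        long[] shares = new long[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // Integer division; Tez applies further scaling on top of this.
            shares[i] = totalBytes * weights[i] / weightSum;
        }
        return shares;
    }

    public static void main(String[] args) {
        // e.g. one requester weighted 3x against another weighted 1x
        long[] shares = distribute(1_000_000L, new int[]{3, 1});
        System.out.println(Arrays.toString(shares)); // [750000, 250000]
    }
}
```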

https://issues.apache.org/jira/browse/HIVE-28548

I also found it is still problematic when the number of merged operators is huge, e.g. 100, because the assignment per operator gets tiny. I will handle such cases in HIVE-28549.
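The core idea of this patch can be sketched roughly as follows. The class and method names are illustrative only, not the actual Hive code:

```java
// Sketch: when one task hosts N memory-consuming operators after
// SharedWorkOptimizer merges them, budget each operator total/N instead of
// letting every operator assume it owns the whole task's memory.
public class OperatorBudgetSketch {
    static long perOperatorBudget(long taskMemoryBytes, int operatorCount) {
        if (operatorCount <= 0) {
            throw new IllegalArgumentException("operatorCount must be positive");
        }
        return taskMemoryBytes / operatorCount;
    }

    public static void main(String[] args) {
        // With ~358.4 MB of task memory shared by 5 merged GroupByOperators,
        // each operator is budgeted roughly 71.68 MB.
        long total = (long) (358.40 * 1024 * 1024);
        System.out.printf("%.2f MB%n",
            perOperatorBudget(total, 5) / 1024.0 / 1024.0); // 71.68 MB
    }
}
```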

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

Checked local logs.

--! qt:dataset:src

set hive.auto.convert.join=true;

SELECT *
FROM (SELECT key, count(*) AS num FROM src WHERE key LIKE '%1%' GROUP BY key) t1
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%2%' GROUP BY key) t2 ON t1.key = t2.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%3%' GROUP BY key) t3 ON t1.key = t3.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%4%' GROUP BY key) t4 ON t1.key = t4.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%5%' GROUP BY key) t5 ON t1.key = t5.key;

set hive.vectorized.execution.enabled=false;

SELECT *
FROM (SELECT key, count(*) AS num FROM src WHERE key LIKE '%1%' GROUP BY key) t1
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%2%' GROUP BY key) t2 ON t1.key = t2.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%3%' GROUP BY key) t3 ON t1.key = t3.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%4%' GROUP BY key) t4 ON t1.key = t4.key
LEFT OUTER JOIN (SELECT key, count(*) AS num FROM src WHERE key LIKE '%5%' GROUP BY key) t5 ON t1.key = t5.key;

The memory assigned to each operator was reduced from 358MB to 71MB.

% cat org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver-output.txt | grep OperatorUtils | grep Assigning
2024-09-30T23:54:38,486  INFO [TezTR-672468_1_3_0_0_0] exec.OperatorUtils: Assigning 75161926 bytes to 5 operators
2024-09-30T23:54:38,831  INFO [TezTR-672468_1_3_0_0_1] exec.OperatorUtils: Assigning 75161926 bytes to 5 operators
2024-09-30T23:54:42,015  INFO [TezTR-672468_1_4_0_0_0] exec.OperatorUtils: Assigning 75161926 bytes to 5 operators
2024-09-30T23:54:42,310  INFO [TezTR-672468_1_4_0_0_1] exec.OperatorUtils: Assigning 75161926 bytes to 5 operators
% cat org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver-output.txt | grep GroupByOperator | grep 'Max hash table'
2024-09-30T23:54:36,280  INFO [TezTR-672468_1_2_0_0_0] exec.GroupByOperator: Max hash table memory: 179.20MB (358.40MB * 0.5)
2024-09-30T23:54:42,015  INFO [TezTR-672468_1_4_0_0_0] exec.GroupByOperator: Max hash table memory: 35.84MB (71.68MB * 0.5)
2024-09-30T23:54:42,017  INFO [TezTR-672468_1_4_0_0_0] exec.GroupByOperator: Max hash table memory: 35.84MB (71.68MB * 0.5)
2024-09-30T23:54:42,018  INFO [TezTR-672468_1_4_0_0_0] exec.GroupByOperator: Max hash table memory: 35.84MB (71.68MB * 0.5)
2024-09-30T23:54:42,018  INFO [TezTR-672468_1_4_0_0_0] exec.GroupByOperator: Max hash table memory: 35.84MB (71.68MB * 0.5)
...
% cat org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver-output.txt | grep VectorGroupByOperator | grep 'GBY memory limits'
2024-09-30T23:54:38,491  INFO [TezTR-672468_1_3_0_0_0] vector.VectorGroupByOperator: GBY memory limits - isTez: true isLlap: true maxHashTblMemory: 64.51MB (71.68MB * 0.9) fixSize:660 (key:472 agg:144)
2024-09-30T23:54:38,499  INFO [TezTR-672468_1_3_0_0_0] vector.VectorGroupByOperator: GBY memory limits - isTez: true isLlap: true maxHashTblMemory: 64.51MB (71.68MB * 0.9) fixSize:660 (key:472 agg:144)
2024-09-30T23:54:38,500  INFO [TezTR-672468_1_3_0_0_0] vector.VectorGroupByOperator: GBY memory limits - isTez: true isLlap: true maxHashTblMemory: 64.51MB (71.68MB * 0.9) fixSize:660 (key:472 agg:144)
2024-09-30T23:54:38,500  INFO [TezTR-672468_1_3_0_0_0] vector.VectorGroupByOperator: GBY memory limits - isTez: true isLlap: true maxHashTblMemory: 64.51MB (71.68MB * 0.9) fixSize:660 (key:472 agg:144)
...
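The figures in these logs are internally consistent; a quick arithmetic check (the 0.5 and 0.9 fractions are read off the log lines above, not from configuration):

```java
// With the patch, each of the 5 operators receives 358.40 / 5 = 71.68 MB.
// The row-mode GroupByOperator then caps its hash table at 50% of that,
// and the vectorized one at 90%, matching the logged values.
public class LogArithmetic {
    public static void main(String[] args) {
        double perOperatorMb = 358.40 / 5;               // 71.68
        double rowModeHashTableMb = perOperatorMb * 0.5; // 35.84
        double vectorHashTableMb = perOperatorMb * 0.9;  // 64.512 ~ 64.51
        System.out.printf("%.2f %.2f %.2f%n",
            perOperatorMb, rowModeHashTableMb, vectorHashTableMb);
    }
}
```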

okumin avatar Oct 01 '24 07:10 okumin

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.

github-actions[bot] avatar Jan 05 '25 00:01 github-actions[bot]

Just for your information, we tested this patch with TPC-DS queries 1-25 on a 10TB dataset. The total execution times were similar, though query 17 became slower (54s → 94s) while query 5 improved (142s → 101s). I haven't investigated the differences in detail, so they could be due to factors like stragglers or other unrelated issues. Overall, I think the total execution times show that the patch doesn't introduce any significant performance problems in typical cases, and it seems like a reasonable change to me.

ngsg avatar Mar 28 '25 12:03 ngsg

@ngsg @deniskuzZ @abstractdog Do you think this approach is reasonable? If the answer is yes, I will rebase this branch.

okumin avatar Aug 11 '25 07:08 okumin

I think this is a reasonable approach.

Regarding the isTez/isLlap issue in VectorGroupByOperator, I'm inclined to use isTez and drop isLlap, as VectorGroupByOperator checks overall heap usage via gcCanary, not MemoryMXBean#getHeapMemoryUsage and memoryThreshold. However, I'm not certain whether Tez without LLAP also respects getConf().getMaxMemoryAvailable(), so it would be nice if others could confirm this as well.

ngsg avatar Aug 12 '25 03:08 ngsg

I rebased the branch and CI is green. I'm open to any suggestions.

okumin avatar Aug 15 '25 05:08 okumin