gazelle_plugin
Physical memory limits appear when the grouping sets operator is processing
Describe the bug
While the grouping sets operator is processing, the data is multiplied, which causes memory overflow. However, we observed that Spark does not have this problem with the same resources. We also found that NSE does not spill to disk when processing the columnarExpand operator, whereas Spark does spill to disk when processing the Expand operator. The following figure is an error report:
The following figure shows our verification results:
To Reproduce
Our SQL is as follows:
select sum(id),name,age from test_test_col3 group by GROUPING SETS( (name), (name,age), (age), () );
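For context, here is a minimal sketch in plain Python (not Spark or Gazelle code) of why this query multiplies the data. It assumes the usual Expand semantics: every input row is emitted once per grouping set, with the columns outside that set nulled out. The `expand` function, the sample rows, and the `grouping_id` tag are hypothetical and only for illustration; Spark itself tags rows with a grouping-id bitmask rather than a list index.

```python
# Grouping sets taken from the query above: (name), (name, age), (age), ()
GROUPING_SETS = [("name",), ("name", "age"), ("age",), ()]
GROUP_COLS = ("name", "age")


def expand(rows, grouping_sets=GROUPING_SETS, group_cols=GROUP_COLS):
    """Yield one copy of each row per grouping set (simplified Expand)."""
    for row in rows:
        for gid, gset in enumerate(grouping_sets):
            out = dict(row)
            # Null out grouping columns that are not part of this set.
            for col in group_cols:
                if col not in gset:
                    out[col] = None
            # Hypothetical tag so copies from different sets stay distinct.
            out["grouping_id"] = gid
            yield out


# Made-up sample rows standing in for test_test_col3.
rows = [
    {"id": 1, "name": "a", "age": 10},
    {"id": 2, "name": "b", "age": 20},
]
expanded = list(expand(rows))
print(len(rows), "->", len(expanded))  # prints "2 -> 8"
```

With four grouping sets, the aggregation downstream of the expand step sees four times as many rows as the scan produced, which is where the lack of spill support hurts.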
Hi @ziyangRen
Indeed, the HashAgg implementation in Gazelle does not support spilling yet.
-yuan