gazelle_plugin

Physical memory limit exceeded while processing the grouping sets operator

Open: ziyangRen opened this issue 2 years ago • 1 comment

Describe the bug
When the grouping sets operator is processed, the input rows are multiplied (the Expand step emits one copy of each row per grouping set), which causes a memory overflow. Spark does not have this problem with the same resources. We also found that NSE does not spill to disk when processing the columnarExpand operator, whereas Spark does spill to disk when processing the Expand operator.

The following figure is the error report: [image]

The following figure shows our verification results: [image]

To Reproduce
Our SQL is as follows:

select sum(id), name, age from test_test_col3 group by GROUPING SETS ((name), (name, age), (age), ());
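To illustrate why the row count grows, here is a minimal Python sketch of what an Expand-style step does for the four grouping sets in the query above. It is not Gazelle's or Spark's implementation: the names `GROUPING_SETS`, `expand`, and `aggregate` are hypothetical, and the `gid` column is a plain set index used only to keep the groups apart (Spark's real grouping id is encoded differently).

```python
from collections import defaultdict

# Grouping sets from the query: (name), (name, age), (age), ()
GROUPING_SETS = [("name",), ("name", "age"), ("age",), ()]


def expand(rows):
    """Emit one copy of each input row per grouping set.

    Columns that are not part of the grouping set are replaced by None.
    `gid` is a hypothetical set index, not Spark's actual grouping id.
    """
    for id_, name, age in rows:
        for gid, keep in enumerate(GROUPING_SETS):
            yield (
                id_,
                name if "name" in keep else None,
                age if "age" in keep else None,
                gid,
            )


def aggregate(expanded_rows):
    """Hash aggregation: sum(id) grouped by (name, age, gid), all in memory."""
    sums = defaultdict(int)
    for id_, name, age, gid in expanded_rows:
        sums[(name, age, gid)] += id_
    return dict(sums)


rows = [(1, "a", 10), (2, "a", 20), (3, "b", 10)]
expanded = list(expand(rows))
result = aggregate(expanded)
```

With four grouping sets, every input row becomes four rows before aggregation, so the data fed to the hash aggregate is four times the size of the input. If the aggregate keeps all of its state in memory, as in this sketch, that growth translates directly into memory pressure.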

ziyangRen, Aug 26 '22 02:08

Hi @ziyangRen

Indeed, the HashAgg implementation in Gazelle does not support spilling yet.

-yuan

zhouyuan, Sep 07 '22 23:09