Wenchen Fan
Can we write down the SQL spec for this syntax in the PR description? That would make it easier for people to review the syntax and understand the semantics.
Can you briefly explain your idea? Do you keep a range for each column and update the range when seeing a comparison? Then use the range to update the column...
cc @gengliangwang
It seems the Python linter is broken in GA, cc @HyukjinKwon
```
ImportError: cannot import name '_unicodefun' from 'click' (/usr/local/lib/python3.9/dist-packages/click/__init__.py)
Please run 'dev/reformat-python' script.
```
thanks, merging to master!
I'm a bit confused. After this PR, what's the difference between `SizeInBytesOnlyStatsPlanVisitor` and `BasicStatsPlanVisitor`?
Maybe we should name them `BasicStatsPlanVisitor` and `AdvancedStatsPlanVisitor`. We also need to make sure the updated `SizeInBytesOnlyStatsPlanVisitor` can propagate row count properly in all cases. BTW, with CBO off, where...
OK, I think the idea makes sense. With CBO off, the optimizer/planner only needs size in bytes, but row count is also an important statistic for estimating size in bytes,...
cc @wzhfy @c21 can you take a look first?
In general, this feature looks reasonable, but it's interesting to discuss the behavior of "v2 write required distribution" with this new feature. Let's assume the required distribution is `ClusteredDistribution`, its...