hive HIVE-28363: Improve heuristics of FilterStatsRule without column stats

What changes were proposed in this pull request?

Make FilterStatsRule reduce the number of filtered rows by half when # of distinct values is empty. https://issues.apache.org/jira/browse/HIVE-28363

Why are the changes needed?

The current algorithm easily estimates the selectivity to be 100%. I believe it is not the best in most cases. FilterStatsRule roughly estimates the number of rows filtered by IN to be {Original # of rows} * {1 / cardinality} * {# of values in IN}. The second term is estimated as 0.5 when column stats are unavailable. So, it always returns the original number when IN retains two or more constant values like col IN (1, 3).

Maybe, FilterStatsRule had behaved in the same way as this PR before, but this change slightly changed the formula to cover a special case. We will likely prefer the original formula when columns stats are not available.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

CREATE TABLE users (id INT);
INSERT INTO users VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
set hive.fetch.task.conversion=none;
set hive.stats.fetch.column.stats=false;
EXPLAIN SELECT * FROM users WHERE id IN (1);
EXPLAIN SELECT * FROM users WHERE id IN (1, 2);