dolt icon indicating copy to clipboard operation
dolt copied to clipboard

indexed table access with IN predicate on primary key

Open max-hoffman opened this issue 3 years ago • 6 comments

re zach: "When we push down an IN clause to the table as an index, we should in principle remove that predicate from the filter. Not having the filter at all is even better than having it fast, and should work for IN expressions."

Filter(mytable.i IN (1, 2, 3, 4))
  └─ Projected table access on [i s]
         └─ IndexedTableAccess(mytable on [mytable.i])

We do not need to traverse every row in mytable, just point lookup the values in the filter.

max-hoffman avatar Nov 11 '21 15:11 max-hoffman

Max, please close this out when your work with removing filters on index lookups is done

zachmu avatar Feb 24 '22 17:02 zachmu

This is still broken as of 0.50.8, pending a pushdown rewrite.

max-hoffman avatar Dec 01 '22 17:12 max-hoffman

repro:

> create table xy (x int primary key, y int);
> explain select * from xy where x in (1,2,3);
+-------------------------------------------------+
| plan                                            |
+-------------------------------------------------+
| Filter(xy.x HASH IN (1, 2, 3))                  |
|  └─ IndexedTableAccess(xy)                      |
|      ├─ index: [xy.x]                           |
|      ├─ filters: [{[2, 2]}, {[3, 3]}, {[1, 1]}] |
|      └─ columns: [x y]                          |
+-------------------------------------------------+

max-hoffman avatar Dec 01 '22 17:12 max-hoffman

Regarding point lookups... there's an interesting edge case with MySQL's automatic coercion that's worth being aware of...

mysql> select * from test01;
+--------+
| pk     |
+--------+
| 11d    |
| 11wha? |
| 11     |
| 12     |
+--------+
4 rows in set (0.01 sec)

mysql> select * from test01 where pk in (11);
+--------+
| pk     |
+--------+
| 11d    |
| 11wha? |
| 11     |
+--------+
3 rows in set, 1 warning (0.00 sec)

Note that a point lookup wouldn't return the same results as MySQL, because of MySQL's automatic coercion for strings and floats/doubles. If we can tell that the type has to be coerced to be compared with the column, then to match MySQL's behavior, we'd need to scan.

fulghum avatar Dec 01 '22 18:12 fulghum

This problem still exists, also the new issue jason pointed out. Mysql still leaves filters in a lot of plans after pushdown, at least for string types. More investigation needed to determine when we can be more aggressive eliminating filters during pushdown.

max-hoffman avatar Feb 10 '23 00:02 max-hoffman

This still exists in 1.35.9, related to all of the typing work @jycor has been doing.

max-hoffman avatar Apr 15 '24 23:04 max-hoffman