Translate dynamic filter to compiled filter

Open yikf opened this issue 3 years ago • 3 comments

Currently, Trino supports dynamic filtering. After the dynamic filter operator collects a domain, the probe-side scan pushes that domain down to the data source, which uses it to skip data through indexes and statistics (e.g. Parquet/ORC footers). If the domain does not rule out any data at that granularity, the dynamic filter can end up having no effect.

For example, take a Parquet file with a single row group whose column min/max is 0/100. If the domain is the value range 10-20, the dynamic filter has no effect, because no row group can be skipped, so the data participating in the join is not reduced.
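To make the example concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical and are not Trino APIs; it only assumes that statistics-based pruning skips a row group when its min/max range does not overlap the filter range.

```java
import java.util.List;

// Hypothetical illustration only; none of these names are Trino classes.
class RowGroupPruningDemo {
    // A row group can be skipped only when its [min, max] statistics
    // do not overlap the dynamic filter range [low, high].
    static boolean canSkipRowGroup(long min, long max, long low, long high) {
        return max < low || min > high;
    }

    // Number of rows that would survive a row-level check against [low, high].
    static long rowsInRange(List<Long> values, long low, long high) {
        return values.stream().filter(v -> v >= low && v <= high).count();
    }

    public static void main(String[] args) {
        // Made-up column values for one row group with min = 0 and max = 100.
        List<Long> rowGroup = List.of(0L, 5L, 15L, 42L, 100L);
        // The range 10-20 overlaps 0-100, so the row group cannot be skipped.
        System.out.println(canSkipRowGroup(0, 100, 10, 20)); // false
        // A row-level check would keep only the value 15.
        System.out.println(rowsInRange(rowGroup, 10, 20)); // 1
    }
}
```

With these made-up values the whole row group is still read, even though only one of its five rows can match the join.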

Apart from broadcast joins, dynamic filters come from the coordinator and are sent to the worker along with the task create-or-update request. The idea is this: after the worker receives the task, if the dynamic filter is not empty, we immediately register it and collect it into the localCollector. Then, in visitScanAndFilter, we obtain the domain and convert it to a static filter, so that it filters the rows participating in the join.
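A rough sketch of the proposed conversion, again in plain Java with hypothetical names. It assumes the collected domain can be simplified to a discrete set of allowed join-key values; Trino's actual domain representation and operator code are not shown here.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names are Trino classes.
class DomainToRowFilterDemo {
    // Turn a collected domain (simplified here to a set of allowed values)
    // into a row-level predicate that the scan can evaluate per row.
    static Predicate<Long> toRowFilter(Set<Long> allowedValues) {
        return allowedValues::contains;
    }

    // Apply the row-level filter while scanning probe-side join keys.
    static List<Long> scanAndFilter(List<Long> probeKeys, Predicate<Long> rowFilter) {
        return probeKeys.stream().filter(rowFilter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Predicate<Long> rowFilter = toRowFilter(Set.of(10L, 15L, 20L));
        // Only keys present in the collected domain reach the join.
        System.out.println(scanAndFilter(List.of(1L, 10L, 15L, 99L), rowFilter)); // [10, 15]
    }
}
```

Unlike statistics-based pruning, this check runs per row, so it reduces the join input even when no row group can be skipped.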

yikf avatar Jul 22 '22 02:07 yikf

@findepi what about this idea?

yikf avatar Jul 22 '22 02:07 yikf

cc: @raunaqmorarka

hashhar avatar Jul 22 '22 04:07 hashhar

I have a WIP PR for this, https://github.com/trinodb/trino/pull/5204, which I'm planning to revive and land in the near future.

raunaqmorarka avatar Jul 22 '22 05:07 raunaqmorarka