
IN optimization and controlling task size during multipartition scan


Currently there is a limit on how many partitions are supported during a multipartition scan, and raising that limit degrades performance. Can we start thinking about how far we can push the limit without degrading performance or causing issues? Also, can we have a plan to create more tasks so a multipartition scan uses more cores? For example:

- 0-200 keys (or a new limit) --> default plan
- up to a new limit of 400 keys --> same plan, but somehow create more tasks (still far fewer than 5000)
- above 400 keys --> full table scan, the default behavior
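The batching idea above could be sketched roughly as follows. This is only an illustration, not FiloDB's actual API: `MaxKeysPerTask`, `planTasks`, and the thresholds are hypothetical names and values.

```scala
// Hypothetical sketch: instead of scanning all requested partition keys
// in one Spark task, split them into batches so each batch can become
// its own task. None of these names are from FiloDB itself.
object MultiPartitionPlanner {
  // Assumed tunable: max partition keys one task should scan.
  val MaxKeysPerTask = 200

  // Group the requested partition keys into task-sized batches;
  // each inner Seq would back one Spark task.
  def planTasks[K](keys: Seq[K]): Seq[Seq[K]] =
    keys.grouped(MaxKeysPerTask).map(_.toVector).toVector
}
```

With 500 keys and a 200-key batch size this yields three tasks (200, 200, and 100 keys), so the scan can use up to three cores instead of one.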

parekuti avatar Feb 22 '17 14:02 parekuti

Basically, multi-partition queries always run on one Spark partition. We want to enable bigger multi-partition queries that can spread across multiple Spark partitions without invoking filtered full table scans. This will require some intelligent logic.

velvia avatar Feb 24 '17 00:02 velvia