FiloDB
Ability to merge ranges and create a larger token range to reduce number of tasks
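FiloDB already delivers good parallelism and performance, but users who want to increase concurrency should have an option to control the number of tasks, for example by merging adjacent token ranges into larger ones so that a scan creates fewer tasks.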
For reference, the Spark Cassandra Connector code linked below shows how it controls the number of tasks created during a full table scan.
https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/partitioner/CassandraPartitionGenerator.scala
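
To make the request concrete, here is a minimal, language-neutral sketch in Python of the kind of merging I have in mind. It is not FiloDB or connector code: the `(start, end)` tuple representation, the `merge_token_ranges` function, and the `max_tasks` parameter are all hypothetical names chosen for illustration. It assumes the input ranges are contiguous and groups them purely by count, whereas a real implementation would probably also want to weigh estimated data size per range.

```python
from typing import List, Tuple

# Hypothetical representation: a token range as a (start, end) pair.
TokenRange = Tuple[int, int]


def merge_token_ranges(ranges: List[TokenRange], max_tasks: int) -> List[TokenRange]:
    """Merge contiguous token ranges into at most `max_tasks` larger ranges."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    ordered = sorted(ranges)
    if not ordered:
        return []

    # Ceiling division: how many original ranges go into each merged range.
    per_task = -(-len(ordered) // max_tasks)

    merged: List[TokenRange] = []
    for i in range(0, len(ordered), per_task):
        chunk = ordered[i:i + per_task]
        # Ranges inside one chunk must touch end-to-start, otherwise the
        # merged range would cover tokens that were never requested.
        for (_, prev_end), (next_start, _) in zip(chunk, chunk[1:]):
            if prev_end != next_start:
                raise ValueError("token ranges must be contiguous")
        merged.append((chunk[0][0], chunk[-1][1]))
    return merged


if __name__ == "__main__":
    ranges = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
    # Five small ranges collapsed into at most two scan tasks.
    print(merge_token_ranges(ranges, max_tasks=2))
```

With a setting like `max_tasks`, a full table scan would create one task per merged range instead of one per original token range, which is the control this issue is asking for.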