spark-redis
"Keys" command is super slow
When I run keys MYTABLE:* through the CLI, it takes 17 seconds.
When I run spark.sparkContext.fromRedisKeyPattern(keyPattern = s"$MYTABLE:*"), it has still not completed after 10 minutes.
Why is there such a huge discrepancy?
Hi @leobenkel,
fromRedisKeyPattern() uses SCAN internally. How many keys do you have in total, and how many match your pattern? Does it work in general with a smaller number of keys, or does it just hang? What are your Redis and Spark cluster sizes? Is there anything in the logs?
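For context on why SCAN and KEYS can behave so differently: SCAN walks the keyspace in batches. Each call examines roughly COUNT entries (10 by default in Redis) and returns only the ones that match, so the number of round trips grows with the total number of keys, not with the number of matches. KEYS makes one server-side pass and replies once. The sketch below is a toy in-memory model of that difference, not Redis or spark-redis code: the ScanSketch object, the keyspace contents, and the integer cursor are all assumptions made for illustration (a real Redis cursor is opaque), and it does not establish what caused the slowdown in this particular case.

```scala
// Toy model of cursor-based iteration versus a single pass.
// Everything here is hypothetical; no Redis server or client is involved.
object ScanSketch {
  // Assumed keyspace: 100,000 keys, of which every 1,000th matches "MYTABLE:".
  val keyspace: Vector[String] =
    Vector.tabulate(100000)(i => if (i % 1000 == 0) s"MYTABLE:$i" else s"OTHER:$i")

  // KEYS-like call: one request, one pass over every key.
  def keysLike(prefix: String): Vector[String] =
    keyspace.filter(_.startsWith(prefix))

  // SCAN-like step: examine `count` keys starting at `cursor`, return the
  // matches and the next cursor. A next cursor of 0 means the walk is done.
  def scanStep(cursor: Int, prefix: String, count: Int): (Int, Vector[String]) = {
    val end = math.min(cursor + count, keyspace.size)
    val matches = keyspace.slice(cursor, end).filter(_.startsWith(prefix))
    (if (end >= keyspace.size) 0 else end, matches)
  }

  // Full SCAN-like walk: returns all matches and the number of calls made.
  def scanAll(prefix: String, count: Int): (Vector[String], Int) = {
    @annotation.tailrec
    def loop(cursor: Int, calls: Int, acc: Vector[String]): (Vector[String], Int) = {
      val (next, batch) = scanStep(cursor, prefix, count)
      if (next == 0) (acc ++ batch, calls + 1)
      else loop(next, calls + 1, acc ++ batch)
    }
    loop(0, 0, Vector.empty)
  }

  def main(args: Array[String]): Unit = {
    val (found, calls) = scanAll("MYTABLE:", count = 10)
    println(s"SCAN-like: ${found.size} matches in $calls calls")
    println(s"KEYS-like: ${keysLike("MYTABLE:").size} matches in 1 call")
  }
}
```

In this model both approaches find the same 100 keys, but the SCAN-like walk needs 10,000 calls at a batch size of 10. Against a real server each of those calls is a network round trip, which is one plausible reason a small COUNT hurts on a large keyspace. The trade-off is that KEYS blocks the server for the whole pass, which is why SCAN is generally preferred in production.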
It works when the number of keys is small, but as I add more keys it gets slower and slower. I switched to https://github.com/debasishg/scala-redis so that I could use keys and then spark.sparkContext.parallelize(keys). It now takes 50 seconds to complete.