Integration of BlinkDB directly with Spark
Hi,
We are evaluating BlinkDB to support interactive count(*) queries with various filters on a table. According to https://github.com/sameeragarwal/blinkdb/wiki/Running-BlinkDB-on-a-Cluster, we need to "Copy the Spark and BlinkDB directories to slaves". However, from a few BlinkDB talks I have seen, it looks like only the Spark client needs to be modified, so that it rewrites the query plan to use sample tables instead of the original table. That would mean we only need to build and install BlinkDB on one machine. Please correct me if I am wrong about this.
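To make the use case concrete, here is a rough sketch of what we mean by answering a filtered count(*) from a sample table. It is plain Python written only for illustration, not BlinkDB or Spark code: the `approx_count` helper, the column names, and the 5% sampling rate are all hypothetical, and we assume a simple uniform sample whose matching-row count is scaled by the inverse of the sampling rate.

```python
import random


def approx_count(rows, predicate, sample_rate, seed=0):
    """Estimate how many rows satisfy `predicate` using a uniform random sample.

    Hypothetical helper for illustration only; this is not BlinkDB's API.
    """
    rng = random.Random(seed)
    # Build the "sample table": keep each row with probability sample_rate.
    sample = [row for row in rows if rng.random() < sample_rate]
    # Run the filtered count on the sample only.
    matches = sum(1 for row in sample if predicate(row))
    # Scale up by the inverse sampling rate to estimate the full-table count.
    return matches / sample_rate


# Made-up data standing in for the original table.
rows = [
    {"country": "IN" if i % 4 == 0 else "US", "age": i % 80}
    for i in range(100_000)
]


def is_match(row):
    # Equivalent of: WHERE country = 'IN' AND age > 30
    return row["country"] == "IN" and row["age"] > 30


exact = sum(1 for row in rows if is_match(row))
estimate = approx_count(rows, is_match, sample_rate=0.05)
print(f"exact={exact}, estimate={estimate:.0f}")
```

The estimate trades some accuracy for scanning only about 5% of the rows, which is the behaviour we are hoping to get from sample tables.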
Since our organization is large, it would be difficult to ask our infrastructure team to apply a patch to Spark to support BlinkDB. We are currently using Spark 2.1.0. It would be easier for us to ask the infrastructure team to upgrade the Spark version, so we are wondering when native support for BlinkDB will be available in Spark.
Also, we would be grateful if someone could point us to documentation for building a proof of concept around BlinkDB.