pydrill
zookeeper support via initialization inside pydrill
Hi Wojtek,
I was just checking out your pydrill module and it looks great -- I'd love to use it in a project I'm working on. One question though, I didn't see any documentation about connecting to a Zookeeper quorum rather than a specific drillbit. Is this currently supported?
Thanks, Dan
Hello!
Currently pyDrill doesn't support ZooKeeper. You first have to use the ZooKeeper bindings [1] to determine which bits are running and then connect to one of them. State is shared across all bits through ZooKeeper, so any change to settings will take effect on every bit.
I can add ZooKeeper support so that, before each query, pyDrill asks the quorum for the IPs of the connected bits.
Here is an example of how to determine the IPs of the bits:
import zc.zk
zk = zc.zk.ZooKeeper('127.0.0.1:2181')
zk.get_children('/drill')
# [u'sys.options', u'running', u'sys.storage_plugins', u'drillbit1']
zk.get_children('/drill/drillbit1')
# [u'faa0c8a3-b569-4280-bf04-53f4b76c93e4', u'aacaf088-d72a-4b7f-ae34-f15fad2cddef']
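A minimal sketch of the "pick one bit before the query" step, assuming only that the client object exposes get_children as in the listing above. The names pick_drillbit and FakeZooKeeper are hypothetical and not part of pyDrill or zc.zk. The fake client stands in for a real quorum so the example runs on its own. Note that the listing returns znode names rather than addresses, so this only chooses one of those names and does not resolve it to a host.

```python
import random


def pick_drillbit(zk, cluster_path='/drill/drillbit1', rng=random):
    """Return the name of one registered drillbit znode, chosen at random.

    `zk` is any object with a get_children(path) method.
    `cluster_path` is taken from the listing in this thread (assumption).
    """
    children = zk.get_children(cluster_path)
    if not children:
        raise RuntimeError('no drillbits registered under %s' % cluster_path)
    return rng.choice(list(children))


class FakeZooKeeper(object):
    """Hypothetical stand-in for zc.zk.ZooKeeper, used only for this sketch."""

    def __init__(self, tree):
        self.tree = tree

    def get_children(self, path):
        return self.tree.get(path, [])


fake = FakeZooKeeper({
    '/drill/drillbit1': [
        'faa0c8a3-b569-4280-bf04-53f4b76c93e4',
        'aacaf088-d72a-4b7f-ae34-f15fad2cddef',
    ],
})
chosen = pick_drillbit(fake)
```

Picking at random spreads queries across bits without any extra state; a real implementation would probably also want to retry with another bit if the chosen one is unreachable.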
I think it could be supported by an environment variable or by a parameter used to initialize pyDrill, for example: PYDRILL_ZOOKEEPER = ['127.0.0.1:2181', '127.0.0.2:2181']
Please share your ideas so that I can enhance pyDrill to support your needs.
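A rough sketch of how the two options could be combined, assuming the environment variable holds a comma-separated string (an environment variable cannot hold a Python list directly). The function name resolve_zookeeper_hosts and its env argument are hypothetical; only the PYDRILL_ZOOKEEPER name comes from the proposal above.

```python
import os


def resolve_zookeeper_hosts(hosts=None, env=None):
    """Return a list of 'host:port' strings.

    An explicit `hosts` argument (list or comma-separated string) wins;
    otherwise the PYDRILL_ZOOKEEPER environment variable is used.
    `env` defaults to os.environ and exists only to make testing easy.
    """
    env = os.environ if env is None else env
    if hosts is None:
        hosts = env.get('PYDRILL_ZOOKEEPER', '')
    if isinstance(hosts, str):
        hosts = hosts.split(',')
    # Drop surrounding whitespace and empty entries.
    return [h.strip() for h in hosts if h.strip()]


from_env = resolve_zookeeper_hosts(
    env={'PYDRILL_ZOOKEEPER': '127.0.0.1:2181, 127.0.0.2:2181'})
from_param = resolve_zookeeper_hosts(hosts=['127.0.0.1:2181'])
```

Letting the explicit parameter override the environment keeps the behaviour predictable when both are set.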
Thanks for your reply, the zc code you've provided is very helpful.
I think ZooKeeper connectivity would be a great addition to pydrill, as connecting to a single node isn't particularly suitable for a production environment. Perhaps allowing the pydrill.client.PyDrill class to take keyword arguments similar to the JDBC/ODBC drivers, something like (ConnectionType='ZooKeeper', ZKQuorum='Server1:Port1,Server2:Port2', ZKClusterID='<Cluster Name>') would be nice.
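As a rough sketch of how such keyword arguments might be translated into something a ZooKeeper client can use: the helper name zookeeper_settings, the 'Direct' default, and the '/drill/<cluster id>' path layout are assumptions, the last one taken from the listing earlier in this thread. None of this is existing pydrill API.

```python
def zookeeper_settings(ConnectionType='Direct', ZKQuorum=None, ZKClusterID=None):
    """Translate driver-style keyword arguments into ZooKeeper settings.

    Returns None for non-ZooKeeper connections, otherwise a dict with the
    quorum connection string and the znode path to list drillbits under.
    """
    if ConnectionType != 'ZooKeeper':
        return None
    if not ZKQuorum or not ZKClusterID:
        raise ValueError('ZKQuorum and ZKClusterID are required for ZooKeeper')
    # Normalise 'Server1:Port1, Server2:Port2' into a clean host list.
    hosts = [h.strip() for h in ZKQuorum.split(',') if h.strip()]
    return {
        'hosts': ','.join(hosts),
        # Assumed layout: drillbits register under /drill/<cluster id>.
        'path': '/drill/' + ZKClusterID,
    }


settings = zookeeper_settings(
    ConnectionType='ZooKeeper',
    ZKQuorum='127.0.0.1:2181, 127.0.0.2:2181',
    ZKClusterID='drillbit1',
)
```

Keeping the argument names identical to the JDBC/ODBC drivers would let users reuse existing connection settings without renaming anything.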