Can a connection timeout be configured?
Hi,
When using hive.connect, can a timeout be configured?
cursor = hive.connect(host='xxx', port=xxx, database=xxx, auth='KERBEROS', kerberos_service_name=xxx).cursor()
cursor.execute('SELECT * FROM xxx')
I didn't see a timeout parameter. Thanks.
`class Connection(object):
    """Wraps a Thrift session"""

    def __init__(self, host=None, port=None, username=None, database='default', auth=None,
                 configuration=None, kerberos_service_name=None, password=None,
                 thrift_transport=None):
        """Connect to HiveServer2
        :param host: What host HiveServer2 runs on
        :param port: What port HiveServer2 runs on. Defaults to 10000.
        :param auth: The value of hive.server2.authentication used by HiveServer2.
            Defaults to ``NONE``.
        :param configuration: A dictionary of Hive settings (functionally same as the `set` command)
        :param kerberos_service_name: Use with auth='KERBEROS' only
        :param password: Use with auth='LDAP' only
        :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
            Incompatible with host, port, auth, kerberos_service_name, and password.`
Traceback (most recent call last):
  File "/Users/xxx/Documents/dev/venv/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 104, in open
    handle.connect(sockaddr)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 60] Operation timed out
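The traceback above comes from the OS-level TCP connect, before any Hive or Thrift logic runs. As a rough way to check whether the host and port are reachable within a bounded wait, independent of PyHive, something like the following standard-library sketch may help. The name `can_connect` and the 5-second default are illustrative choices, not part of PyHive or Thrift, and the snippet assumes Python 3.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout_s seconds.

    Hypothetical helper for illustration; it only tests reachability and
    does not speak the Thrift/HiveServer2 protocol.
    """
    try:
        # create_connection applies timeout_s to the connect attempt itself.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

If this returns False for the HiveServer2 host and port, the problem is likely network reachability rather than anything PyHive can configure.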
Same question. Are there any samples of the configuration parameter?
Can anyone detail how to pass timeout to connection? Is it a configuration dictionary element?
+1
+1
+1
Sadly, it seems that PyHive doesn't provide this. You'll see that the socket is created here:
socket = thrift.transport.TSocket.TSocket(host, port)
One may then call the following TSocket method to set the timeout:
socket.setTimeout(timeout_ms)
In my case, I am using PLAIN authentication, so I just implemented a little function like so:
import sasl
from thrift_sasl import TSaslClientTransport
from thrift.transport.TSocket import TSocket

def create_hive_plain_transport(host, port, username, password, timeout=60):
    socket = TSocket(host, port)
    socket.setTimeout(timeout * 1000)

    sasl_auth = 'PLAIN'

    def sasl_factory():
        sasl_client = sasl.Client()
        sasl_client.setAttr('host', host)
        sasl_client.setAttr('username', username)
        sasl_client.setAttr('password', password)
        sasl_client.init()
        return sasl_client

    return TSaslClientTransport(sasl_factory, sasl_auth, socket)
And now, when running connect, I use this function to create the thrift transport:
hive.connect(
    thrift_transport=create_hive_plain_transport(
        host='bla',
        port=10000,
        username='me',
        password='password',
        timeout=120
    ),
    database='bla'
)
See the following code in PyHive for inspiration (as I did) :smile:
I noticed this approach from the pyhs2 Connection constructor.
Hope this helps someone :smile: Fotis
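Since the original question uses auth='KERBEROS', the same idea can probably be adapted by switching the SASL mechanism to GSSAPI and setting the service name instead of a username and password. The sketch below is untested against a real Kerberized cluster; the function name `create_hive_kerberos_transport` is made up for illustration, and the 'service' attribute is assumed to match what PyHive sets internally for Kerberos.

```python
def create_hive_kerberos_transport(host, port, kerberos_service_name, timeout=60):
    """Build a SASL/GSSAPI Thrift transport whose socket times out after `timeout` seconds.

    Hypothetical helper, modeled on the PLAIN example in this thread.
    """
    # Imported lazily so this module can be loaded without the Hive client dependencies.
    import sasl
    from thrift_sasl import TSaslClientTransport
    from thrift.transport.TSocket import TSocket

    socket = TSocket(host, port)
    socket.setTimeout(timeout * 1000)  # TSocket.setTimeout expects milliseconds

    def sasl_factory():
        sasl_client = sasl.Client()
        sasl_client.setAttr('host', host)
        # Assumption: 'service' carries the Kerberos service name (e.g. 'hive').
        sasl_client.setAttr('service', kerberos_service_name)
        sasl_client.init()
        return sasl_client

    return TSaslClientTransport(sasl_factory, 'GSSAPI', socket)
```

The returned object would be passed as thrift_transport= to hive.connect, exactly as in the PLAIN example, leaving out host, port, auth, and kerberos_service_name since the docstring says they are incompatible with thrift_transport.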
Any plans to add a timeout param to hive.connect?
Have you any new insights about this little config?
I tried to add a timeout argument and value, but it failed.