Can the connection timeout be configured?

Open qiangchow opened this issue 8 years ago • 9 comments

Hi,

When using hive.connect, can a timeout be configured?

cursor = hive.connect(host='xxx', port=xxx, database=xxx, auth='KERBEROS', kerberos_service_name=xxx).cursor()
cursor.execute('SELECT * FROM xxx')

I didn't see a timeout parameter. Thanks.

`class Connection(object):
    """Wraps a Thrift session"""

    def __init__(self, host=None, port=None, username=None, database='default', auth=None,
                 configuration=None, kerberos_service_name=None, password=None,
                 thrift_transport=None):
        """Connect to HiveServer2

        :param host: What host HiveServer2 runs on
        :param port: What port HiveServer2 runs on. Defaults to 10000.
        :param auth: The value of hive.server2.authentication used by HiveServer2.
            Defaults to ``NONE``.
        :param configuration: A dictionary of Hive settings (functionally same as the `set` command)
        :param kerberos_service_name: Use with auth='KERBEROS' only
        :param password: Use with auth='LDAP' only
        :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
            Incompatible with host, port, auth, kerberos_service_name, and password.`

Traceback (most recent call last):
  File "/Users/xxx/Documents/dev/venv/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 104, in open
    handle.connect(sockaddr)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 60] Operation timed out

qiangchow avatar Sep 29 '17 02:09 qiangchow

Same question. Are there any samples of the configuration parameter?

marklitle avatar May 08 '18 09:05 marklitle

Can anyone explain how to pass a timeout to the connection? Is it an element of the configuration dictionary?

Adamage avatar May 24 '18 07:05 Adamage

+1

rcmgleite avatar Sep 05 '18 17:09 rcmgleite

+1

niyanchun avatar Sep 06 '18 01:09 niyanchun

+1

Alovez avatar Oct 22 '18 09:10 Alovez

Sadly, it seems that PyHive doesn't provide this. You'll see that the socket is created here:

socket = thrift.transport.TSocket.TSocket(host, port)

One may then call the following TSocket method to set the timeout:

socket.setTimeout(timeout_ms)
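
The two calls above can be wrapped in a small helper. This is only a minimal sketch: `make_timed_socket`, `timeout_s`, and `socket_cls` are hypothetical names introduced here, and `socket_cls` exists only so a stand-in class can be substituted when thrift isn't installed. The one firm detail is that `TSocket.setTimeout` takes milliseconds, so a value in seconds has to be converted.

```python
def make_timed_socket(host, port, timeout_s=60, socket_cls=None):
    """Create a Thrift socket with a timeout given in seconds.

    Hypothetical helper for illustration; not part of PyHive or thrift.
    """
    if socket_cls is None:
        # Deferred import: the helper can be defined without thrift installed.
        from thrift.transport.TSocket import TSocket as socket_cls

    sock = socket_cls(host, port)
    # TSocket.setTimeout expects milliseconds, so convert from seconds.
    sock.setTimeout(int(timeout_s * 1000))
    return sock
```

The returned socket is not opened yet. It still needs to be wrapped in a transport that matches the server's authentication mode before it can be handed to PyHive.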

In my case, I am using PLAIN authentication, so I just implemented a little function like so:

import sasl
from thrift_sasl import TSaslClientTransport
from thrift.transport.TSocket import TSocket


def create_hive_plain_transport(host, port, username, password, timeout=60):
    # timeout is given in seconds; TSocket.setTimeout expects milliseconds.
    socket = TSocket(host, port)
    socket.setTimeout(timeout * 1000)

    sasl_auth = 'PLAIN'

    def sasl_factory():
        sasl_client = sasl.Client()
        sasl_client.setAttr('host', host)
        sasl_client.setAttr('username', username)
        sasl_client.setAttr('password', password)
        sasl_client.init()
        return sasl_client

    return TSaslClientTransport(sasl_factory, sasl_auth, socket)

And now, when running connect, I use this function to create the thrift transport:

from pyhive import hive

hive.connect(
    thrift_transport=create_hive_plain_transport(
        host='bla',
        port=10000,
        username='me',
        password='password',
        timeout=120
    ),
    database='bla'
)
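
The original question used auth='KERBEROS' rather than PLAIN. A Kerberos variant of the same idea might look like the sketch below. It assumes the GSSAPI mechanism and a `service` SASL attribute, which to my understanding mirrors what PyHive's own Connection does for Kerberos. `create_hive_kerberos_transport` is a hypothetical name, and this has not been verified against a real cluster.

```python
def create_hive_kerberos_transport(host, port, kerberos_service_name, timeout=60):
    """Hypothetical Kerberos counterpart of the PLAIN transport function."""
    # Deferred imports so the function can be defined without the packages installed.
    import sasl
    from thrift.transport.TSocket import TSocket
    from thrift_sasl import TSaslClientTransport

    socket = TSocket(host, port)
    # timeout is in seconds; TSocket.setTimeout expects milliseconds.
    socket.setTimeout(timeout * 1000)

    def sasl_factory():
        sasl_client = sasl.Client()
        sasl_client.setAttr('host', host)
        # Assumption: 'service' carries the Kerberos service name (e.g. 'hive').
        sasl_client.setAttr('service', kerberos_service_name)
        sasl_client.init()
        return sasl_client

    return TSaslClientTransport(sasl_factory, 'GSSAPI', socket)
```

As with the PLAIN version, the result is passed as thrift_transport to hive.connect, leaving out host, port, auth, and kerberos_service_name, since the docstring says those are incompatible with thrift_transport. A valid Kerberos ticket is presumably still required on the client.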

See the following code in PyHive for inspiration (as I did) :smile:

I picked up this approach from the pyhs2 Connection constructor.

Hope this helps someone :smile: Fotis

fgimian avatar Jan 15 '19 23:01 fgimian

Any plans to add a timeout param to hive.connect ?

wilberh avatar Jul 21 '20 14:07 wilberh

Have you any new insights about this little config?

AmineBenami avatar Nov 03 '20 08:11 AmineBenami

I tried adding the timeout argument and value, but it failed.

darrkz avatar Dec 22 '21 03:12 darrkz