
Unable to set http header with thrift version 0.15.0

Open RamakrishnaChilaka opened this issue 3 years ago • 2 comments

RamakrishnaChilaka avatar Oct 05 '21 17:10 RamakrishnaChilaka

We also found this using PyHive 0.6.4 and thrift 0.15.0.

To replicate the issue, you can create a Hive session using an HTTP transport with the custom header that HiveServer2 requires for authorization:

from pyhive import hive
import base64
import thrift.transport.THttpClient

def thrift_http_transport():
    transport = thrift.transport.THttpClient.THttpClient(uri_or_host='http://localhost:12001/cliservice')

    auth_credentials = '{}:{}'.format('-', '-').encode('UTF-8')
    auth_credentials_base64 = base64.standard_b64encode(auth_credentials).decode('UTF-8')
    transport.setCustomHeaders(
        {
            'Authorization': 'Basic {}'.format(auth_credentials_base64),  # HiveServer2 BASIC auth
        }
    )
    return transport

conn = hive.connect(thrift_transport=thrift_http_transport())
cursor = conn.cursor()
cursor.execute("""SELECT SUM(1) from model.dim_date""")
data = cursor.fetchall()
cursor.close()
conn.close()
print(data)

With thrift 0.13.0 this code executes fine, but with 0.15.0 the following error/traceback is seen:

Traceback (most recent call last):
  File "test.py", line 17, in <module>
    conn = hive.connect(thrift_transport=thrift_http_transport())
  File "/versions/3.8.7/envs/lib/python3.8/site-packages/pyhive/hive.py", line 104, in connect
    return Connection(*args, **kwargs)
  File "/versions/3.8.7/envs/lib/python3.8/site-packages/pyhive/hive.py", line 249, in __init__
    response = self._client.OpenSession(open_session_req)
  File "/versions/3.8.7/envs/lib/python3.8/site-packages/TCLIService/TCLIService.py", line 186, in OpenSession
    self.send_OpenSession(req)
  File "/versions/3.8.7/envs/lib/python3.8/site-packages/TCLIService/TCLIService.py", line 195, in send_OpenSession
    self._oprot.trans.flush()
  File "/versions/3.8.7/envs/lib/python3.8/site-packages/thrift/transport/THttpClient.py", line 191, in flush
    self.__http.putheader('Cookie', self.headers['Set-Cookie'])
  File "/versions/3.8.7/lib/python3.8/http/client.py", line 1217, in putheader
    raise CannotSendHeader()
http.client.CannotSendHeader
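
For context, http.client.CannotSendHeader is what the standard library raises when putheader() is called while the connection is not in the header-writing phase of a request (for example, before putrequest() has been called). The sketch below reproduces only that stdlib behaviour. The helper name putheader_allowed is made up for illustration, and this does not claim to pinpoint the exact code path inside THttpClient.flush that ends up in that state.

```python
import http.client


def putheader_allowed(start_request):
    """Return True if putheader() is accepted, False if it raises CannotSendHeader.

    Hypothetical helper for illustration only. No network traffic happens here:
    HTTPConnection connects lazily, and putrequest()/putheader() only buffer data.
    """
    conn = http.client.HTTPConnection("localhost", 12001)
    if start_request:
        # Moves the connection into the "request started" state,
        # which is the only state in which headers may be added.
        conn.putrequest("POST", "/cliservice")
    try:
        conn.putheader("Cookie", "example=1")
    except http.client.CannotSendHeader:
        return False
    return True


print(putheader_allowed(start_request=True))
print(putheader_allowed(start_request=False))
```

The second call fails in the same way as the last frame of the traceback above: a header is being written at a point where http.client does not consider a request to be open.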

This appears to be related to changes made for https://issues.apache.org/jira/browse/THRIFT-5165

Related PR: https://github.com/apache/thrift/pull/2086
Related Commit: https://github.com/apache/thrift/commit/69642f389a06f5ba1b374de52c6b0e29892035d8
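
If it helps anyone checking their own environment, here is a minimal sketch that looks up the installed thrift version and compares it with the two versions mentioned in this comment. The names REPORTED, reported_status and installed_thrift_status are hypothetical, and the table contains only what was observed here (0.13.0 works, 0.15.0 fails); any other version is reported as "unknown" rather than guessed.

```python
from importlib import metadata  # available in Python 3.8+

# Only the two versions tested in this report; nothing else is assumed.
REPORTED = {"0.13.0": "works", "0.15.0": "fails"}


def reported_status(version):
    """Map a thrift version string to the status observed in this report."""
    return REPORTED.get(version, "unknown")


def installed_thrift_status():
    """Return (version, status) for the thrift package in this environment."""
    try:
        version = metadata.version("thrift")
    except metadata.PackageNotFoundError:
        return None, "not installed"
    return version, reported_status(version)


print(installed_thrift_status())
```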

gthomas-slack avatar Nov 04 '21 17:11 gthomas-slack

There is an open issue on Jira which is related to this problem.

csordasmarton avatar Dec 14 '21 12:12 csordasmarton