PyHive and Transport mode - HTTP

Open gregorysuarez opened this issue 9 years ago • 13 comments

Our server is configured with hive.server2.transport.mode set to HTTP. When we switch to binary, everything seems to work perfectly. Is there a way to make PyHive work with HTTP transport mode?

gregorysuarez avatar Jul 08 '16 22:07 gregorysuarez

Sorry, there is no code for HTTP mode.

jingw avatar Jul 08 '16 23:07 jingw

Is there any timeline or priority for when this will be available?

zscalerspark avatar May 04 '17 20:05 zscalerspark

I'm very interested in this functionality too, as I must start using HTTP mode for Apache Knox. Is it something that can be planned?

jrevillard avatar Mar 21 '18 13:03 jrevillard

Me too, I must use Apache Knox to authenticate with Hive. Is this currently possible with PyHive?

jbreija avatar Apr 25 '18 17:04 jbreija

Likewise, I'd also be interested in this functionality!

Polar-is avatar May 25 '18 10:05 Polar-is

Any updates on HTTP support?

rohit-menon avatar Oct 26 '18 18:10 rohit-menon

Please add HTTP support. I am trying to use this code in a prod environment where other teams also connect through JDBC and pyspark, so I cannot switch the server to binary mode if that would break their connections.

Neelotpaul avatar Apr 22 '19 10:04 Neelotpaul

+1. Can you please add HTTP support for PyHive?

rashmigulhane avatar May 07 '19 07:05 rashmigulhane

+1

modeyang avatar Feb 20 '20 05:02 modeyang

Hi all, just created a PR to add support for Thrift connections over HTTP transport. You can follow its progress here: https://github.com/dropbox/PyHive/pull/325

joaopedroantonio avatar Apr 04 '20 15:04 joaopedroantonio

Our server is configured with hive.server2.transport.mode set to HTTP. When we switch to binary, everything seems to work perfectly. Is there a way to make PyHive work with HTTP transport mode?

So as an end user, rather than someone who configures the Hive datastore, I likely cannot do this through my connection string.

e.g.

con = hive.Connection(host=hive_host, port=10000, username=hive_username, auth='NOSASL')

Docstring for Hive Connection:

Init signature:
hive.Connection(
    host=None,
    port=None,
    username=None,
    database='default',
    auth=None,
    configuration=None,
    kerberos_service_name=None,
    password=None,
    thrift_transport=None,
)
Docstring:      Wraps a Thrift session
Init docstring:
Connect to HiveServer2

:param host: What host HiveServer2 runs on
:param port: What port HiveServer2 runs on. Defaults to 10000.
:param auth: The value of hive.server2.authentication used by HiveServer2.
    Defaults to ``NONE``.
:param configuration: A dictionary of Hive settings (functionally same as the `set` command)
:param kerberos_service_name: Use with auth='KERBEROS' only
:param password: Use with auth='LDAP' or auth='CUSTOM' only
:param thrift_transport: A ``TTransportBase`` for custom advanced usage.
    Incompatible with host, port, auth, kerberos_service_name, and password.

The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
/impala/_thrift_api.py#L152-L160
File:           ~/miniconda3/envs/spark_2_4_4/lib/python3.8/site-packages/pyhive/hive.py
Type:           type
Subclasses:   
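
The thrift_transport parameter in the docstring above suggests one possible workaround: build an HTTP transport yourself and hand it to hive.Connection. The following is only a rough sketch, not something confirmed in this thread. It assumes the server uses the usual HiveServer2 HTTP defaults (port 10001, path cliservice) and accepts HTTP Basic authentication; the helper names http_endpoint, basic_auth_header and connect_over_http are made up for illustration. Per the docstring, host, port, auth and password must not be passed together with thrift_transport.

```python
import base64


def http_endpoint(host, port=10001, http_path="cliservice", use_ssl=False):
    """Build the HiveServer2 HTTP endpoint URL.

    The port and path defaults are assumptions; check
    hive.server2.thrift.http.port and hive.server2.thrift.http.path
    on your server.
    """
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}/{http_path.lstrip('/')}"


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header (assumes the server accepts it)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def connect_over_http(host, username, password, **endpoint_kwargs):
    """Hypothetical helper: open a PyHive connection over an HTTP transport."""
    # Imported lazily so the two helpers above work without pyhive/thrift installed.
    from pyhive import hive
    from thrift.transport import THttpClient

    transport = THttpClient.THttpClient(http_endpoint(host, **endpoint_kwargs))
    transport.setCustomHeaders(basic_auth_header(username, password))
    # thrift_transport replaces host/port/auth/password, so none of those are passed.
    return hive.Connection(thrift_transport=transport, username=username)
```

Usage would look like con = connect_over_http(hive_host, hive_username, hive_password), followed by con.cursor() as usual. Whether this works behind Apache Knox or with Kerberos depends on the gateway's path and authentication setup, which this sketch does not cover.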

pauldevos avatar Jul 28 '20 14:07 pauldevos

@pauldevos, thanks a lot for the above snippet. Can we log all the HTTP headers in the Spark Thrift server?

RamakrishnaChilaka avatar Aug 04 '21 10:08 RamakrishnaChilaka

Hi, are there any updates on this?

danjampro avatar Nov 28 '22 20:11 danjampro