
[Python] `pyarrow.fs.HadoopFileSystem` throws OSError: Unable to load libhdfs

Status: Open • matthiasgomolka opened this issue 1 year ago • 2 comments

Describe the bug, including details regarding any error messages, version, and platform.

I'm trying to create an HDFS connection via `pyarrow.fs.HadoopFileSystem`, but I get an error:

from pyarrow.fs import HadoopFileSystem
hdfs = HadoopFileSystem(
    host="localhost",
    port=8001,
)

OSError: Unable to load libhdfs: Das angegebene Modul wurde nicht gefunden. [German for: "The specified module could not be found."]

From https://arrow.apache.org/docs/python/filesystems.html#hadoop-distributed-file-system-hdfs I understand that `libhdfs.so` should be located in `%HADOOP_HOME%/lib/native/`, which is the case. I also set the `CLASSPATH` environment variable to `%HADOOP_HOME%/bin/hadoop`.
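The setup described above can be sanity-checked before constructing the filesystem. The following is a minimal sketch, not part of pyarrow: `check_hdfs_env` and `LIB_NAMES` are hypothetical names, and the per-platform library file names are an assumption. The linked documentation lists `HADOOP_HOME`, `JAVA_HOME`, `CLASSPATH` and the optional `ARROW_LIBHDFS_DIR`, and describes `CLASSPATH` as containing the Hadoop jars (the output of `hadoop classpath --glob`), so the sketch also flags, as a rough heuristic only, a `CLASSPATH` that mentions no `.jar` files.

```python
import os
from pathlib import Path

# Assumption: the native library is named differently per platform
# (libhdfs.so on Linux, libhdfs.dylib on macOS, hdfs.dll on Windows).
LIB_NAMES = ("libhdfs.so", "libhdfs.dylib", "hdfs.dll")


def check_hdfs_env(env=None):
    """Return a list of human-readable problems found in the environment.

    Hypothetical helper for illustration; it only inspects variables and
    files and never tries to load the library itself.
    """
    env = os.environ if env is None else env
    problems = []

    # Variables named in the Arrow filesystem documentation.
    for var in ("HADOOP_HOME", "JAVA_HOME", "CLASSPATH"):
        if not env.get(var):
            problems.append(f"{var} is not set")

    # Heuristic only: a CLASSPATH built from `hadoop classpath --glob`
    # normally lists jar files.
    classpath = env.get("CLASSPATH", "")
    if classpath and ".jar" not in classpath.lower():
        problems.append("CLASSPATH does not appear to list any .jar files")

    # This sketch looks in ARROW_LIBHDFS_DIR if it is set,
    # otherwise in HADOOP_HOME/lib/native.
    lib_dir = env.get("ARROW_LIBHDFS_DIR")
    if not lib_dir and env.get("HADOOP_HOME"):
        lib_dir = str(Path(env["HADOOP_HOME"]) / "lib" / "native")
    if lib_dir and not any((Path(lib_dir) / n).is_file() for n in LIB_NAMES):
        problems.append(f"no libhdfs library found in {lib_dir}")

    return problems


if __name__ == "__main__":
    # Print any findings for the current process environment.
    for problem in check_hdfs_env():
        print(problem)
```

An empty result does not guarantee that the library loads (a missing dependency of the library itself would not be detected), but a non-empty one narrows down where to look first.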

What am I missing?

I use pyarrow==15.0.0.

Component(s)

Python

matthiasgomolka, Mar 01 '24 12:03

Still watching this :")

oendnsk675, Oct 18 '24 11:10

@oendnsk675 Could you share what you did and what happened?

kou, Oct 18 '24 13:10