Beginner's question: Cannot connect to cluster

Open Para7etamol opened this issue 1 year ago • 0 comments

Hi there,

I'm running a Hadoop cluster (v3.2.1) using https://github.com/big-data-europe/docker-hadoop.

I can run a Java program to test the existence of a file on the cluster's HDFS.

But I cannot do this using Elly.jl:

using Elly

hdfs = HDFSClient("localhost", 9000, UserGroupInformation())
exists(hdfs, "/")

yields the following exception in the namenode log:

namenode         | java.lang.NullPointerException
namenode         | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logAuditEvent(FSNamesystem.java:405)
namenode         | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logAuditEvent(FSNamesystem.java:377)
namenode         | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logAuditEvent(FSNamesystem.java:371)
namenode         | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3197)
namenode         | 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1173)
namenode         | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:979)
namenode         | 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
namenode         | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
namenode         | 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
namenode         | 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
namenode         | 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2917)

I'm using Julia 1.10.4 and Elly 0.5.1.

To access HDFS from Java I had to add some dependencies (hadoop-common, hadoop-hdfs, hadoop-hdfs-client) and copy core-site.xml and hdfs-site.xml from the container into the resources dir of the Java application. Optionally, I added lib/native from the container's /opt/hadoop dir to LD_LIBRARY_PATH to prevent the warning "WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable".
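As far as I can tell, the only part of those files that maps directly onto the HDFSClient call is the namenode address in core-site.xml, since the constructor takes a host and a port. Below is a minimal sketch of how the two could be compared, using only Base Julia. The helper names default_fs and namenode_endpoint are hypothetical, the sample XML is a made-up stand-in for the real file, and the regex assumes the usual layout where <name> is immediately followed by <value>. The fallback port 8020 is the Hadoop default for the namenode RPC port when the URI gives none.

```julia
# Hypothetical helper: extract the fs.defaultFS value from the text of a core-site.xml.
# Assumes <name> is directly followed by <value>, as in typical Hadoop config files.
function default_fs(xml::AbstractString)
    m = match(r"<name>\s*fs\.defaultFS\s*</name>\s*<value>\s*([^<\s]+)\s*</value>", xml)
    return m === nothing ? nothing : String(m[1])
end

# Hypothetical helper: split an hdfs:// URI into (host, port).
# Falls back to 8020 when the URI carries no explicit port.
function namenode_endpoint(uri::AbstractString)
    m = match(r"^hdfs://([^:/]+)(?::(\d+))?", uri)
    m === nothing && error("not an hdfs:// URI: $uri")
    host = String(m[1])
    port = m[2] === nothing ? 8020 : parse(Int, m[2])
    return host, port
end

# Made-up sample standing in for the core-site.xml copied out of the container.
sample = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
"""

host, port = namenode_endpoint(default_fs(sample))
println(host, " ", port)
```

With the real file, read(path, String) would replace the sample string. The resulting pair could then be checked against the "localhost", 9000 passed to HDFSClient above, keeping in mind that a hostname valid inside the container may not resolve from the host machine. This only confirms the address; it does not by itself explain the NullPointerException.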

I suppose I have to perform at least some of the above "Java" steps when using Elly.jl, but which ones, and how?

Please help, I would LOVE to use Hadoop from Julia.

Greetings, Para

Para7etamol, Nov 17 '24 16:11