Unable to run Elly on Mapr Yarn cluster

Open SavageReader opened this issue 9 years ago • 5 comments

When trying to start the Julia cluster manager on a MapR Yarn cluster, I get the following error:

ERROR: Elly.HadoopRpcException(2,"DIGEST-MD5: digest response format violation. Mismatched URI: default/; expecting: null/default")
 in recv_rpc_message at /home/inikolae/.julia/v0.4/Elly/src/rpc.jl:256
 in sasl_auth at /home/inikolae/.julia/v0.4/Elly/src/sasl.jl:156
 in conditional_sasl_auth at /home/inikolae/.julia/v0.4/Elly/src/sasl.jl:105
 in connect at /home/inikolae/.julia/v0.4/Elly/src/rpc.jl:193
 in send_rpc_message at /home/inikolae/.julia/v0.4/Elly/src/rpc.jl:228
 in call_method at /home/inikolae/.julia/v0.4/Elly/src/rpc.jl:279
 [inlined code] from /home/inikolae/.julia/v0.4/Elly/src/api_yarn_appmaster.jl:60
 in register at /home/inikolae/.julia/v0.4/Elly/src/api_yarn_appmaster.jl:103
 in submit at /home/inikolae/.julia/v0.4/Elly/src/api_yarn_appmaster.jl:91
 in call at /home/inikolae/.julia/v0.4/Elly/src/cluster_manager.jl:28

The cluster functions well otherwise and the cluster manager is given the right address and port numbers.

SavageReader, Jan 15 '16 16:01

Possibly a difference in authentication protocol that's not handled. I don't have a MapR installation readily available. Will set one up locally to check this out. Thanks for raising the issue.

tanmaykm, Jan 16 '16 01:01

I can confirm that this happens on a local MapR sandbox setup.

Changing the digest-uri to "null/default", as expected by MapR, avoids this problem. But I'm not sure why MapR expects this particular value.

Elly constructs the digest-uri as protocol/serverid. This seems correct as per RFC 2831.
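For illustration, here is a minimal sketch of how a digest-uri of that form enters the DIGEST-MD5 response defined in RFC 2831 (qop=auth, no authzid). It is written in Python rather than Julia and is not Elly's code: the function names `digest_uri` and `digest_md5_response`, the credentials, and the nonces are all hypothetical placeholders. The protocol/serverid split used for the two URIs is only one guess that reproduces the strings in the error message.

```python
import hashlib


def digest_uri(protocol: str, server_id: str) -> str:
    """Join the two parts as protocol/serverid (hypothetical helper)."""
    return f"{protocol}/{server_id}"


def digest_md5_response(username, realm, password, nonce, cnonce, uri,
                        nc="00000001", qop="auth"):
    """Compute the RFC 2831 response value for qop=auth without an authzid."""
    def enc(text):
        return text.encode("utf-8")

    # A1 = H(username:realm:password) ":" nonce ":" cnonce  (H is raw MD5)
    a1 = hashlib.md5(enc(f"{username}:{realm}:{password}")).digest()
    a1 += enc(f":{nonce}:{cnonce}")
    # A2 = "AUTHENTICATE:" digest-uri, so the URI is part of the hash input
    a2 = enc(f"AUTHENTICATE:{uri}")

    ha1 = hashlib.md5(a1).hexdigest()
    ha2 = hashlib.md5(a2).hexdigest()
    # response = HEX(H(HEX(H(A1)) ":" nonce ":" nc ":" cnonce ":" qop ":" HEX(H(A2))))
    return hashlib.md5(enc(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")).hexdigest()


# The two URIs from the error message; the split into parts is an assumption.
sent_uri = digest_uri("default", "")          # "default/"
expected_uri = digest_uri("null", "default")  # "null/default"

# Placeholder credentials and nonces, for illustration only.
common = dict(username="user", realm="default", password="secret",
              nonce="server-nonce", cnonce="client-nonce")
sent_response = digest_md5_response(uri=sent_uri, **common)
expected_response = digest_md5_response(uri=expected_uri, **common)
```

Because the digest-uri is both sent in the clear and hashed into the response, a client that switches to the value the server expects has to use that same value in both places.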

Is there any way to find out how the MapR server constructs this field?

tanmaykm, Jan 16 '16 22:01

Thanks for looking into it. I guess there are two possible ways: check the MapR source code (I don't know how much work that would be), or file a bug report with MapR, since you say that you followed the protocol RFC.

SavageReader, Jan 18 '16 12:01

Hello guys,

I'm currently getting the same kind of error when I try to submit a Spark application from a remote machine to a MapR cluster (the remote machine is not part of the cluster).

I would like to change the DIGEST-MD5 URI value, but I can't find where it is set...

Could you please help me :)

Thanks!

rvelfre, May 12 '16 12:05

@HERVEKIRK it is set in the digmd5_respond method in the file sasl.jl, here: https://github.com/JuliaParallel/Elly.jl/blob/726d72398bd7df0c7a5ce47f4652de69bdb0c8aa/src/sasl.jl#L69

tanmaykm, May 12 '16 13:05