Brad Miro

Results: 26 comments by Brad Miro

Thanks for your response @mn-mikke. After following your suggestion, I do now see `./sparkling-water/dist/build/dist/sparkling-water-3.30.0.6-1-2.4.zip`. Unfortunately, `hc = H2OContext.getOrCreate()` still generates the same error as above.

Thanks @jakubhava, I tried your suggestion, both with `-Pspark=2.4` and without, and unfortunately the error persists. And @mn-mikke is correct in that Spark 2.4.5 / Scala 2.12 is tied to...

Hey @mn-mikke, apologies here. I just tried @jakubhava's suggestion, which unfortunately didn't work. I tried with release 3.30.1.1-1 and Spark 2.4.6, which is now available on Dataproc 1.5 (up from...

Hey @mn-mikke, we ended up not finding a sensible solution on our side, so we now just advise users to skip this version of Dataproc, which should be fine. Thanks...

I've also tried with MPI but am running into issues with the canonical job hanging.

It just seems to hang, and I'm not able to see any output. I think that may also be another part of my problem; I am not able to see any job...

Do you happen to know where Horovod stores its logs? I can't seem to find them, and they may provide more insight.

Hey @OscarDPan, Dataproc does not come bundled with its own build of MPI, but thanks for the flag suggestion; I'll try running it on my end. Hey @tgaddair...

@tgaddair I tried that with Gloo and it didn't work. Is redirecting stdout/stderr with Gloo supported? I was digging into your source code and saw it may not be: https://github.com/horovod/horovod/blob/a9dea74abc1f0b8e81cd2b6dd9fe81e2c4244e39/horovod/spark/runner.py#L152...

Thanks @OscarDPan for the MPI tip; this worked for me. MPI correctly shows stdout and stderr. As for the redirect, the following worked for me and correctly wrote the output to...