[FEATURE REQUEST]: Dotnet Backend IPAddress

Open indy-3rdman opened this issue 5 years ago • 19 comments

Is your feature request related to a problem? Please describe. Currently it is possible to specify a custom port for the Dotnet Backend. However it will only listen on the loopback address.

Describe the solution you'd like Would it be possible to specify the IP address as well (e.g. 0.0.0.0 for all IPv4 addresses)?

Additional context This would make things easier when using .NET for Apache Spark in a Docker container (e.g. https://hub.docker.com/r/3rdman/dotnet-spark).

indy-3rdman avatar Feb 06 '20 20:02 indy-3rdman

@indy-3rdman thanks for the suggestion. Note that DotnetRunner on the JVM side spawns the .NET application process and starts communicating, thus it's using localhost.

I didn't fully get your scenario. Are you trying to run both the C# app and Spark in the container? Or do you plan to have Spark running in debug mode in the container and run the C# app outside?

Can you try updating the following two lines:

https://github.com/dotnet/spark/blob/ee95ca25aad9d7fd1daeefaa768faa8b49ca1149/src/scala/microsoft-spark-2.4.x/src/main/scala/org/apache/spark/api/dotnet/DotnetBackend.scala#L60

https://github.com/dotnet/spark/blob/46e4effc06c8dd52d9c7cb99f6c9a0377a3b047b/src/csharp/Microsoft.Spark/Network/DefaultSocketWrapper.cs#L32

Then see if that works for your scenario, and let me know?

imback82 avatar Feb 07 '20 04:02 imback82

@imback82 thanks a lot for your reply. The idea is to run .NET for Apache Spark in the container and debug the C# app from the outside, as described in this post: https://3rdman.de/2019/10/debug-net-for-apache-spark-with-visual-studio-and-docker.

indy-3rdman avatar Feb 08 '20 07:02 indy-3rdman

Nice blog post. Since you already have this running with the container, how would specifying the IP address help further?

By the way, were you able to get your scenario working by changing the files above? (You can hard-code the IP and see if it works.)

imback82 avatar Feb 11 '20 01:02 imback82

@indy-3rdman Any update on this?

imback82 avatar Mar 07 '20 04:03 imback82

@imback82 I'm really sorry for the delay. However, it is still on my todo list.

indy-3rdman avatar Mar 07 '20 09:03 indy-3rdman

@imback82 Just built a test image with the two changes and that seems to work fine:

spark/src/csharp/Microsoft.Spark/Network/DefaultSocketWrapper.cs Line 32

_innerSocket.Bind(new IPEndPoint(IPAddress.Any, 0));

spark/src/scala/microsoft-spark-2.4.x/src/main/scala/org/apache/spark/api/dotnet/DotnetBackend.scala Line 60

channelFuture = bootstrap.bind(new InetSocketAddress("0.0.0.0", portNumber))

Not sure where this leaves us, though, as this probably should be configurable, shouldn't it?
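One way this could be made configurable is sketched below. It is only an illustration under stated assumptions: the environment variable name `DOTNET_BACKEND_BIND_ADDRESS` and the helpers `resolveBindAddress` and `bindSocketAddress` are hypothetical and not part of .NET for Apache Spark, and a plain `java.net.ServerSocket` stands in for the real backend so the snippet runs on its own. The sketch falls back to loopback when nothing is configured.

```scala
import java.net.{InetAddress, InetSocketAddress, ServerSocket}

object BindAddressSketch {
  // Hypothetical setting name; not an existing .NET for Apache Spark option.
  val BindAddressEnvVar = "DOTNET_BACKEND_BIND_ADDRESS"

  // Resolve the configured host; fall back to loopback when unset or blank.
  def resolveBindAddress(configured: Option[String]): InetAddress =
    configured.map(_.trim).filter(_.nonEmpty) match {
      case Some(host) => InetAddress.getByName(host) // "0.0.0.0" yields the IPv4 wildcard
      case None       => InetAddress.getLoopbackAddress
    }

  // Combine the resolved address with the port the backend should use.
  def bindSocketAddress(configured: Option[String], port: Int): InetSocketAddress =
    new InetSocketAddress(resolveBindAddress(configured), port)

  def main(args: Array[String]): Unit = {
    // Port 0 lets the OS pick a free port, as in the C# change above.
    val address = bindSocketAddress(sys.env.get(BindAddressEnvVar), 0)
    val server = new ServerSocket() // stand-in for the real backend socket
    try {
      server.bind(address)
      println(s"Listening on ${server.getLocalSocketAddress}")
    } finally server.close()
  }
}
```

In the Scala backend, the resulting `InetSocketAddress` could presumably be passed to `bootstrap.bind(...)` in place of the hard-coded `"0.0.0.0"`. Falling back to loopback means exposing the backend on other interfaces stays opt-in.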

indy-3rdman avatar Mar 09 '20 17:03 indy-3rdman

For work, I'd like to implement a similar scenario to integration-test our Spark jobs. To simplify the development setup, we're running all our components (DB servers, Redis caches, etc.) in a local Docker Compose setup. I'd like to integrate the Dotnet Backend into this setup as well, so that we can easily run the tests in the developers' IDEs as well as in our Azure pipelines without having to go through the hassle of setting up Spark (and all the correct Java versions etc.) locally. But this is not possible, since the backend only listens on the loopback address.

I documented this setup in a repository: https://github.com/moredatapls/dotnet-spark-docker

It also works for me when I implement the changes suggested by @imback82 above.

moredatapls avatar May 25 '20 19:05 moredatapls

@moredatapls, for now I am using port redirection via socat as a workaround in my Docker image. Maybe that would be an option for you as well until this gets implemented.

indy-3rdman avatar May 26 '20 17:05 indy-3rdman

@indy-3rdman / @moredatapls So, what's the recommended way of doing this for the Docker container? Would socat be enough, or would allowing users to override the binding address/port be a better solution?

imback82 avatar May 27 '20 01:05 imback82

@imback82, I think that allowing the user to specify the binding address (including Any/0.0.0.0 to listen on all available addresses) would be much nicer than the current socat hack.

indy-3rdman avatar May 27 '20 16:05 indy-3rdman

@Niharikadutta Do you want to work on this?

imback82 avatar May 27 '20 17:05 imback82

Yes I'll work on this

Niharikadutta avatar May 27 '20 17:05 Niharikadutta

Thanks. I am assigning this to @Niharikadutta.

imback82 avatar May 27 '20 17:05 imback82

@indy-3rdman I agree, letting the user specify the address seems like the much nicer solution. Happy to see the progress here :)

moredatapls avatar May 27 '20 19:05 moredatapls

@indy-3rdman / @moredatapls Thanks for your input. Please feel free to take a look at #537 if you want. Thanks!

imback82 avatar Jun 12 '20 02:06 imback82

Hello. Are there any updates on this feature request and PR #537? We have some restrictions on our environment setup (dev and test), and this feature would be very helpful.

artsiomtserashkovich avatar Nov 19 '20 12:11 artsiomtserashkovich

@Niharikadutta Hello! Any progress here?

shamahov avatar Aug 02 '21 11:08 shamahov

Hi @shamahov, I am looking into this. We need to figure out the security implications. I'll keep you updated, thanks.

Niharikadutta avatar Aug 05 '21 20:08 Niharikadutta

We need this as well. It's super valuable if you are writing tests in Visual Studio on Windows, because you can just bind to the WSL2 address with dynamic port forwarding. However, that only works if the address is bound to 0.0.0.0.

Macromullet avatar Mar 31 '22 03:03 Macromullet