spark-operator

Connecting Spark Shell to standalone SparkCluster with Openshift/Minishift has different instructions

reynoldsm88 opened this issue on Mar 07, 2019 · 10 comments

Overview

After deploying the default my-spark-cluster I found that the process for connecting a local spark-shell to the cluster was different when using Openshift/Minishift. The steps I followed might not be the best way to do it, so I'm looking for feedback there.

Description:

I'm using Openshift/Minishift and I couldn't use the exact commands documented in the README, since minikube service is not available to get the Spark master URL. I played around with it until I eventually came up with the solution documented below. What I'm curious about is this: after I expose the route, why can't I just use that URL as-is? Why do I still have to use the NodePort mapping from the svc?

Steps to reproduce:

oc apply -f manifest/operator.yaml
oc apply -f examples/cluster.yaml
oc expose rc my-spark-cluster-m --type=NodePort
oc expose svc my-spark-cluster-m

oc get svc my-spark-cluster-m
NAME                 TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
my-spark-cluster-m   NodePort   172.30.100.112   <none>        7077:32610/TCP,8080:30338/TCP   4m

oc get route my-spark-cluster-m
NAME                 HOST/PORT                                        PATH      SERVICES             PORT      TERMINATION   WILDCARD
my-spark-cluster-m   my-spark-cluster-m-default.192.168.64.4.nip.io             my-spark-cluster-m   port-1                  None


# Spark master URL is <route host>:<service NodePort>
bin/spark-shell --master spark://my-spark-cluster-m-default.192.168.64.4.nip.io:32610
# SUCCESS!!!
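In other words, the master URL is assembled from the route host and the NodePort that maps to 7077. A minimal sketch, using the values from the `oc get route` / `oc get svc` output above:

```shell
# Compose the Spark master URL from the route host and the service NodePort
# (values taken from the example output above; adjust for your cluster)
MASTER_HOST=my-spark-cluster-m-default.192.168.64.4.nip.io
NODE_PORT=32610   # the NodePort mapped to 7077 in the svc output
MASTER_URL="spark://${MASTER_HOST}:${NODE_PORT}"
echo "${MASTER_URL}"
# then: bin/spark-shell --master "${MASTER_URL}"
```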

reynoldsm88 avatar Mar 07 '19 23:03 reynoldsm88

I also hit this issue. My understanding is that routes in OpenShift are http(s)- or websocket-only, so they can't be used to expose a non-http protocol like Spark's. An Ingress resource is pretty similar to a route, so again http only, although there are workarounds.

What people are doing for debug purposes is:

kubectl/oc port-forward my-spark-cluster-m-fbqql 7077:7077

..and then using localhost:7077 as the Spark master

However, I had also some troubles with this on my Fedora.

What can also work for quick debug (but is quite hacky) is using

kubectl exec sparky-cluster-m-fbqql -ti -- /bin/sh
/opt/spark/bin/spark-shell

The cleanest way is to deploy another container to the same namespace (running spark-shell, a Jupyter notebook, a script that calls spark-submit, or any other form of Spark driver application) and access the Spark master using the service DNS name in K8s.
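For example, from inside the cluster the master is reachable through the service's DNS name. A sketch, where the service name and namespace are assumptions based on the example above:

```shell
# In-cluster address of the Spark master, via the Kubernetes service DNS name.
# Service name and namespace are assumptions based on the example above.
SVC_NAME=my-spark-cluster-m
NAMESPACE=default
IN_CLUSTER_MASTER="spark://${SVC_NAME}.${NAMESPACE}.svc:7077"
echo "${IN_CLUSTER_MASTER}"
# A driver pod in the same namespace could then run, e.g.:
#   /opt/spark/bin/spark-shell --master "${IN_CLUSTER_MASTER}"
```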

jkremser avatar Mar 08 '19 08:03 jkremser

in your example, when you run

oc expose svc my-spark-cluster-m

it just exposes the route, which is http-based => useless for the Spark protocol

however, the command before

oc expose rc my-spark-cluster-m --type=NodePort

will work in an oc cluster up environment (I haven't tried it with minishift)

this was the output for you:

oc get svc my-spark-cluster-m
NAME                 TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
my-spark-cluster-m   NodePort   172.30.100.112   <none>        7077:32610/TCP,8080:30338/TCP   4m

I think you should be able to check if Spark is listening:

telnet localhost 32610  # this works in an oc cluster up env.; with minishift, there could be some IP instead of localhost
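If telnet isn't handy, here is a rough equivalent using bash's /dev/tcp; the NodePort value and the `minishift ip` fallback are assumptions based on the outputs above:

```shell
# Probe the Spark master NodePort without telnet, using bash's /dev/tcp.
# Falls back to localhost when `minishift ip` is unavailable (oc cluster up case).
NODE_IP="$(minishift ip 2>/dev/null || echo localhost)"
NODE_PORT=32610
if timeout 2 bash -c "exec 3<>/dev/tcp/${NODE_IP}/${NODE_PORT}" 2>/dev/null; then
  echo "spark master reachable at ${NODE_IP}:${NODE_PORT}"
else
  echo "spark master not reachable at ${NODE_IP}:${NODE_PORT}"
fi
```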

jkremser avatar Mar 08 '19 08:03 jkremser

@Jiri-Kremser thanks for the reply. I hadn't made the connection that Openshift routes are http(s) only; that makes a bit more sense now. I did try what you suggested by connecting to the IP + port directly, but I think there are probably further complications because I'm running Minishift on a Mac host. Since everything runs inside a VM, I've always had interesting networking issues when dealing with Docker/k8s/Openshift on a Mac.

reynoldsm88 avatar Mar 08 '19 13:03 reynoldsm88

@Jiri-Kremser just tidying up my GH for the weekend. Do you want me to throw something in the README about the Openshift specific instructions?

reynoldsm88 avatar Mar 10 '19 23:03 reynoldsm88

I was actually thinking about introducing a new doc file, something like (docs/)?getting_started(_openshift)?.md, and putting it there (including the last 3 sections). I like keeping the readme.md as short as possible; it should attract users, and I hate listing all the edge cases and issues there. @elmiko wdyt?

jkremser avatar Mar 11 '19 15:03 jkremser

i think creating a new doc structure in a docs directory is a fine idea. i think we could start to restructure the readme and create a few docs, this would help keep the readme clean and provide a clear path to find specific info.

elmiko avatar Mar 11 '19 15:03 elmiko

Hey sorry got busy with my day job :)...

I can look into doing something like this sometime this week once I clear out my backlog a little bit.

reynoldsm88 avatar Mar 12 '19 16:03 reynoldsm88

Hey @Jiri-Kremser, do you want me to just create a docs folder with some OCP specific instructions? Or do we want to go full asciidoc like the main site?

Also, I think some of the peculiarity of what I needed to do to connect might be because I'm running Minishift/Openshift on macOS. I remember that plain Docker was weird to run on a Mac because it actually ran in a VM and had a whole bunch of weird networking issues.

reynoldsm88 avatar Mar 17 '19 22:03 reynoldsm88

what about docs/troubleshooting.md or docs/getting_started_openshift.md?

jkremser avatar Mar 18 '19 10:03 jkremser

Perfect, thanks. I'm going to verify on a Linux VM beforehand to see if having to both expose the route AND use the NodePort is a Mac-only issue. I was not able to get the port-forwarding solution you suggested to work, which is kind of mysterious to me. I suspect it's down to the way Docker/Openshift runs on Macs.

reynoldsm88 avatar Mar 18 '19 15:03 reynoldsm88