
Istio-proxy container starts only after the application container is running. postStart lifecycle hook running a sleep command is not working.

Open shinen opened this issue 4 years ago • 6 comments

As the title states, the consequence is that all connections made by the driver are refused, which terminates my Spark job.

For various reasons, I am not able to upgrade Istio to 1.7 or k8s to 1.8 to get a fix. I am currently using Istio 1.5 and k8s 1.17. I tried to specify a postStart lifecycle hook in the driver to exec the command "/bin/bash", "-c", "sleep 30", but it is not working.

I also tried to check whether the postStart lifecycle hook is really running by setting it to exec the command "/bin/bash", "-c", "abc", hoping to trigger a "FailedPostStartHook" event. No such event appeared.

May I ask why the postStart lifecycle hook is not working? And is there any workaround for this istio-proxy issue?

shinen avatar Jan 21 '21 10:01 shinen

Is anyone facing a similar issue? Comments and advice from the community are appreciated! Thank you.

shinen avatar Jan 22 '21 00:01 shinen

We haven't yet turned on Istio in Spark pods (we have others with it), but we're looking at scuttle [1] as an interesting workaround for Istio sidecars. Also take a look at some success reported in this issue [2], which admittedly is for a different version of Spark than yours, but you might be able to adapt some of it.

[1] https://github.com/redboxllc/scuttle

[2] https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/889#issuecomment-758269368

jkleckner avatar Jan 23 '21 19:01 jkleckner

@jkleckner Thanks for recommending Scuttle, I integrated it as a part of Spark's entrypoint.sh and it works as expected: waits till istio-proxy init, proceeds to bootstrap with driver, profit! :)
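For readers who want to see what that wait amounts to, here is a rough shell sketch of the same flow. It is not scuttle's code. It assumes the Envoy admin endpoint answers on `/server_info` with a `"state": "LIVE"` field once the proxy is ready, and that the Istio agent accepts a POST to `/quitquitquit`. The function names `wait_for_envoy` and `stop_sidecar` are made up for illustration, and the default URLs are the ones quoted elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: wait for the sidecar, and ask it to quit afterwards.
# Function names are hypothetical; endpoint paths are assumptions about Envoy/Istio.

ENVOY_ADMIN_API="${ENVOY_ADMIN_API:-http://127.0.0.1:15000}"
ISTIO_QUIT_API="${ISTIO_QUIT_API:-http://127.0.0.1:15020}"

# Poll the Envoy admin API until it reports state LIVE.
# $1 = maximum number of attempts (default 60)
# $2 = seconds to sleep between attempts (default 1)
# Returns 0 once the proxy is LIVE, 1 if every attempt failed.
wait_for_envoy() {
  local attempts="${1:-60}"
  local delay="${2:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS "${ENVOY_ADMIN_API}/server_info" 2>/dev/null \
        | grep -q '"state": *"LIVE"'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Ask the sidecar to exit so that the pod can complete once the job is done.
# Errors are ignored on purpose: the sidecar may already be gone.
stop_sidecar() {
  curl -fsS -X POST "${ISTIO_QUIT_API}/quitquitquit" >/dev/null 2>&1 || true
}
```

A wrapper could then run `wait_for_envoy && "${CMD[@]}"`, save the exit status, and call `stop_sidecar` before exiting with that status. Unlike scuttle or tini, a plain wrapper like this does not forward signals to the child process.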

singh-abhijeet avatar Mar 31 '22 14:03 singh-abhijeet

> @jkleckner Thanks for recommending Scuttle, I integrated it as a part of Spark's entrypoint.sh and it works as expected: waits till istio-proxy init, proceeds to bootstrap with driver, profit! :)

@singh-abhijeet are you able to share your changes to entrypoint.sh?

aodj avatar Sep 13 '22 11:09 aodj

Your driver/executor container entrypoint.sh ends with:

exec /usr/bin/tini -s -- "${CMD[@]}"

Just build the image with the scuttle binary and replace that launch with:

exec env ENVOY_ADMIN_API="http://127.0.0.1:15000" ISTIO_QUIT_API="http://127.0.0.1:15020" /usr/bin/scuttle "${CMD[@]}"

(Assuming Istio 1.3+, which ... yeah, should be everyone I should hope.)

AFAICT, scuttle handles the same signals as tini, so it should work as a drop-in replacement.
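If the same image also has to run in pods without a sidecar, the swap above could be made conditional. A minimal sketch, assuming both binaries live under `/usr/bin`: `launch`, `SCUTTLE_BIN`, `TINI_BIN` and the `LAUNCH_EXEC` dry-run hook are hypothetical names introduced here, not part of the stock entrypoint.sh.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: choose scuttle when it is installed, else fall back to tini.
# All variable and function names below are hypothetical.

SCUTTLE_BIN="${SCUTTLE_BIN:-/usr/bin/scuttle}"
TINI_BIN="${TINI_BIN:-/usr/bin/tini}"

# Prefix the given command with the launcher and exec it.
# Setting LAUNCH_EXEC=echo prints the final command line instead of running it.
launch() {
  if [ -x "$SCUTTLE_BIN" ]; then
    # scuttle reads these two variables to find the sidecar's endpoints.
    set -- env \
      "ENVOY_ADMIN_API=${ENVOY_ADMIN_API:-http://127.0.0.1:15000}" \
      "ISTIO_QUIT_API=${ISTIO_QUIT_API:-http://127.0.0.1:15020}" \
      "$SCUTTLE_BIN" "$@"
  else
    # No scuttle in the image: keep the original tini launch.
    set -- "$TINI_BIN" -s -- "$@"
  fi
  ${LAUNCH_EXEC:-exec} "$@"
}
```

The last line of entrypoint.sh would then become `launch "${CMD[@]}"`. Running it with `LAUNCH_EXEC=echo` is a quick way to check which launcher the image would pick.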

Cerebus avatar Jun 22 '23 19:06 Cerebus