spark-operator
Istio-proxy container starts only after the application container is running; postStart lifecycle hook to sleep is not working.
As the title states, the consequence is that all connections in the driver are refused, which terminates my Spark job.
For various reasons, I am not able to upgrade istio to 1.7 or k8s to 1.8 to get the fix. I am currently using istio 1.5 and k8s 1.17. I tried specifying a postStart lifecycle hook in the driver to exec the command "/bin/bash", "-c", "sleep 30", but it is not working.
I also tried to check whether the postStart hook really runs by setting it to exec the command "/bin/bash", "-c", "abc", expecting that to trigger a "FailedPostStartHook" event. No such event appeared.
May I ask why the postStart lifecycle hook is not working? And is there any workaround for this istio-proxy issue?
Is anyone facing a similar issue? Comments and advice from the community are appreciated. Thank you!
We haven't yet turned on istio in spark pods (we have others with it), but we're looking at scuttle [1] as an interesting workaround for istio sidecars. Also take a look at the success reported in this issue [2], which admittedly is for a different version of Spark than yours, but you might be able to adapt some of it.
[1] https://github.com/redboxllc/scuttle
[2] https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/889#issuecomment-758269368
@jkleckner Thanks for recommending Scuttle, I integrated it as a part of Spark's entrypoint.sh and it works as expected: waits till istio-proxy init, proceeds to bootstrap with driver, profit! :)
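For anyone wondering what the "waits till istio-proxy init" step amounts to, here is a rough sketch in plain bash of that kind of readiness loop. It is not scuttle's actual source: the function name wait_for_envoy and the retry settings are made up for illustration, and it assumes the Envoy admin API is reachable at http://127.0.0.1:15000 and that its /server_info output contains "state": "LIVE" once the proxy is ready.

```shell
#!/usr/bin/env bash
# Sketch only: block until the Envoy sidecar reports LIVE on its admin API.
# wait_for_envoy and its default retry values are hypothetical, not part of scuttle.

wait_for_envoy() {
  local admin_api="${1:-http://127.0.0.1:15000}"  # same value as ENVOY_ADMIN_API in the thread
  local max_attempts="${2:-60}"                   # assumed retry budget
  local delay_seconds="${3:-1}"                   # assumed pause between attempts
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # Assumption: /server_info returns JSON with "state": "LIVE" when Envoy is serving.
    if curl -fsS "${admin_api}/server_info" 2>/dev/null | grep -q '"state": *"LIVE"'; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done

  echo "envoy not ready after ${max_attempts} attempts" >&2
  return 1
}
```

In an entrypoint.sh this would be called just before the final launch line, for example `wait_for_envoy "$ENVOY_ADMIN_API" || exit 1`. It only covers startup ordering; it does nothing about stopping the sidecar when the job ends.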
@singh-abhijeet are you able to share your changes to entrypoint.sh?
Your driver/executor container entrypoint.sh ends with:
exec /usr/bin/tini -s -- "${CMD[@]}"
Just build the image with the scuttle binary and replace that launch with:
exec env ENVOY_ADMIN_API="http://127.0.0.1:15000" ISTIO_QUIT_API="http://127.0.0.1:15020" /usr/bin/scuttle "${CMD[@]}"
(Assuming Istio 1.3+, which ... yeah, should be everyone I should hope.)
AFAICT, scuttle handles the same signals as tini, so it should simply replace it.
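The ISTIO_QUIT_API variable covers the other half of the problem: once the Spark process exits, the sidecar has to be told to stop, or the pod can stay in Running. A hedged sketch of that behaviour in bash follows. The function name run_then_quit_sidecar is hypothetical, and it assumes the Istio agent accepts a POST to /quitquitquit on port 15020; it is an illustration of the idea, not scuttle's implementation.

```shell
#!/usr/bin/env bash
# Sketch only: run the real command, then ask the Istio sidecar to shut down.
# run_then_quit_sidecar is a hypothetical helper name.

run_then_quit_sidecar() {
  local quit_api="$1"   # e.g. http://127.0.0.1:15020, as in ISTIO_QUIT_API above
  shift
  local status=0

  # Run the wrapped command and remember its exit code.
  "$@" || status=$?

  # Assumption: the agent exposes POST /quitquitquit; ignore errors so the
  # job's own exit code is what the container reports.
  curl -fsS -X POST "${quit_api}/quitquitquit" >/dev/null 2>&1 || true

  return "$status"
}
```

Usage would look like `run_then_quit_sidecar "http://127.0.0.1:15020" "${CMD[@]}"`. Note that the command cannot be launched with exec here, because the shell has to survive long enough to send the quit request; that, plus signal forwarding, is the part scuttle takes care of when it replaces tini.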