spec.driver.env/envFrom is not working even when webhook is enabled
I deployed the Spark operator with manifest/spark-operator-with-webhook.yaml, with -enable-webhook=true. However, using env or envFrom does not inject the environment variables into the driver and executor pods.
What's the problem?
The following is how I define env in my SparkApplication YAML. I have tested deploying each of them separately.
Only envVars (which will be deprecated) works.
driver:
  env:
    - name: ENV_TWO
      value: hello
    - name: ENV_TWO
      valueFrom:
        secretKeyRef:
          name: secretenv
          key: TEST
  envFrom:
    - secretRef:
        name: secretenv
  envVars:
    TEST_ENVVARS: test
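For reference, a way to check whether the webhook is registered at all and whether the generated driver pod actually carries the env entries (the namespace and application name below are placeholders for your own values):

# List the operator's mutating webhook registration (the exact name depends on the install):
kubectl get mutatingwebhookconfigurations
# Inspect the env section of the driver pod the operator created:
kubectl -n <namespace> get pod <app-name>-driver -o jsonpath='{.spec.containers[0].env}'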
env is not working for me as well. The SparkApplication config shows it:

driver:
  env:
    - name: ENV1
      value: VAL1

But the env variables are not created inside the pods.
Can anyone help?
I faced the same problem. Is there any solution?
Hi all, I have not encountered this kind of problem, but I am happy to help you troubleshoot what happened.
Create the secret with kubectl create -f spark-secret-env.yaml; the content is as follows:
apiVersion: v1
kind: Secret
metadata:
  name: spark-secret-env
  namespace: szww
type: Opaque
data:
  password: cGFzc3dvcmQK
  username: YWRtaW4=
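For reference, the data values above are base64-encoded and can be produced like this; note that echo without -n appends a trailing newline, which is why cGFzc3dvcmQK decodes to "password" plus a newline:

echo -n admin | base64      # YWRtaW4=
echo password | base64      # cGFzc3dvcmQK (includes the trailing newline)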
Now I create a SparkApplication with the following YAML:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: spark-pi-test-env
namespace: szww
spec:
type: Scala
mode: cluster
image: "gcr.io/spark-operator/spark:v3.0.0"
imagePullPolicy: Always
mainClass: org.apache.spark.examples.SparkPi
mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar"
sparkVersion: "3.0.0"
arguments:
- "10000"
restartPolicy:
type: Never
volumes:
- name: "test-volume"
hostPath:
path: "/tmp"
type: Directory
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
env:
- name: "ENV1"
value: "VAL1"
- name: "USER"
valueFrom:
secretKeyRef:
name: spark-secret-env
key: username
envFrom:
- secretRef:
name: spark-secret-env
labels:
version: 3.0.0
serviceAccount: spark
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
executor:
cores: 1
instances: 1
memory: "512m"
env:
- name: "ENV1"
value: "VAL1"
- name: "USER"
valueFrom:
secretKeyRef:
name: spark-secret-env
key: username
envFrom:
- secretRef:
name: spark-secret-env
labels:
version: 3.0.0
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
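The application can be created the same way as the secret (the file name spark-pi-test-env.yaml is just what I use here as an example):

kubectl create -f spark-pi-test-env.yaml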
Then I run the following command to enter the container, in order to see if the environment variables were injected.
# kubectl -n szww exec -it spark-pi-test-env-driver bash
I entered the container and was able to see the environment variables:
185@spark-pi-test-env-driver:~/work-dir$ echo $ENV1
VAL1
185@spark-pi-test-env-driver:~/work-dir$ echo $USER
admin
185@spark-pi-test-env-driver:~/work-dir$ echo $username
admin
185@spark-pi-test-env-driver:~/work-dir$ echo $password
password
@kz33 I tried your approach and the environment variables are still not making their way into the containers.
@nooshin-mirzadeh @shinen @sakshi-bansal
Is this still a valid issue with you?
I tested on my end, and even with the webhook enabled the environment variables are not injected.
This is now working. Just make sure you update to the latest Helm chart.
Hm, I am still hitting this on the latest helm chart.
I am having the same problem with chart version 1.0.7, running on EKS 1.18. As a workaround, I set the variables directly in sparkConf:
spec:
  sparkConf:
    "spark.kubernetes.driverEnv.[EnvironmentVariableName]": "value"
env and envFrom were not working with chart 1.0.7, operator v1beta2-1.2.0-3.0.0, Kubernetes version v1.19.7.
It seems like an issue with Kubernetes v1.19, which is built using Go 1.15.
Installed with:
helm upgrade --install --version 1.0.7 sparkoperator spark-operator/spark-operator \
--namespace spark-operator --set sparkJobNamespace=spark-apps,webhook.enable=true \
--set image.tag=v1beta2-1.2.0-3.0.0
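The webhook service that the API server complains about below can be checked with the command here (the names follow from this Helm release name and may differ for other installs):

kubectl -n spark-operator get svc sparkoperator-spark-operator-webhook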
From the API server logs (v1.19.7):
W0323 17:54:05.005825 1 dispatcher.go:170] Failed calling webhook, failing open webhook.sparkoperator.k8s.io: failed calling webhook "webhook.sparkoperator.k8s.io": Post "https://sparkoperator-spark-operator-webhook.spark-operator.svc:443/webhook?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
E0323 17:54:05.005857 1 dispatcher.go:171] failed calling webhook "webhook.sparkoperator.k8s.io": Post "https://sparkoperator-spark-operator-webhook.spark-operator.svc:443/webhook?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
I0323 17:54:08.140788 1 client.go:360] parsed scheme: "passthrough"
I0323 17:54:08.140836 1 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{https://10.157.149.99:2379 <nil> 0 <nil>}] <nil> <nil>}
I0323 17:54:08.140846 1 clientconn.go:948] ClientConn switching balancer to "pick_first"
W0323 17:54:13.251479 1 dispatcher.go:170] Failed calling webhook, failing open webhook.sparkoperator.k8s.io: failed calling webhook "webhook.sparkoperator.k8s.io": Post "https://sparkoperator-spark-operator-webhook.spark-operator.svc:443/webhook?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
E0323 17:54:13.251512 1 dispatcher.go:171] failed calling webhook "webhook.sparkoperator.k8s.io": Post "https://sparkoperator-spark-operator-webhook.spark-operator.svc:443/webhook?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
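If you hit the same x509 error, one way to confirm whether the webhook's serving certificate carries SANs is to decode it from the certs secret; this assumes the operator stores its certs in a secret named spark-webhook-certs with a server-cert.pem key, which may differ in your install:

kubectl -n spark-operator get secret spark-webhook-certs \
  -o jsonpath='{.data.server-cert\.pem}' | base64 -d \
  | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'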
Operator v1beta2-1.2.2-3.0.0 seems to have fixed the above error with Kubernetes v1.19.7 through https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/1027
EKS 1.18 is also not working for me.
I spent many hours trying to troubleshoot this issue today. I posted my findings here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1229#issuecomment-827896078
TL;DR: While I could not get the env or envFrom methods to work, I was able to get unblocked (for now) using envSecretKeyRefs.
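For anyone else landing here, a minimal sketch of that envSecretKeyRefs workaround against the v1beta2 spec (it reuses the spark-secret-env secret from the example above; double-check the field names against your operator version):

driver:
  envSecretKeyRefs:
    USER:
      name: spark-secret-env
      key: username
executor:
  envSecretKeyRefs:
    USER:
      name: spark-secret-env
      key: username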
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.