spark-operator
Certificates are generated by the operator rather than gencerts.sh
Purpose of this PR
Close #1959
Proposed changes:
- `hack/gencerts.sh` will no longer be used to generate certificates; the operator is now responsible for generating the CA certificate and the server certificate
- delete `webhook-init-job.yaml`, since the webhook secret will be created by Helm and updated by the Spark operator
- delete `webhook-cleanup-job.yaml`, since the webhook secret will be deleted by Helm
- Spark operator RBAC resources are managed by Helm rather than by Helm hooks, since there is no webhook init job anymore
- update Dockerfile
Change Category
Indicate the type of change by marking the applicable boxes:
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] Feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that could affect existing functionality)
- [x] Documentation update
Rationale
It would be better if the Spark operator RBAC resources and webhook secrets were managed by Helm directly rather than by Helm hooks.
Checklist
Before submitting your PR, please review the following:
- [x] I have conducted a self-review of my own code.
- [x] I have updated documentation accordingly.
- [x] I have added tests that prove my changes are effective or that my feature works.
- [x] Existing unit tests pass locally with my changes.
Additional Notes
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: (no approvers yet). Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
/assign @vara-bonthu
@vara-bonthu Could you review this PR, thanks!
@yuchaoran2011 Could you review this PR, thanks!
@ChenYi015 Could you resolve the merge conflicts?
@yuchaoran2011 Rebased and force-pushed.
@vara-bonthu I have updated the related docs and ran e2e tests as follows:
- Create a kind config `kind-config.yaml`:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- Create a kind cluster:
kind create cluster --config kind-config.yaml
- Build the Docker image and load it into the kind cluster:
docker build -t docker.io/kubeflow/spark-operator:local .
kind load docker-image docker.io/kubeflow/spark-operator:local
- Install the helm chart with webhook enabled:
helm install spark-operator charts/spark-operator-chart \
--namespace spark-operator \
--create-namespace \
--set image.tag=local \
--set webhook.enable=true \
--set enforceQuotaEnforcement=true \
--set 'sparkJobNamespaces[0]=default'
- Inspect the webhook secret to verify that the private keys and certificates are populated correctly:
$ kubectl get secret -n spark-operator -o yaml spark-operator-webhook-certs
apiVersion: v1
data:
ca-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURxakNDQXBLZ0F3SUJBZ0lJUU9zWWwzdDhEaVl3RFFZSktvWklodmNOQVFFTEJRQXdVVEVYTUJVR0ExVUUKQ2hNT2MzQmhjbXN0YjNCbGNtRjBiM0l4TmpBMEJnTlZCQU1UTFhOd1lYSnJMVzl3WlhKaGRHOXlMWGRsWW1odgpiMnN0YzNaakxuTndZWEpyTFc5d1pYSmhkRzl5TG5OMll6QWVGdzB5TkRBMk1EVXdNekk0TVRkYUZ3MHpOREEyCk1EVXdNekk0TVRkYU1GRXhGekFWQmdOVkJBb1REbk53WVhKckxXOXdaWEpoZEc5eU1UWXdOQVlEVlFRREV5MXoKY0dGeWF5MXZjR1Z5WVhSdmNpMTNaV0pvYjI5ckxYTjJZeTV6Y0dGeWF5MXZjR1Z5WVhSdmNpNXpkbU13Z2dFaQpNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUM4cit5aUZ5eENFOEx4a0lCaitmL3lUcjBVClVtSmZKU2JINHI3L2V4NU16VFRFVzFVYS84MW42VnhrNTRpZlh1YWxMMDUwb1cyMlFLWndGMnJrWGRNVmlwUTgKRlY4cUlWb214M045MWNUajUvUnlEcmdPTUhhZVVJK3ltT0xteWZxUklSQVFXdjluaWxwUGdCOTIybVZPaE5CcAptQ3UxK1dGTVJReHhtZkw1TUkwcFVJaEROSkdCdHl3SUtUbWREQSs3NkRORS9pMzkrSWNoVHJaWTJkZG1WSnEwCjNLVTIxdS93TXVzYWo4S05oVlZpUlRZbTYxVk5rR0t4YlNIdFprZFlLMlJLbmcxR08wMGNaTktYTDM3SjVjek8KcGRhWjFEekliNElCUDJCdE1Nb0s3WW5pREtxRnhaTVZ6VHJoL2J5aVByNHJyYko4em1Eamd2dytrdXh0QWdNQgpBQUdqZ1lVd2dZSXdEZ1lEVlIwUEFRSC9CQVFEQWdYZ01CMEdBMVVkSlFRV01CUUdDQ3NHQVFVRkJ3TUJCZ2dyCkJnRUZCUWNEQWpBTUJnTlZIUk1CQWY4RUFqQUFNRU1HQTFVZEVRUThNRHFDQ1d4dlkyRnNhRzl6ZElJdGMzQmgKY21zdGIzQmxjbUYwYjNJdGQyVmlhRzl2YXkxemRtTXVjM0JoY21zdGIzQmxjbUYwYjNJdWMzWmpNQTBHQ1NxRwpTSWIzRFFFQkN3VUFBNElCQVFBYThIdWxYSkt2RlBCRFVHeTNpMGtqcklmcGIxdG9sa1FReU16YUJ5MFVWTE92CjVOWkpOcUZOYkRxWFpGV1VZYnorY1FDWUJpYmJiWW9mZTg3Z0Q3Vi9MeUJ3WGxvbXRGQmg4Njl3Yk5SUExEb2gKenJBREFtdkZQSWRtNURHZkRlT1lxdldObEU2S2Z6NUh1bldkNkNKdlovRDRKbG5xeWVKTGJNamg0dVZuWHA2NQprNkJLUGtYSC8zK3pFTEEzNnFLcWFwa1FXb3J5dExnWUNxdGdhRmp4OWFndjlyUnI1ZFQrVmRGYk0ySTdwU1d5CnZCK0Jpcmt3T0xVUk55MWhCQkpHdjM1UFVucVVIN2FqWHF1YytqL1N2NHo5OEp5WmIrb1lqL252dFZyc2Npd2wKYUNvZUV1ZTE4VGxYTklKTEJ2T0pTajEzVFNKSVNvR3ZwZnFEcHdrOQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
ca-key.pem: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBNDFBY2l2SGNLQzk5RkJCZDBqdHRTQjdSb3FhVmY1cHM0ZjI2WDhvclQ4Y2dscCtuCjdwckFaeXc1YVFWSXhmSFphNTdlUmMxUVFtVGVTaUtNTE5Nc1h2VWxCWmMyb3Y1M3U3RjN6MGN4TVhsYU5xUEsKSkZZanJpd2lhYlBZdGR3aHA5R3F5RUpPdmdOempzVldBUElBWGkwUW13dnVKeUJOZTMrVFd1djJ0MnhSVmNyNApyYjFtOVFrNHhsL0pIMTI1OTJoQ2hyQTNFNG83em9Zd0cxUWF4RDFDdXNWWG1FcjhoQmowdkxqTDFhY0VOS2poCk9yVE9HQzdUWStrTFlYUU9LdmN4aCtOYmE3bGJGbWN6dHErSkpHU0FyQVpYWGhIUTNkSFNEM09qa3o5V00vRVkKaXhxRTRmRmd0TTh1QWlZTjdtaHF0MmZCTFA4SmpWK1BURWdhRHdJREFRQUJBb0lCQUNFcVhSL0FyaGlHNVQ3NgpMRll5S1gydVVYUGp6a2d4NWRVTFNoZ1R6VUgwa2NLb1JMNUJnZlVMdE15bjRyaE8weVFxcDgrVFp6Um90eTRsCjRFSGlCY1ZOQ3p2SGxrY3R6WlpyREVvSDN4dVMweURKd1FLUU51Q0F1L3lrS3VoTjEvTStXaWFoMWc5UFBac0YKRzhsRGhkNDN3UVorTlI4c1RXSEplVng0dFNTSnVPUTZ6WjRCQnJoMGR5V1hSQVcxZFYxVWV4dzh5RnoxTVFDagpZNGJKY0NKd21sRFFwZTM1NmREZWI5WlBSaVJUQjhadXFNS2FlakVkc1dZRWJhdVMvanhBcThNc0NkWkszMFZmCkdiSlcyT1dzYTJKZzEycnhmL1RsenBPd3JGS2RLZmljWGRwcGVxWlMzbDAwNmcrM3J0Q3VGK2s1aWpDR21nYTIKUHVNaXZNRUNnWUVBN0pzRndYZ0Z6UHhQNUQrazdLR05SWThnaVVZZVRKd0dZb0tyMlZyTEFNWU5DY2gzNzc5bgpwRlN1MUxCY1VHN0xlUDRxWGxhSWV1MVBYN2h4bi91anFYUW40OUc5NmN5TkNCbjBWRUtLS3hhREpoWVlxYitDCnBJeFRWa1JzYWR4L0xMWTRPSXUxQnpiYkJuV3BLV1FLeHh3VFNheGJSbWgwQSttamhuZXVweEVDZ1lFQTlmSVgKUUVwNXZMeUIzakNRaEcvYVZQU0U5N2hvUGRnRCs5WnE2Qjl5RFdtOXVPOXdUQmdUWVZJTk5JbEFyaEQrUFEwWAprTG5KSTBGZGJodmMxN2hCaWltV0s1dDhiWXNnMHRpQ2lIVW1GWHlGN0htbWhhQjJXNGpkd1RPRDRPYTlydTZXCktrd3l3VTBnMURpSE9Pb1lHbXhBT3dmbmZ0WGUyL0k1Z0hlWjd4OENnWUVBMWQ4clRMNlpQN215M2JkSjlUdnkKM3pXSlM0eStSckdpYzlsNlRYYnNtVDVzK3JMaTl5d2xHejRRNnVDZ0VYU1ZLRUZYT3Y4dFR6REQxdHA2bXdwegozZkRKUGYyUmxZejR6cUhuWVdMa1VoNS9YaVlMRlNXdmlkM3VWc1J5Mng0ZE51VmYzSDBzbmVEUUN2N0FjbEdrCkRHY3NhQ1FNUFpDZGpndmJiT2t5VG9FQ2dZRUFxZFAvWmkrSEhHSjJzcnlLTGtrbVZCOThhYW4yb1MyMm9vR08KMUxaU0JSME5HdFNMa0ovWFVnNWNlL2lDcHkrb3Z2TjVZRUJKdVlSN1JYc0w1aEdmZ0EzeldpMUZvRWEvNVpnSAptcjU2QzhBdW9mbm1tTU1TdDJZczZpbnVXTEE4THIwbENCUVJ3QlRJSklMY0xOckl4Z1lWM0MwN0Z3UUxuWWtIClY4UStrVFVDZ1lCWnBvYlBFMGpaY0UvZXVlQzNm
cUlTem90NmRGejNpQmNuMjFqZGxBQWNrNUZGWStrNVViaGcKSHcvNFBTRkMza1ZZTDFCemZoS3huMDMrN2ozazhRdE9MRGRLSy9xMm44WUpnUjhoSzQ2eVljODJoWkM0TWk1QQpCM3RkRFZLRHVoNldtUUg4cCtXRkZiZng5RnQ5SWR3TkJndjE0a3pPd2RnS0d6ZVZHdVgrd0E9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
server-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURxakNDQXBLZ0F3SUJBZ0lJUU9zWWwzdDhEaVl3RFFZSktvWklodmNOQVFFTEJRQXdVVEVYTUJVR0ExVUUKQ2hNT2MzQmhjbXN0YjNCbGNtRjBiM0l4TmpBMEJnTlZCQU1UTFhOd1lYSnJMVzl3WlhKaGRHOXlMWGRsWW1odgpiMnN0YzNaakxuTndZWEpyTFc5d1pYSmhkRzl5TG5OMll6QWVGdzB5TkRBMk1EVXdNekk0TVRkYUZ3MHpOREEyCk1EVXdNekk0TVRkYU1GRXhGekFWQmdOVkJBb1REbk53WVhKckxXOXdaWEpoZEc5eU1UWXdOQVlEVlFRREV5MXoKY0dGeWF5MXZjR1Z5WVhSdmNpMTNaV0pvYjI5ckxYTjJZeTV6Y0dGeWF5MXZjR1Z5WVhSdmNpNXpkbU13Z2dFaQpNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUM4cit5aUZ5eENFOEx4a0lCaitmL3lUcjBVClVtSmZKU2JINHI3L2V4NU16VFRFVzFVYS84MW42VnhrNTRpZlh1YWxMMDUwb1cyMlFLWndGMnJrWGRNVmlwUTgKRlY4cUlWb214M045MWNUajUvUnlEcmdPTUhhZVVJK3ltT0xteWZxUklSQVFXdjluaWxwUGdCOTIybVZPaE5CcAptQ3UxK1dGTVJReHhtZkw1TUkwcFVJaEROSkdCdHl3SUtUbWREQSs3NkRORS9pMzkrSWNoVHJaWTJkZG1WSnEwCjNLVTIxdS93TXVzYWo4S05oVlZpUlRZbTYxVk5rR0t4YlNIdFprZFlLMlJLbmcxR08wMGNaTktYTDM3SjVjek8KcGRhWjFEekliNElCUDJCdE1Nb0s3WW5pREtxRnhaTVZ6VHJoL2J5aVByNHJyYko4em1Eamd2dytrdXh0QWdNQgpBQUdqZ1lVd2dZSXdEZ1lEVlIwUEFRSC9CQVFEQWdYZ01CMEdBMVVkSlFRV01CUUdDQ3NHQVFVRkJ3TUJCZ2dyCkJnRUZCUWNEQWpBTUJnTlZIUk1CQWY4RUFqQUFNRU1HQTFVZEVRUThNRHFDQ1d4dlkyRnNhRzl6ZElJdGMzQmgKY21zdGIzQmxjbUYwYjNJdGQyVmlhRzl2YXkxemRtTXVjM0JoY21zdGIzQmxjbUYwYjNJdWMzWmpNQTBHQ1NxRwpTSWIzRFFFQkN3VUFBNElCQVFBYThIdWxYSkt2RlBCRFVHeTNpMGtqcklmcGIxdG9sa1FReU16YUJ5MFVWTE92CjVOWkpOcUZOYkRxWFpGV1VZYnorY1FDWUJpYmJiWW9mZTg3Z0Q3Vi9MeUJ3WGxvbXRGQmg4Njl3Yk5SUExEb2gKenJBREFtdkZQSWRtNURHZkRlT1lxdldObEU2S2Z6NUh1bldkNkNKdlovRDRKbG5xeWVKTGJNamg0dVZuWHA2NQprNkJLUGtYSC8zK3pFTEEzNnFLcWFwa1FXb3J5dExnWUNxdGdhRmp4OWFndjlyUnI1ZFQrVmRGYk0ySTdwU1d5CnZCK0Jpcmt3T0xVUk55MWhCQkpHdjM1UFVucVVIN2FqWHF1YytqL1N2NHo5OEp5WmIrb1lqL252dFZyc2Npd2wKYUNvZUV1ZTE4VGxYTklKTEJ2T0pTajEzVFNKSVNvR3ZwZnFEcHdrOQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server-key.pem: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBdksvc29oY3NRaFBDOFpDQVkvbi84azY5RkZKaVh5VW14K0srLzNzZVRNMDB4RnRWCkd2L05aK2xjWk9lSW4xN21wUzlPZEtGdHRrQ21jQmRxNUYzVEZZcVVQQlZmS2lGYUpzZHpmZFhFNCtmMGNnNjQKRGpCMm5sQ1BzcGppNXNuNmtTRVFFRnIvWjRwYVQ0QWZkdHBsVG9UUWFaZ3J0ZmxoVEVVTWNabnkrVENOS1ZDSQpRelNSZ2Jjc0NDazVuUXdQdStnelJQNHQvZmlISVU2MldOblhabFNhdE55bE50YnY4RExyR28vQ2pZVlZZa1UyCkp1dFZUWkJpc1cwaDdXWkhXQ3RrU3A0TlJqdE5IR1RTbHk5K3llWE16cVhXbWRROHlHK0NBVDlnYlRES0N1MkoKNGd5cWhjV1RGYzA2NGYyOG9qNitLNjJ5Zk01ZzQ0TDhQcExzYlFJREFRQUJBb0lCQVFDNEFqaWF1azZIQWc2UwoxWURmL3VZRHY1WFZRNko3ZHhlaXh4WE13SnlEK1hzRUlxMlVidkk1Ni9JVzFWVC9WdVZISWlNNHlsVGI3NkJnCm4vVzJUMm1URUZvUFhpZzRSZDVOQXlVMkNrckFsMnhqN3NhL3o3TmVJT0tDSVdibCt3TklsUjI5VllETjBMYlIKNFBqT1I1MlVQU0dpV0t3SUF2TklGZTVVdXZXZzNIQ0xDTmlRTEMzM0VBOVo4MmNwSVoxdkUrOG1rbGtteXdOcwpQczJvUGQ5eVNTZUpUalU4clM2MDBxVjRCTTdPcTcwYjBMKzdyRjQ4OTMwaTFaMis4MjcxanFMK2FUWGI2elR1CjRuQ0pIczNyRUpsZ0tqQUl5UDlaMVF3MmphcEsxUlFvc3V6V2I0aUcwcjFDOWwwSzJBaW1PdDhYTkMvcUtSdk0KQm85Z0VRSmhBb0dCQU1MdlFIdk03MmxITDVtTWp2eE9FOFJoVk9maHJUQWNtcmMvRDIxU1JTZ3VhNFA1K1NjTwo1aFJBT0NWWUpleWo0aUhSVTI1OU00aUV1aGowZXdhZ3U2a29xN3g3QXZ3TjNxdEZIVHUvWnJsVFhpdE9WaEt6Clo3OVZjNXZHMm80SjUvRUppNm5VamRQeXZ5Mm1xWHp2aE92MmJJeVJQcTBUbnZFWWYwM21JUXZEQW9HQkFQZkwKcWxPUVRZOWx3OUpHUlNoMk1GQW9XMXBIUlpYTDhjQW5NblhEVnBXVkg5QVFJb2VBVnpteVdHMllkV2lBRXZ6dQp6MkhueFZqUHpMc0JLQkVHN3hLSWx6emNGZ2tIUkIxMVl6YXFGNUpRTGROYjFNZUNXSmNTdEx5L3FGWFVvSDRDCldPbXBGWHRCWFJkUjVxSkp6Yi83c1Z2OVBwTDB3OTk2WWRBWDZCUVBBb0dCQUpWdWtORVdqYVQzdy82Q2FJM3oKVUdYbmN3MzZ5eWVwbGRUSmk0cnpXVDV2TDA1Ump2U3BFQ2tQL2JwcTgwK1BaZWNrcno5d3pOTm5ZNzJEbE5mRQoyWGJZVGFaRDZrck1XeGlSOTlINGJNZStwOTZzdzRDOGROaVFxZm9ObXpidFV4ZE1pUHJjalFpZitud0ZXY0lEClhyTUFDY0JNQzI3a0xxQ0ZkZm1DWTJ5L0FvR0JBT3ZUTFllWHB1alkvZE5Kd3htdDJXNy82V2p5dVh2RmU0N1cKL3dQcVlxVzdKV3FiWUhFNnFFaWx2ZGlYcHUxTUxrWC9kT2lGYm1DR2F4NXlERktnR2JpMnU5QlUyTGZBN1lkbgpwNE5udjBVay8yZk9WcU9GSHBDd1ljZmNVdlZVaFdWSEVKMVhxTFVEMFBlWG4zcEY2UVZVSVVnZHJJYXBZUngzCldVMTA0dzdyQW9HQVhCT21BaU9HRStTaHo0
SG5aZmo4ZEVKdnNXcjN2OTh2OFpaK3VyYmQyRUFVNUwyeENHdGUKYzlCM1BLZXFOTFhzWkZDaER3aHg4WnE0eml1UGE1cjVTUzI0aFpNNEFjWGlpVkJVY1Y4YWhqNlN6TnRxUTlKUAp2QW1BdHNDM25ZYVhPOFIyS1FWcjBhT3JJemFGbE9qZW8xYTRtVjFwMVBtemY1Szc1M0lVc2NVPQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: spark-operator
    meta.helm.sh/release-namespace: spark-operator
  creationTimestamp: "2024-06-05T03:28:16Z"
  labels:
    app.kubernetes.io/instance: spark-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: spark-operator
    app.kubernetes.io/version: v1beta2-1.6.0-3.5.0
    helm.sh/chart: spark-operator-1.4.0
  name: spark-operator-webhook-certs
  namespace: spark-operator
  resourceVersion: "1389"
  uid: b71143f7-d117-4080-bfb5-018ce50ca766
type: Opaque
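To eyeball any of these values by hand, base64-decode them and feed the PEM to openssl. A self-contained sketch — a throwaway self-signed cert stands in for the secret's real `ca-cert.pem`, since decoding the live secret needs cluster access:

```shell
set -e
tmp=$(mktemp -d)
# Hypothetical stand-in for the secret's ca-cert.pem:
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout "$tmp/ca-key.pem" -out "$tmp/ca-cert.pem" \
  -subj "/O=spark-operator/CN=spark-operator-webhook-svc.spark-operator.svc"
encoded=$(base64 < "$tmp/ca-cert.pem" | tr -d '\n')  # what .data holds

# Against a live cluster you would start from the secret instead:
#   kubectl get secret -n spark-operator spark-operator-webhook-certs \
#     -o jsonpath='{.data.ca-cert\.pem}'
printf '%s' "$encoded" | base64 -d | openssl x509 -noout -subject -enddate
```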
- Submit a SparkApplication:
# spark-pi.yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: spark:3.5.1
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.1.jar
  sparkVersion: "3.5.1"
  restartPolicy:
    type: Never
  volumes:
  - name: test-volume
    hostPath:
      path: /tmp
      type: Directory
  driver:
    volumeMounts:
    - name: test-volume
      mountPath: /tmp
    serviceAccount: spark-operator-spark
    labels:
      version: "3.5.1"
  executor:
    instances: 1
    volumeMounts:
    - name: test-volume
      mountPath: /tmp
    labels:
      version: "3.5.1"
kubectl apply -f spark-pi.yaml
- Inspect whether the volume was mounted successfully by the webhook:
$ kubectl get pod spark-pi-driver -o json | jq '.spec.containers[0].volumeMounts'
[
  {
    "mountPath": "/var/data/spark-9e473bfd-e7ef-47ca-ba5f-b9840591a8fb",
    "name": "spark-local-dir-1"
  },
  {
    "mountPath": "/opt/spark/conf",
    "name": "spark-conf-volume-driver"
  },
  {
    "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
    "name": "kube-api-access-ck8lk",
    "readOnly": true
  },
  {
    "mountPath": "/tmp",
    "name": "test-volume"
  }
]
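The same jq check can be turned into a pass/fail assertion. An offline sketch, where a sample JSON literal stands in for the `kubectl get pod ... -o json` output (assumes jq is installed):

```shell
# Sample stands in for `kubectl get pod spark-pi-driver -o json`:
pod='{"spec":{"containers":[{"volumeMounts":[
  {"mountPath":"/var/data/x","name":"spark-local-dir-1"},
  {"mountPath":"/tmp","name":"test-volume"}]}]}}'

# jq -e exits non-zero unless the webhook-injected mount is present:
echo "$pod" | jq -e '.spec.containers[0].volumeMounts[]
  | select(.name == "test-volume" and .mountPath == "/tmp")' >/dev/null \
  && echo "test-volume mounted"
```

This makes the webhook check usable in CI instead of relying on eyeballing the mount list.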
- Inspect the Spark operator logs to verify that the webhook server works:
$ kubectl logs -n spark-operator spark-operator-79bb9ffdd7-pkwt5 | grep webhook.go
I0605 03:37:57.589167 13 webhook.go:366] Updated webhook secret spark-operator/spark-operator-webhook-certs
I0605 03:37:57.589416 13 webhook.go:218] Starting the Spark admission webhook server
I0605 03:37:57.590365 13 webhook.go:484] Updating existing MutatingWebhookConfiguration for the Spark pod admission webhook
I0605 03:38:07.577914 13 webhook.go:244] Serving admission request
I0605 03:38:07.580560 13 webhook.go:616] Pod spark-pi-driver in namespace default is subject to mutation
I0605 03:38:10.184861 13 webhook.go:244] Serving admission request
I0605 03:38:10.185471 13 webhook.go:616] Pod spark-pi-7ae9138fe679bede-exec-1 in namespace default is subject to mutation
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: vara-bonthu, yuchaoran2011
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [vara-bonthu,yuchaoran2011]
Hey @ChenYi015 , can you clarify why this is a breaking change? I'm working on upgrading my team's spark-operator repo and while I can go forwards to the new helm chart that includes this change, I can't revert back without getting an error about service accounts not existing (even though we configured them). I'm wondering if its due to the order that rbac resources get created. Before they were created with helm hooks and now they're created by the chart installation process. What do you think?
@colinsteidtmann Before, the RBAC resources for the operator were created by Helm pre-install and pre-upgrade hooks, but not by a pre-rollback hook. Thus, when you try to roll back the chart, the RBAC resources will not be created.
Thanks, we're actually using Terraform's helm provider to manage our helm releases, so our "rollback" is effectively changing helm chart versions and running `terraform apply`. I'm having trouble figuring out which hooks get triggered and when; I thought `terraform apply` would always trigger either the upgrade or install hook, but maybe not. Do you have any ideas on how we can roll back the Spark operator smoothly? Is it possible to create the RBAC resources manually?
@colinsteidtmann When you run `terraform apply` to "roll back" the chart, the pre-install/pre-upgrade hook will be triggered and the RBAC resources will be created. But during the upgrade, Helm compares the difference between the two versions; the RBAC resources appear in the newer version but not in the older one, so Helm deletes them, causing the serviceaccount-not-found error. You can create the RBAC resources manually as follows:
# Get hooks manifest
helm get hooks -n spark-operator spark-operator > hooks.yaml
Then edit `hooks.yaml` and change the namespace of the serviceaccount to the release namespace. Then create the hook resources:
kubectl apply -f hooks.yaml