etcd-cluster-operator
Backup/Restore proxy and agents
Implement a 'proxy' service to control the upload and download of backups during the backup and restore processes.
See the design document for details (TODO: commit the design document into the git repo 😅)
- [ ] Implement a proxy service.
  - [x] Add the entry point under `cmd/proxy`, the Dockerfile at `build/package/proxy.Dockerfile`, and basic infrastructure changes to enable the proxy to be built and published in tests.
  - [x] Add the gRPC APIs for backup and restore, at first as “stub” implementations.
  - [ ] Implement the backup upload code.
  - [x] Implement the restore download code.
  - [x] Implement credential handling.
  - [x] Add recommended deployment YAML, and instructions to the installation documentation.
  - [ ] Implement a metrics endpoint for the proxy service, and expose something useful.
- [x] Rebuild restore branch on top of proxy work.
  - [x] Change restore agent implementation to use proxy, and republish pull request.
  - [x] Build restore agent Docker image.
  - [x] Change `EtcdRestore` to only specify an object URL and not credentials.
  - [ ] Make the operator create a `ServiceAccount` in the client’s `Namespace` to run the restore `Pod` with.
  - [x] Update restore documentation and examples.
- [ ] Build backup agent
  - [ ] Add the entry point under `cmd/backup-agent`, add it to the `Dockerfile`, and make the other build infrastructure changes needed to build it in tests.
  - [ ] Implement backup call in agent, calling out to proxy’s API for upload.
  - [ ] Change `EtcdBackup` to not specify a destination or credentials.
  - [ ] Remove old backup code, and change the controller for `EtcdBackup` to launch the agent instead.
  - [ ] Make the operator create a `ServiceAccount` in the client’s namespace to run the backup `Pod` with.
  - [ ] Update backup documentation and examples.
- [ ] End-to-end testing
  - [x] Deploy MinIO in kind as part of the testing context.
  - [ ] Write a full end-to-end test that:
    - Deploys an `EtcdCluster`
    - Sets a key in that etcd cluster to value 1
    - Takes a backup to MinIO using the S3 API
    - Changes the key to value 2
    - Deletes the `EtcdCluster` and `PersistentVolumeClaim`s
    - Creates an `EtcdRestore`
    - Waits for the etcd cluster to come back
    - Verifies that the value of the key is 1
- [ ] Miscellaneous Cleanup
  - [ ] Commit design document into the repository as Markdown
  - [ ] Update documentation to match approach.
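
The end-to-end scenario listed under "End-to-end testing" can be sketched in Go as follows. This is only a rough illustration of the expected sequence and outcome, not the real test: `fakeCluster`, `objectStore`, `runScenario`, and the object URL are all hypothetical in-memory stand-ins for the `EtcdCluster`, the MinIO bucket, and the `EtcdRestore` flow, and none of them are part of the operator's API.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeCluster is a hypothetical stand-in for an EtcdCluster:
// just an in-memory key/value map.
type fakeCluster struct {
	data map[string]string
}

// objectStore is a hypothetical stand-in for a MinIO bucket,
// mapping an object URL to a snapshot of cluster contents.
type objectStore map[string]map[string]string

// backup copies the cluster's current contents into the store under url.
func (s objectStore) backup(c *fakeCluster, url string) {
	snap := make(map[string]string, len(c.data))
	for k, v := range c.data {
		snap[k] = v
	}
	s[url] = snap
}

// restore builds a fresh cluster from the snapshot stored at url.
func (s objectStore) restore(url string) (*fakeCluster, error) {
	snap, ok := s[url]
	if !ok {
		return nil, errors.New("no backup found at " + url)
	}
	c := &fakeCluster{data: make(map[string]string, len(snap))}
	for k, v := range snap {
		c.data[k] = v
	}
	return c, nil
}

// runScenario walks through the listed steps and returns the value
// of the key as seen in the restored cluster.
func runScenario() (string, error) {
	const url = "s3://backups/snapshot.db" // assumed URL, for illustration only
	store := objectStore{}

	cluster := &fakeCluster{data: map[string]string{}} // deploy the cluster
	cluster.data["key"] = "1"                          // set the key to value 1
	store.backup(cluster, url)                         // take a backup
	cluster.data["key"] = "2"                          // change the key to value 2
	cluster = nil                                      // delete the cluster and its storage

	restored, err := store.restore(url) // restore from the backup
	if err != nil {
		return "", err
	}
	return restored.data["key"], nil // the test expects value 1 here
}

func main() {
	got, err := runScenario()
	if err != nil {
		panic(err)
	}
	fmt.Println("key after restore:", got)
}
```

A real test would replace the fakes with the operator's custom resources and an etcd client talking to the cluster deployed in kind, but the assertion stays the same: after the restore, the key must hold the value it had when the backup was taken.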