docker_datalake
Containerization, automatic deployment and Kubernetes
Automatic deployment:
The first step is to create a Docker container for each service to be deployed, and to write Ansible playbooks that deploy these containers automatically on a VM.
The second step is to make this deployment possible on a Kubernetes cluster.
- [x] MongoDB container
- [ ] OpenStack Swift container (the container exists but still needs to be adapted to the install)
- [x] Apache Airflow container
- [ ] Processed Data zone container (Apache Airflow and Apache Spark)
  - [x] Apache Airflow (from Apache Docker Hub)
  - [ ] Apache Spark
- [ ] Service zone container (Web GUI and REST API (Flask))
- [ ] Security, authentication and monitoring container (OpenStack Keystone, Kerberos (?), ... ?)
- [ ] Ansible playbooks
  - [x] Base playbooks
  - [ ] Raw data zone playbooks
  - [x] Metadata management zone playbooks
  - [ ] Process zone playbooks
  - [ ] Processed data zone playbooks
  - [ ] Service zone playbooks
  - [ ] Security, authentication and monitoring zone playbooks
- [ ] Ansible playbook launcher
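
As a starting point for the playbook launcher, here is a minimal sketch of how it might map each zone to a playbook and build the matching `ansible-playbook` command. The zone keys, playbook file names, `playbooks` directory and `inventory.ini` file are all assumptions made for illustration and do not reflect the actual repository layout. Only the standard `ansible-playbook -i <inventory> <playbook>` invocation is relied on.

```python
import subprocess
from pathlib import PurePosixPath

# Hypothetical mapping from zone to playbook file; the names are
# illustrative assumptions, not the project's real file names.
ZONE_PLAYBOOKS = {
    "base": "base.yml",
    "raw_data": "raw_data_zone.yml",
    "metadata_management": "metadata_management_zone.yml",
    "process": "process_zone.yml",
    "processed_data": "processed_data_zone.yml",
    "service": "service_zone.yml",
    "security": "security_zone.yml",
}


def build_commands(zones, inventory="inventory.ini", playbook_dir="playbooks"):
    """Return one ansible-playbook argument list per requested zone."""
    commands = []
    for zone in zones:
        if zone not in ZONE_PLAYBOOKS:
            raise KeyError(f"unknown zone: {zone}")
        playbook = PurePosixPath(playbook_dir) / ZONE_PLAYBOOKS[zone]
        commands.append(["ansible-playbook", "-i", inventory, str(playbook)])
    return commands


def launch(zones, dry_run=True, **kwargs):
    """Print the commands (dry run) or run them in order, stopping on failure."""
    for command in build_commands(zones, **kwargs):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a playbook fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    launch(["base", "metadata_management"])
```

Keeping the dry run as the default lets the command list be reviewed before anything is executed on a VM; passing `dry_run=False` would run the playbooks in the order given.
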
Later:
- [ ] Import on Kubernetes cluster
- [ ] Apache Spark container
A merge request is associated with this issue; see https://gitlab.irit.fr/datalake/docker_datalake/-/merge_requests/67