Vincent-Nam DANG
How to automatically implement a workflow from a standard definition. Need for : - security - code testing - authentication and restriction (e.g. deny data access on disks...
# Datalake enhancement : modular backend and optimization Block storage: overlay Native object storage Block mode overlay ? Is it possible to change the storage backend (Openstack Swift) in the...
Add a reverse proxy in front of every service in the architecture, so that all access to a service goes through it.
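As a rough illustration of what routing every access through a single entry point could look like, here is a minimal sketch of the route-resolution step of a reverse proxy. The routing table, hostnames, ports, and the `resolve_upstream` helper are all hypothetical and not taken from the notes; a real deployment would more likely rely on an existing proxy such as nginx or HAProxy.

```python
from typing import Dict, Optional

# Hypothetical routing table: public path prefix -> internal upstream URL.
# Hostnames and ports are illustrative assumptions only.
ROUTES: Dict[str, str] = {
    "/airflow": "http://airflow.internal:8080",
    "/swift": "http://swift.internal:8080",
    "/keystone": "http://keystone.internal:5000",
}


def resolve_upstream(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the upstream URL for a request path, or None if no route matches.

    Longest-prefix matching is used so that more specific routes win.
    A prefix only matches on a path-segment boundary ("/airflow" does not
    match "/airflowx").
    """
    best = None
    for prefix in routes:
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return None
    # Strip the public prefix and forward the rest of the path upstream.
    return routes[best] + path[len(best):]
```

Requests whose path matches no prefix resolve to `None`, which the proxy can turn into a 404 or a denial, so that no service is reachable except through an explicitly declared route.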
What will be done : - [ ] Code cleaning - [ ] Apache Airflow dag - [ ] Python scripts - [ ] Comments - [ ] Apache Airflow...
# Data management : Data management is composed of 2 parts : batch data and stream data. The stream part is not currently well implemented in the architecture. Tools are...
Airflow is one of the most important services in the datalake architecture, and a lot of work remains to be done on it. Airflow handles all the workflows / pipelines for data...
To automate the deployment of the architecture, one of the biggest tasks is to deploy the OpenStack services (here Swift and Keystone). A good way to deploy services that are maintainable, scalable...
# Dropzone file - [x] Create the component for a file drop zone - [x] Connect to the raw data area : MongoDB metadata insertion - [x] Connect to the...
Monitoring of any network or system is a crucial service for a sustainable, maintainable and evolvable architecture. As it is a complex architecture, several levels of monitoring are needed : -...
The central authentication system is the main security service for data security. The tool chosen is OpenStack Keystone, for several reasons : - APIs are available in Python (and other...