docker_datalake
SGE Data integration
Implementation of SGE data integration for Microsoft SQL Server 2017
(Linked to the "In progress task" project board: SGE Data integration tasks)
TODO :
- [x] Set up an MS SQL Server with Adminer (through a Docker container)
- [x] Create scripts for dumps (initial and differential dumps). It is impossible to import the differentials once an import has already been done. The problem may be solved by:
  - changing the differential backup method (create a new file every day for each new differential);
  - changing the scripts (a possibility, although it is unclear why this would help);
  - changing the MS SQL Server version (when the scripts were changed, a "Files corrupted" error occurred, which could be a version problem; this seems unexpected because the same file was imported successfully the first time).

  Solution for quick data availability: recreate the database (and the container) every day with 2 cron tasks.
- [ ] Create Airflow pipeline
- [ ] Create a daily differential import (cron task on the neodata2 server)
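
The dump workflow in the list above (an initial full dump followed by differentials) can be sketched as a pair of T-SQL restore statements. In SQL Server, a differential backup can only be applied on top of its full backup while the database is still in the restoring state (`WITH NORECOVERY`). That may be related to the failed re-import described above, but it has not been confirmed as the cause here. The function name, database name and file paths below are hypothetical placeholders, not the project's actual scripts.

```python
def _quote_literal(value: str) -> str:
    """Escape a string for use as a T-SQL N'...' literal."""
    return "N'" + value.replace("'", "''") + "'"


def _quote_name(name: str) -> str:
    """Escape an identifier for use inside [...] brackets."""
    return "[" + name.replace("]", "]]") + "]"


def build_restore_statements(database: str, full_backup: str, diff_backup: str) -> list:
    """Return the T-SQL statements that rebuild a database from a full
    backup plus one differential backup (illustrative sketch only)."""
    db = _quote_name(database)
    return [
        # Step 1: restore the full backup but keep the database in the
        # restoring state so that a differential can still be applied.
        f"RESTORE DATABASE {db} FROM DISK = {_quote_literal(full_backup)} "
        "WITH REPLACE, NORECOVERY;",
        # Step 2: apply the differential and bring the database online.
        f"RESTORE DATABASE {db} FROM DISK = {_quote_literal(diff_backup)} "
        "WITH RECOVERY;",
    ]


if __name__ == "__main__":
    # Hypothetical database name and paths, for illustration only.
    for statement in build_restore_statements(
        "SGE",
        "/var/opt/mssql/backup/sge_full.bak",
        "/var/opt/mssql/backup/sge_diff.bak",
    ):
        print(statement)
```

Running both statements against a freshly created container each day corresponds to the "recreate the database every day" workaround: the full backup is always restored first, so the newest differential is applied to a clean base.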
Daily differential import :
- [x] Create cron task
- [ ] ~~Set daily run on Airflow pipeline~~ Daily differential import with Airflow processing for import into the DB
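
For the cron-driven import, one option mentioned in the TODO list is to write a new differential file every day. Here is a minimal sketch of how a daily job could name those files and pick the newest one. It assumes a hypothetical `sge_diff_YYYYMMDD.bak` naming scheme, since the project's real file names are not specified here.

```python
import datetime
import re
from typing import Iterable, Optional

# Hypothetical naming scheme: sge_diff_YYYYMMDD.bak
_DIFF_PATTERN = re.compile(r"^sge_diff_(\d{8})\.bak$")


def diff_filename(day: datetime.date) -> str:
    """Build the differential file name for a given day."""
    return f"sge_diff_{day:%Y%m%d}.bak"


def latest_diff(filenames: Iterable[str]) -> Optional[str]:
    """Return the most recent differential file name, or None if none match."""
    dated = []
    for name in filenames:
        match = _DIFF_PATTERN.match(name)
        if not match:
            continue  # not a differential file under the assumed scheme
        try:
            day = datetime.datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that do not form a real date
        dated.append((day, name))
    # Tuples compare by date first, so max() yields the newest file.
    return max(dated)[1] if dated else None
```

A cron task could call `latest_diff(os.listdir(backup_dir))` (with `backup_dir` being whatever directory holds the dumps) and hand the result to the restore step.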
Basic web GUI for SGE Requesting (https://github.com/vincentnam/SGERequesting) :
- [x] Web GUI
- [x] MS SQL Server driver for communication
- [x] Export data as a .xlsx file