docker_datalake

SGE Data integration

Open vincentnam opened this issue 3 years ago • 0 comments

Implementation of SGE data integration for Microsoft SQL Server 2017

(Linked to the "In progress task" project board: SGE Data integration tasks)

TODO :

  • [x] Set up an MS SQL Server instance with Adminer (in a Docker container)

  • [x] Create scripts for dumps (initial and differential dumps). Known problem: a differential cannot be imported once an import has already been done. The problem may be solved by:
      • changing the differential backup method (create a new file every day for each new differential);
      • changing the scripts (a possibility, though it is not clear why this would help);
      • changing the MS SQL Server version (after changing the scripts, a "Files corrupted" error occurred, which could be a version problem; this seems unexpected because the same file was imported successfully the first time).
    Workaround to make the data available quickly: recreate the database (and the container) every day with 2 cron tasks.

  • [ ] Create Airflow pipeline

  • [ ] Create a daily differential import (Cron task on neodata2 server)
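
As a sketch of the first option listed for the dump scripts (a new file for each day's differential), the backup statements could be generated with a date-stamped file name. The database name `SGE`, the directory `/var/opt/mssql/backup`, and both function names are assumptions for illustration; they are not taken from the project's scripts. `BACKUP DATABASE ... WITH DIFFERENTIAL` and `RESTORE DATABASE ... WITH NORECOVERY` are standard T-SQL. SQL Server only accepts a differential restore on a database that is still in the restoring state, which may be related to the failed second import, although the issue does not confirm this.

```python
from datetime import date
from typing import List


def backup_statement(database: str, backup_dir: str, day: date, differential: bool) -> str:
    """Build a T-SQL BACKUP statement that writes to a date-stamped file."""
    kind = "diff" if differential else "full"
    # One file per day and per backup kind (hypothetical naming scheme).
    path = f"{backup_dir}/{database}_{kind}_{day:%Y%m%d}.bak"
    # INIT overwrites any backup set already in the file instead of appending.
    options = "DIFFERENTIAL, INIT" if differential else "INIT"
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH {options};"


def restore_statements(database: str, full_path: str, diff_path: str) -> List[str]:
    """Restore the full backup without recovery, then apply one differential."""
    return [
        # NORECOVERY leaves the database in the restoring state.
        f"RESTORE DATABASE [{database}] FROM DISK = N'{full_path}' WITH NORECOVERY, REPLACE;",
        # RECOVERY brings the database online after the differential is applied.
        f"RESTORE DATABASE [{database}] FROM DISK = N'{diff_path}' WITH RECOVERY;",
    ]
```

The generated strings can be run with any SQL Server client; the sketch only builds the statements and does not connect to a server.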

Daily differential import :

  • [x] Create cron task
  • [ ] ~~Set daily run on Airflow pipeline~~ Daily differential import, with Airflow processing for the import into the DB
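
The daily "recreate the database (and the container)" workaround could be driven by a small script that a cron task calls. The following is a minimal sketch under stated assumptions: the container name, password, and host backup directory are hypothetical parameters, and the issue does not describe how the two cron tasks are actually split. The image name `mcr.microsoft.com/mssql/server:2017-latest` and the `ACCEPT_EULA` and `SA_PASSWORD` environment variables are the ones used by Microsoft's SQL Server 2017 Linux image.

```python
import subprocess
from typing import List


def recreate_commands(container: str, sa_password: str, backup_dir: str) -> List[List[str]]:
    """Commands that remove the old container and start a fresh SQL Server 2017 one."""
    return [
        ["docker", "rm", "-f", container],
        [
            "docker", "run", "-d", "--name", container,
            "-e", "ACCEPT_EULA=Y",
            "-e", f"SA_PASSWORD={sa_password}",
            "-p", "1433:1433",
            # Mount the host backup directory so dump files are visible in the container.
            "-v", f"{backup_dir}:/var/opt/mssql/backup",
            "mcr.microsoft.com/mssql/server:2017-latest",
        ],
    ]


def recreate(container: str, sa_password: str, backup_dir: str, dry_run: bool = True) -> List[List[str]]:
    """Return the commands; execute them only when dry_run is False."""
    commands = recreate_commands(container, sa_password, backup_dir)
    if not dry_run:
        for index, command in enumerate(commands):
            # The removal step may fail if no container exists yet, so only
            # the "docker run" step is required to succeed.
            subprocess.run(command, check=(index > 0))
    return commands
```

With `dry_run=True` (the default) the function only returns the command lists, which makes it easy to inspect what the cron task would execute before enabling it.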

Basic web GUI for SGE Requesting (https://github.com/vincentnam/SGERequesting) :

  • [x] Web GUI
  • [x] MS SQL Server driver for communication
  • [x] Export data as a .xlsx file

vincentnam, Jan 11 '21 09:01