bdit_data-sources
Organize the daily WYS DAG
The current pull_wys DAG does too much work inside a single task, also named pull_wys. That task is implemented in wys_api.py, which contains many long SQL queries. It would be better to split the pull_wys task into smaller tasks and convert these SQL queries into PostgreSQL functions. That should make it faster to identify bugs and easier to test new modifications.
When we reorganize this DAG, rather than pulling all the raw data in one task (currently around 2 hours to pull 700 locations), it would be cool to dynamically generate a task for each location returned by the preceding signs API call. Con: that's a lot of tasks.
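
One possible middle ground, sketched here rather than proposed as the final design: group the locations into fixed-size batches and generate one task per batch instead of one per location. The helper names (`chunk_locations`, `pull_batch`, `fetch_location`) and the batch size of 50 are hypothetical and do not exist in wys_api.py. In Airflow 2.3+ the list of batches could be fed to dynamic task mapping (`.expand()`); the sketch below is plain Python so it runs without Airflow.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def chunk_locations(location_ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches of at most batch_size location ids."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(location_ids), batch_size):
        yield list(location_ids[start:start + batch_size])


def pull_batch(
    batch: Sequence[int],
    fetch_location: Callable[[int], list],
) -> Tuple[Dict[int, list], List[int]]:
    """Pull every location in one batch.

    fetch_location is a stand-in (assumption) for whatever function calls
    the API for a single location. Failures are collected per location so
    one bad sign does not discard the rest of the batch.
    """
    results: Dict[int, list] = {}
    failed: List[int] = []
    for location_id in batch:
        try:
            results[location_id] = fetch_location(location_id)
        except Exception:
            failed.append(location_id)
    return results, failed


# Example: 700 locations in batches of 50 gives 14 tasks instead of 700.
batches = list(chunk_locations(range(700), 50))
```

With this shape, the batch size becomes the knob that trades the number of tasks against how much work is lost and retried when a single task fails.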