bdit_data-sources
Eco counter dagify
What this pull request accomplishes:
- Builds on Nate's ecocounter code for a daily airflow DAG.
- Automatically inserts new sites/channels (marked as `validated = False`) and notifies in Slack.
- Updates all the ecocounter DDL in GitHub to match current definitions in bigdata (plus adds a `validated` col to sites and flows).
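A rough sketch of the "insert new sites as unvalidated and notify" step, for anyone thinking through the flow. This is not the DAG's actual code: `Site`, `find_new_sites`, `format_slack_message`, and the `id`/`name` keys are hypothetical names chosen for illustration, and the database insert and Slack call are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical stand-in for a row in the sites table."""
    site_id: int
    name: str
    validated: bool = False  # new sites start out unvalidated


def find_new_sites(api_sites, known_site_ids):
    """Return sites reported by the API that are not yet in the database.

    `api_sites` is assumed to be a list of dicts with "id" and "name" keys.
    """
    known = set(known_site_ids)
    return [Site(s["id"], s["name"]) for s in api_sites if s["id"] not in known]


def format_slack_message(new_sites):
    """Build the notification text, or None when there is nothing to report."""
    if not new_sites:
        return None
    lines = [f"- {s.site_id}: {s.name}" for s in new_sites]
    return "New ecocounter sites added (validated = False):\n" + "\n".join(lines)
```

In a DAG, the list returned by `find_new_sites` would be inserted into the sites table and the message handed to whatever Slack notifier the project already uses.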
Issue(s) this solves:
- Closes #912
What, in particular, needs to be reviewed:
- [ ] Think through next steps on how to use the `validated` column.
What needs to be done by a sysadmin after this PR is merged
- Replace all references to the `gwolofs` schema with `ecocounter`. Also add `validated` columns to ecocounter and backfill data since the last ecocounter pull.
- Update data_scripts path
@Nate-Wessel, addressed your comments + added a readme section for the DAG. Once you're OK with how we're handling the unvalidated flows, I can merge and adjust all the scripts to point to the `ecocounter` schema instead of `gwolofs`.
@chmnata can you quickly look over the DAG? It is working as expected: https://trans-bdit.intra.prod-toronto.ca/airflow/dags/ecocounter_pull/grid. Nate has reviewed the rest.
@Nate-Wessel, @chmnata
Ready for re-review! The DAG is now inserting into `ecocounter.counts_unfiltered`, and there is a new view, `ecocounter.counts`, which has only validated flows. Also added partitioning to `ecocounter.counts_unfiltered` at Natalie's request.
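To illustrate the unfiltered-table-plus-view idea, here is a minimal sketch using an in-memory SQLite database. It is not the real DDL: the production tables live in PostgreSQL, the column names (`flow_id`, `datetime_bin`, `volume`) are assumptions, and partitioning is not shown because SQLite does not support it.

```python
import sqlite3


def build_demo_db():
    """Create a toy version of the flows / counts_unfiltered / counts layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE flows (
            flow_id INTEGER PRIMARY KEY,
            validated INTEGER NOT NULL DEFAULT 0  -- new flows start unvalidated
        );
        CREATE TABLE counts_unfiltered (
            flow_id INTEGER,
            datetime_bin TEXT,
            volume INTEGER
        );
        -- The view exposes only counts that belong to validated flows.
        CREATE VIEW counts AS
            SELECT c.flow_id, c.datetime_bin, c.volume
            FROM counts_unfiltered AS c
            JOIN flows AS f ON f.flow_id = c.flow_id
            WHERE f.validated = 1;
        """
    )
    # Flow 1 is validated, flow 2 is not.
    conn.executemany("INSERT INTO flows VALUES (?, ?)", [(1, 1), (2, 0)])
    conn.executemany(
        "INSERT INTO counts_unfiltered VALUES (?, ?, ?)",
        [
            (1, "2024-01-01 00:00", 12),
            (2, "2024-01-01 00:00", 5),
            (1, "2024-01-01 00:15", 7),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(demo.execute("SELECT * FROM counts ORDER BY datetime_bin").fetchall())
```

The point of the pattern is that the DAG can keep writing everything to the unfiltered table, and flipping `validated` on a flow makes its history appear in the view without any backfill.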