pdr-backend
[DuckDB][ETL] Scope down Queries/ETL + Increase base quality
Describe the bug
As I proposed before, we're still having issues in our queries/ETL that are causing a lot of distraction. Several of these queries extend beyond our core focus (example: slots), and we can add that complexity back once the basics are working well.
Our release candidate only needs 4 tables: predictions, truevals, payouts, and bronze_predictions.
Once our pipeline is working reliably end-to-end, we can increase complexity and add those queries back in. As you can see from the image above, reducing the problem makes it clear that our bronze table min_datestr is not behaving as expected.
Isolate our target, get the whole thing working correctly end-to-end, then expand.
DoD
- Scope down queries
- Increase quality of ETL
Tasks
- [x] Disable slots and subscriptions raw_queries
- [x] Disable bronze_slots table
- [x] Fix issues across all 4 tables such that they are working as expected, end-to-end
[Fix ETL - Move Temp Table to Live + Drop Temp Tables] There were some bugs and bad assumptions in how the "Temp Table" logic works inside etl.py. I believe I have fixed that now, and I have updated the tests to match.
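For anyone following along, here is a minimal sketch of the "move temp table to live, then drop temp" step as I understand it. This is not the code in etl.py: it uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without extra dependencies, and the function name `promote_temp_table` and the `_temp_` table prefix are hypothetical, chosen only for illustration.

```python
import sqlite3


def promote_temp_table(con: sqlite3.Connection, live: str, temp: str) -> int:
    """Append all rows from `temp` into `live`, then drop `temp`.

    Returns the number of rows moved. Table names are assumed to come
    from trusted code (they are interpolated into SQL, not user input).
    """
    # Create the live table with the temp table's columns if it is missing.
    # "WHERE 1 = 0" copies the column layout without copying any rows.
    con.execute(
        f'CREATE TABLE IF NOT EXISTS "{live}" AS '
        f'SELECT * FROM "{temp}" WHERE 1 = 0'
    )
    # `with con` commits on success and rolls back if any statement raises,
    # so a failed insert does not leave the live table half-updated.
    with con:
        moved = con.execute(f'SELECT COUNT(*) FROM "{temp}"').fetchone()[0]
        con.execute(f'INSERT INTO "{live}" SELECT * FROM "{temp}"')
    # Only drop the temp table after the rows are safely in the live table.
    con.execute(f'DROP TABLE IF EXISTS "{temp}"')
    con.commit()
    return moved


# Usage: stage two rows in a temp table, then promote them.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE _temp_bronze_predictions (id TEXT, timestamp INTEGER)")
con.executemany(
    "INSERT INTO _temp_bronze_predictions VALUES (?, ?)",
    [("a", 1), ("b", 2)],
)
moved = promote_temp_table(con, "bronze_predictions", "_temp_bronze_predictions")
live_rows = con.execute("SELECT COUNT(*) FROM bronze_predictions").fetchone()[0]
temp_left = con.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = '_temp_bronze_predictions'"
).fetchone()[0]
```

The point of the ordering is that the temp table is dropped only after its rows have been committed to the live table, so a failure mid-way leaves the staged data intact for a retry.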
[Table Registry] Due to the new NamedTables, I believe we can get rid of the TableRegistry object.