pdr-backend

[DuckDB][ETL] Scope down Queries/ETL + Increase base quality

Open · idiom-bytes opened this issue 1 year ago · 1 comment

Describe the bug

As I proposed before, we are still having issues in our queries/ETL that cause a lot of distraction. These queries extend beyond our core focus (example: slots), and we can add that complexity back once the basics are working well. [image]

Our release candidate only needs 4 tables: predictions, truevals, payouts, and bronze_predictions. [Screenshot from 2024-05-21 08-15-24]

Once our pipeline is working reliably end-to-end, we can increase complexity and add those queries back in. As the image above shows, reducing the problem makes it visible that our bronze table's min_datestr is not behaving as expected.
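A quick way to spot this kind of mismatch is to compare the min/max timestamp and row count of each of the 4 tables side by side. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for the DuckDB connection, and the `timestamp` column name and the `table_date_ranges` helper are assumptions, not the actual pdr-backend schema or API.

```python
import sqlite3

# The 4 tables named in this issue.
TABLES = ["predictions", "truevals", "payouts", "bronze_predictions"]


def table_date_ranges(conn, tables, ts_col="timestamp"):
    """Return {table: (min_ts, max_ts, row_count)} for each table.

    `ts_col` is a hypothetical column name; adjust to the real schema.
    """
    ranges = {}
    for name in tables:
        row = conn.execute(
            f'SELECT MIN("{ts_col}"), MAX("{ts_col}"), COUNT(*) FROM "{name}"'
        ).fetchone()
        ranges[name] = row
    return ranges


# Demo data: every table covers timestamps 100..200 ...
conn = sqlite3.connect(":memory:")
for name in TABLES:
    conn.execute(f'CREATE TABLE "{name}" ("timestamp" INTEGER)')
    conn.executemany(f'INSERT INTO "{name}" VALUES (?)', [(100,), (200,)])

# ... except the bronze table, which is missing its earliest row.
conn.execute('DELETE FROM "bronze_predictions" WHERE "timestamp" = 100')

ranges = table_date_ranges(conn, TABLES)
for name, (min_ts, max_ts, count) in ranges.items():
    print(f"{name:20s} min={min_ts} max={max_ts} rows={count}")
```

With only 4 tables in scope, a bronze table whose minimum is later than that of the raw tables stands out immediately.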

Isolate our target, get the whole thing working correctly end-to-end, then expand.

DoD

  • Scope down queries
  • Increase quality of ETL

Tasks

  • [x] Disable slots and subscriptions raw_queries
  • [x] Disable bronze_slots table
  • [x] Fix issues across all 4 tables such that they are working as expected, end-to-end

idiom-bytes · May 21 '24 15:05

[Fix ETL - Move Temp Table to Live + Drop Temp Tables] There were some bugs and bad assumptions in how the "Temp Table" logic works inside etl.py. I believe I have fixed that now, and I updated the tests to match.
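For readers unfamiliar with the pattern, here is a minimal sketch of "move temp table to live, then drop the temp table". It is not the code in etl.py: it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the `_temp_` prefix, the column names, and the `move_temp_to_live` helper are all assumptions made for illustration.

```python
import sqlite3


def move_temp_to_live(conn, live_table, temp_prefix="_temp_"):
    """Append all rows from the temp table to the live table, then drop the temp table.

    Returns the number of rows moved. The temp table name is assumed to be
    `temp_prefix + live_table` (a hypothetical naming convention).
    """
    temp_table = f"{temp_prefix}{live_table}"

    # Create an empty live table with the temp table's columns if it is missing.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{live_table}" AS '
        f'SELECT * FROM "{temp_table}" WHERE 1 = 0'
    )

    moved = conn.execute(f'SELECT COUNT(*) FROM "{temp_table}"').fetchone()[0]
    conn.execute(f'INSERT INTO "{live_table}" SELECT * FROM "{temp_table}"')

    # Drop the temp table so stale rows cannot be moved a second time.
    conn.execute(f'DROP TABLE "{temp_table}"')
    return moved


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "bronze_predictions" (id TEXT, ts INTEGER)')
conn.execute("INSERT INTO bronze_predictions VALUES ('a', 100)")
conn.execute('CREATE TABLE "_temp_bronze_predictions" (id TEXT, ts INTEGER)')
conn.execute("INSERT INTO _temp_bronze_predictions VALUES ('b', 200)")

moved = move_temp_to_live(conn, "bronze_predictions")
conn.commit()

live_rows = conn.execute(
    "SELECT id, ts FROM bronze_predictions ORDER BY ts"
).fetchall()
temp_left = conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'table' AND name = '_temp_bronze_predictions'"
).fetchall()
```

A test for this behavior can assert two things: the live table contains both the old and the newly moved rows, and no temp table remains after the move.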

[Table Registry] With the new NamedTables, I believe we can get rid of the TableRegistry object.

idiom-bytes · May 22 '24 19:05