pdr-backend
                        [Lake][DuckDB] Accuracy app issues and improvements
Issues:
- The current GQL step fetches slot data from the subgraph even when a slot has a "Pending" status, which means its trueval is null and its stakes are 0. Because of the way we calculate the timestamp of the last data to fetch, the most recent slots will always be Pending. If we keep fetching new data every 5 minutes, all of the new data has null truevals and 0 stakes, so over time all accuracy values trend toward 0. Possible fix: fetch only slots with status = "Paying" (see the sketch below).
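As a rough illustration, the fix could filter on status at the query level. This is a minimal sketch assuming the subgraph exposes a `status` enum on the slot entity; the entity and field names below are illustrative, not confirmed against the actual predictoor subgraph schema:

```python
# Hypothetical GraphQL query: fetch only slots whose status is "Paying",
# so trueval and stakes are already settled. Entity/field names are
# assumptions, not taken from the real subgraph schema.
SLOTS_QUERY = """
query ($lastTs: Int!) {
  predictSlots(
    where: { status: Paying, slot_lte: $lastTs }
    orderBy: slot
    orderDirection: asc
    first: 1000
  ) {
    id
    slot
    status
    trueValue
  }
}
"""
```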
Improvements:
- etl.update() fetches data for the entire ETL. We could fetch just the raw data and skip the bronze tables (and any other tables added to the ETL later), or, better yet, fetch data only for the slots table. For this we could modify the gql.update() function so it can receive the set of tables to fetch as a parameter (see the sketch below).
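A hedged sketch of what a table-aware GQL update step could look like; `gql_update`, `FETCH_FNS`, and the table names are placeholders for illustration, not the actual pdr-backend API:

```python
from typing import Callable, Dict, Iterable, List, Optional

# Registry mapping a raw-table name to its fetch function
# (st_ts, fin_ts) -> rows. All names here are assumptions.
FETCH_FNS: Dict[str, Callable[[int, int], List[dict]]] = {
    "pdr_slots": lambda st_ts, fin_ts: [],        # stand-in slots fetcher
    "pdr_predictions": lambda st_ts, fin_ts: [],  # stand-in predictions fetcher
}

def gql_update(st_ts: int, fin_ts: int,
               tables: Optional[Iterable[str]] = None) -> None:
    """Fetch raw subgraph data only for `tables`; all registered when None."""
    names = list(FETCH_FNS) if tables is None else list(tables)
    for name in names:
        rows = FETCH_FNS[name](st_ts, fin_ts)
        print(f"fetched {len(rows)} {name} rows for [{st_ts}, {fin_ts}]")
        # ...here the rows would be appended to the corresponding raw lake table

# e.g. the accuracy app would only need: gql_update(st, fin, tables=["pdr_slots"])
```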
> The current GQL step fetches slot data from the subgraph even when a slot has a "Pending" status, which means its trueval is null and its stakes are 0. Because of the way we calculate the timestamp of the last data to fetch, the most recent slots will always be Pending. If we keep fetching new data every 5 minutes, all of the new data has null truevals and 0 stakes, so over time all accuracy values trend toward 0. Possible fix: fetch only slots with status = "Paying"

This is the correct behavior. We need to incorporate the "update queries" in order to properly address this. Please do not try to change this; instead, incorporate the right update procedure.
> etl.update() fetches data for the entire ETL. We could fetch just the raw data and skip the bronze tables (and any other tables added to the ETL later), or, better yet, fetch data only for the slots table. For this we could modify the gql.update() function so it can receive the set of tables to fetch as a parameter.

No; the ETL only fetches the new data and builds the new rows for bronze and the other derived tables.
> skip the bronze tables and other tables that are going to be added to the ETL

This is not the right pattern.
[My Feedback] There are 2 ways to do this:
- The Old Way: when the API is hit, fetch from the subgraph, calculate the accuracies, and return the answer.
- The New Way: update the lake + tables; when the API is hit, fetch from the lake, calculate the accuracies, and return the answer (see the sketch after this list).
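A minimal sketch of the New Way's read path, assuming a DuckDB lake file and a `pdr_slots` table with `contract`, `slot`, `status`, `trueval`, and `pred_value` columns (all assumed names, not the actual lake schema):

```python
import duckdb

def accuracy_from_lake(db_path: str, contract: str,
                       st_ts: int, fin_ts: int) -> float:
    """Compute accuracy from settled slots already stored in the lake."""
    con = duckdb.connect(db_path, read_only=True)
    row = con.execute(
        """
        SELECT AVG(CASE WHEN trueval = pred_value THEN 1.0 ELSE 0.0 END)
        FROM pdr_slots
        WHERE contract = ?
          AND slot BETWEEN ? AND ?
          AND status = 'Paying'  -- only settled slots count toward accuracy
        """,
        [contract, st_ts, fin_ts],
    ).fetchone()
    con.close()
    return row[0] if row[0] is not None else 0.0
```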
[Problem] The New Way requires [slot tables] + [handling update events], so that the slots table is updated and a slot transitions from "Pending" to "Paying".
Before doing this for the slots table, let's implement it with the predictions table first, such that when a new "Payouts Event" shows up, the existing "Predictions" record is updated with its payout (a sketch follows).
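For illustration, an update of that shape could be a single DuckDB UPDATE ... FROM over the newly fetched events; `pdr_predictions`, `new_payout_events`, and the column names are assumptions about the lake schema, not the real tables:

```python
import duckdb

# Assumed table/column names; the real lake schema may differ.
con = duckdb.connect("lake.duckdb")
con.execute(
    """
    UPDATE pdr_predictions AS p
    SET payout = e.payout
    FROM new_payout_events AS e      -- freshly fetched payout events
    WHERE p.ID = e.prediction_id     -- join new events to existing rows
    """
)
con.close()
```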
Priorities have mostly shifted away from pdr-backend. So closing less-critical issues.