
[Lake][DuckDB] Accuracy app issues and improvements

Open KatunaNorbert opened this issue 1 year ago • 2 comments

Issues:

  • the current GQL fetches slot data from the subgraph even when a slot has a "Pending" status, which means trueval is null and stakes are 0. Because of the way we calculate the timestamp for the last data to fetch, the latest slots will always have a "Pending" status. If we keep fetching new data every 5 minutes, all of the new data will have a null trueval and 0 stakes, which means that over time all the accuracy values will become 0. Possible fix: fetch only slots with status = "Paying"
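The proposed fix could look like the following sketch, which filters on slot status directly in the subgraph query. The entity and field names here (`predictSlots`, `trueValue`, `roundSumStakes`) are assumptions modeled on typical predictoor subgraph schemas, not the exact pdr-backend query:

```python
# Sketch of the proposed fix: only fetch settled slots from the subgraph.
# Entity and field names are assumptions, not the exact pdr-backend query.

def build_slots_query(last_ts: int, status: str = "Paying") -> str:
    """Build a GraphQL query string that skips "Pending" slots entirely."""
    return f"""
    {{
      predictSlots(
        where: {{ slot_gt: {last_ts}, status: "{status}" }}
        orderBy: slot
        orderDirection: asc
      ) {{
        id
        slot
        status
        trueValue
        roundSumStakes
      }}
    }}
    """
```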

Improvements:

  • etl.update() fetches the data for the entire ETL. We could fetch just the raw data and skip the bronze tables and the other tables that will be added to the ETL, or, even better, fetch data only for the slots table. For this we could use the gql.update() function, modified to receive as a parameter the tables to fetch data for.
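A minimal sketch of the suggested gql.update() change, assuming a hypothetical raw-table registry and function signature (the real pdr-backend API may differ):

```python
from typing import Iterable, Optional

# Hypothetical raw-table registry; the names mirror pdr-backend's raw
# tables but are assumptions here.
ALL_RAW_TABLES = ["pdr_predictions", "pdr_payouts", "pdr_slots", "pdr_subscriptions"]

def gql_update(tables: Optional[Iterable[str]] = None) -> list:
    """Fetch raw subgraph data only for the requested tables.

    With tables=None, behave like today and refresh every raw table.
    """
    targets = list(tables) if tables is not None else list(ALL_RAW_TABLES)
    fetched = []
    for name in targets:
        if name not in ALL_RAW_TABLES:
            raise ValueError(f"unknown raw table: {name}")
        # ...fetch + upsert raw rows for `name` here (omitted)...
        fetched.append(name)
    return fetched
```

A caller that only needs accuracies could then do `gql_update(tables=["pdr_slots"])` and skip the other raw tables.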

KatunaNorbert avatar May 16 '24 12:05 KatunaNorbert

> the current GQL fetches slot data from the subgraph even when a slot has a "Pending" status, which means trueval is null and stakes are 0. Because of the way we calculate the timestamp for the last data to fetch, the latest slots will always have a "Pending" status. If we keep fetching new data every 5 minutes, all of the new data will have a null trueval and 0 stakes, which means that over time all the accuracy values will become 0. Possible fix: fetch only slots with status = "Paying"

This is the correct behavior. We need to incorporate the "update queries" in order to properly address this. Please do not try to change this; instead, incorporate the right update procedure.

> etl.update() fetches the data for the entire ETL. We could fetch just the raw data and skip the bronze tables and the other tables that will be added to the ETL, or, even better, fetch data only for the slots table. For this we could use the gql.update() function, modified to receive as a parameter the tables to fetch data for.

No, the ETL only fetches the new data and rebuilds the new rows for bronze and other tables.

> skip the bronze tables and other tables that are going to be added to the ETL,

This is not the right pattern.

idiom-bytes avatar May 27 '24 14:05 idiom-bytes

[My Feedback] There are two ways to do this:

  1. The Old Way - When the API is hit, fetch from the subgraph, calculate the accuracies, and return the answer
  2. The New Way - Update the lake + tables; when the API is hit, fetch from the lake, calculate the accuracies, and return the answer
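The "New Way" read path above can be sketched roughly as follows, using sqlite3 as a stand-in for DuckDB; the table and column names (`pdr_slots`, `predvalue`, `truevalue`) are assumptions, not the real lake schema:

```python
import sqlite3

def accuracy_from_lake(con: sqlite3.Connection, feed: str) -> float:
    """'New Way' read path: compute accuracy from settled slots in the lake.

    Only 'Paying' slots are counted, so Pending rows with null trueval
    cannot drag the accuracy toward 0.
    """
    row = con.execute(
        """
        SELECT avg(CASE WHEN predvalue = truevalue THEN 1.0 ELSE 0.0 END)
        FROM pdr_slots
        WHERE feed = ? AND status = 'Paying'
        """,
        (feed,),
    ).fetchone()
    # No settled slots yet -> avg() is NULL; report 0.0 accuracy.
    return row[0] if row[0] is not None else 0.0
```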

[Problem] The new way requires [slot tables] + [handling update events] such that the slots table is updated and goes from "Pending" to "Paying".

Before doing this to the slots table, let's please implement it with the predictions table first, so that when a new "Payouts Event" shows up, the existing "Predictions" record is updated with its payout.
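That payout-update pattern could be sketched as below, again with sqlite3 standing in for DuckDB and with assumed table/column names (`pdr_predictions`, `pdr_payouts`, `prediction_id`):

```python
import sqlite3

def apply_payout_events(con: sqlite3.Connection) -> int:
    """Fold newly arrived payout rows into existing prediction records."""
    cur = con.execute(
        """
        UPDATE pdr_predictions
        SET payout = (
            SELECT p.payout
            FROM pdr_payouts p
            WHERE p.prediction_id = pdr_predictions.id
        )
        WHERE payout IS NULL
          AND id IN (SELECT prediction_id FROM pdr_payouts)
        """
    )
    con.commit()
    return cur.rowcount  # number of predictions settled in this pass
```

The same shape would then carry over to slots: an update event flips a row from "Pending" to "Paying" and fills in trueval and stakes, rather than the fetcher filtering Pending rows out.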

idiom-bytes avatar May 27 '24 17:05 idiom-bytes

Priorities have mostly shifted away from pdr-backend. So closing less-critical issues.

trentmc avatar Jan 25 '25 08:01 trentmc