pdr-backend
[Epic][Lake][Analytics] Create Analytics Dapp with Predictoor Income plot
Motivation
We want to go from tables and ETL to apps with interactive plots. The two frameworks under consideration right now are Streamlit and Dash.
Framework Considerations
Dash is OSS and fully fledged, so it delivers things like URL routing out of the box. It also has enterprise support, which could unblock limitations we may hit. Dash is known for requiring you to know the framework (which makes sense). https://dash.plotly.com/
Streamlit is leaner, w/ more community projects/plugins used to solve individual needs. Things like routing are additional and require development. https://discuss.streamlit.io/t/is-there-a-way-to-have-multiple-pages-like-flask/16823
I originally used Streamlit because I like its native matplotlib and python graphing library support. However, I'm not sure how much of an uphill battle and duct tape it will take to achieve basic web-app functionality such as routing, caching, etc. Given the team's previous usage of Svelte (dao) and then React (predictoor), I would expect the team to lean more towards Dash.
The lowest cost/friction would be plotly/streamlit. The most enterprise/mature solution is plotly/dash. The most community/open solution is matplotlib/streamlit.
Data Outline
DoD:
Prepare Data:
- [x] ETL - Keep bronze_prediction table as close to "raw", but clean of any additional parameters. #610
- [x] ETL - Create silver_table that enriches information with new columns, and aggregates by [user_id, contract_id]: sum_stake_revenue, sum_df_revenue, ... #610
- [x] ETL - Silver table will update incrementally meaning that it should resume from the last_date it ran. #610
- [x] ETL - Silver table will update historically using earliest [user, contract] event. If a user skips claiming, then claims an early record, their history grouped by [user, contract] is updated. #610
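The silver-table behaviour in the checklist above can be sketched in plain Python. This is only an illustration of the grouping and historical-update logic, not the project's ETL code (which, per the later comments, used polars). The input fields `stake_revenue` and `df_revenue`, both function names, and the in-memory dict standing in for the table are assumptions; only `user_id`, `contract_id`, `sum_stake_revenue`, and `sum_df_revenue` come from the checklist.

```python
from collections import defaultdict

def aggregate(rows):
    """Sum the revenue columns per (user_id, contract_id)."""
    out = defaultdict(lambda: {"sum_stake_revenue": 0.0, "sum_df_revenue": 0.0})
    for r in rows:
        agg = out[(r["user_id"], r["contract_id"])]
        agg["sum_stake_revenue"] += r["stake_revenue"]
        agg["sum_df_revenue"] += r["df_revenue"]
    return dict(out)

def update_silver(silver, bronze_rows, new_rows):
    """Recompute only the (user_id, contract_id) pairs touched by new_rows.

    A touched pair is rebuilt from all of its bronze rows, so a late claim
    on an early record still ends up in that pair's history.
    """
    touched = {(r["user_id"], r["contract_id"]) for r in new_rows}
    affected = [r for r in bronze_rows if (r["user_id"], r["contract_id"]) in touched]
    silver.update(aggregate(affected))
    return silver

# Hypothetical data: two predictoors on one contract, then a late claim.
bronze = [
    {"user_id": "0xa", "contract_id": "0x1", "stake_revenue": 2.0, "df_revenue": 0.5},
    {"user_id": "0xb", "contract_id": "0x1", "stake_revenue": 1.0, "df_revenue": 0.0},
]
silver = aggregate(bronze)
late = {"user_id": "0xa", "contract_id": "0x1", "stake_revenue": 3.0, "df_revenue": 1.0}
bronze.append(late)
update_silver(silver, bronze, [late])
```

Here `new_rows` stands in for whatever arrived since the last run; pairs that were not touched keep their existing aggregates, which is what keeps the update incremental.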
[Moving from silver-table to streamlit]
Create Net Income Plot + Single Page App:
- [x] Review Calina's PR for streamlit, and replicate the cli + streamlit_entrypoint, such that it's easy for you to get the same streamlit environment going. #612
- [x] Look at Calina's changes for SimPlotter.__init__() such that you understand how to set up the sim_plotter page, how to set up columns, containers, and how to initialize plots. Please note my comments. #612
- [x] Create your own PredictoorIncomePlotter class, and implement plots that use data from silver_pdr_predictions table to visualize all data. #612
- [x] Plot - Create Predictoor Net Income plot #612
- [x] Dapp - Embed plot into web app #615
Create Revenue + Expense Plot + Single Page App:
Display Feeds and Predictoors tables
- [ ] Create a table that stores all the Feeds data (name, exchange, time)
- [ ] Display Feeds table
- [ ] Calculate APYs
- [ ] Display Predictoors table containing: address, APY, Net income, Gross income, Costs
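The issue does not say how APY should be calculated, so the following is only one possible reading: take net income (gross income minus costs) over the observed window, divide by the average stake, and compound that return up to a 365-day year. The function name, the `avg_stake` and `days` inputs, and the formula itself are assumptions for illustration; only the output columns come from the list above.

```python
def predictoor_row(address, gross_income, costs, avg_stake, days):
    """Build one row of the Predictoors table (all amounts in the same token)."""
    if avg_stake <= 0 or days <= 0:
        raise ValueError("avg_stake and days must be positive")
    net_income = gross_income - costs
    # Return over the observed window, relative to the capital at stake.
    period_return = net_income / avg_stake
    if period_return <= -1.0:
        apy = -1.0  # the whole stake was lost; compounding is undefined
    else:
        # Compound the window's return up to a 365-day year.
        apy = (1.0 + period_return) ** (365.0 / days) - 1.0
    return {
        "address": address,
        "apy": apy,
        "net_income": net_income,
        "gross_income": gross_income,
        "costs": costs,
    }

# Hypothetical predictoor observed over a full year.
row = predictoor_row("0xabc", gross_income=120.0, costs=20.0, avg_stake=1000.0, days=365)
```

For short windows this compounding can produce very large APYs, so a simple (non-compounded) annualisation may be the safer choice to display; that is a design decision the issue leaves open.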
We should now have a single-page app (SPA), first-cut dapp completed, having Predictoor Income plot, running locally w/ 3 charts. There are 2 simple UI elements for filtering, and everything is updating.
We should now consider:
- Testing this SPA on different frameworks (should be easy to test in streamlit vs. dash)
- Testing this off-localhost (deployed live)
- How to improve caching (loading)
- How to make it live (refresh plots w/ fig.batch_update type functionality)
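For the caching question, a framework-agnostic starting point is a small time-to-live memoizer around the expensive table reads. The sketch below uses only the standard library; `ttl_cache` and `load_silver_rows` are hypothetical names, and the loader body is a stand-in for a real read. Streamlit and Dash each have their own caching mechanisms, which would likely replace this in practice.

```python
import time
from functools import wraps

def ttl_cache(seconds, clock=time.monotonic):
    """Memoize a function's results for `seconds`, keyed by positional args."""
    def decorator(fn):
        store = {}  # args -> (expiry_time, value)

        @wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive load
            value = fn(*args)
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator

calls = []  # records every real (uncached) load, for demonstration

@ttl_cache(seconds=60)
def load_silver_rows(contract_id):
    # Stand-in for an expensive table read (hypothetical).
    calls.append(contract_id)
    return [{"contract_id": contract_id, "sum_stake_revenue": 1.0}]

load_silver_rows("0x1")
load_silver_rows("0x1")  # served from the cache, no second load
```

A short TTL also gives a crude form of "live" plots: each rerender within the window reuses the data, and the first rerender after expiry picks up fresh rows.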
pdr analytics prototypes in GSlides: https://docs.google.com/presentation/d/18y_nfpc3e-dop5NtfIisgUkc-oPt8dft9WyAnqhSqZg/edit#slide=id.g20f5357f625_0_1545
[Frameworks & Plots] Based on previous history, I estimate the team will feel that:
- matplotlib is too detailed
- streamlit is too vanilla
As I mentioned in chats, I'm recommending we spend the least amount of time getting a basic plot/app in place so that we can stress test it.
This means plotly/streamlit.
@KatunaNorbert please note that I updated the table design/schema, such that the calculated tables + aggregates are taking place at the silver level.
This is so that our bronze tables remain "close-to-raw, clean", while silver tables remain "enriched + aggregated".
Got it, sounds good @idiom-bytes
Recall our recent conversation about why we don't want to use the "Dashboard" label:
- "Dashboard" implies a relatively static set of plots, with little interactivity and only mild discovery (browse, search, filter).
- Whereas what we want is a dapp that has highly interactive plots and powerful discovery.
Accordingly, in this issue's title and description, I replaced "Dashboard" with "dapp" & "dapp with plots". Please ensure other issues follow this too. Thank you:)
Updating tickets...
- This work has been completed using polars + streamlit.
- We need to complete the work w/ duckdb + dash in order to get this working w/ latest. This is still the target.