pulseui
pulseui copied to clipboard
Caching and Aliasing Queries
Users:
- user-bg user-sit - Want to join slow and fast data from different sources. https://github.com/timestored/pulseui/issues/264
- user-ab user-sit - Wants to join kdb to SQL (diff speed not issue)
- user-bl user-sno - Wants to do one big query at top, then drill down into it.
- user-ek -user-bof - Wanted to have reusable functions to allow checking the aliases.
- Quants in general used notebooks to cache data then drill down into it. @electroly emphasised this cache once, then drill down was how he safely investigates data even though he has full DB access.
Facts
- Users would find this caching / drilldown / prod-offload very useful.
- The ownsers of the database would prefer fewer queries against production.
- BUT we are not building a database, we are supporting multiple databases. Before every implementation we should ask, could this be done using a database itself?
- We are supporting fast in-memory by using H2, it's fast but it's also not moving forward like DuckDB. So not for complex queries. Feels like the right choice for simple caching that we want.
Questions
- What do other systems do?
Options:
- Allow configuring queries on the dashboard and pulse level that users can then use as aliases.
e.g.
select * from {{@orders}} where customer=`bob - Recommend they use QUERY_DB('SLOWSQL','SELECT * FROM CLIENTS', 30100) with a timer. That pushes it to the dashboard level.
- Allow users to specify an alias for a query within a dashboard i.e. on the component editor. Main problem is that this would not carry over between sheets!! User could always look at sheet1, that relies on query on sheet3 so stops working.
- We bundle DuckDB, they could upload CSVs or parquet into a DuckDB instance.
user-andrew - wanted to drill down or filter into one table and all other UI updates to show that subset.
- The one big table then drill-down - is current operating policy and simplifies other parts of development. e.g. Only hits production database once, rest is within current middleware. Allows dragging and dropping to generate charts by less experienced users. Cons = Hard to gauge memory usage when first adding a new table. Scalability to other ways of working, e.g. real-time.
- Users love using filters , bookmarking them and they "feel like an analyst with all the power" including being able to export that one big table.
- Dropdown on bar charts to select what is charted (useful to only perform one query rather than separate).
- Click / Double-click to zoom in on data.
Proposed UI
- Caching query is set on timer with target table:
- WIthin dashboard, can query "select * from tbltest" within CacheDB.
- Notice it only updates on same time as cache query timer.
- Data is periodically saved to folder and only permitted tables are accessible from each dashboard.