[Feature] DWH Insights - Reports for operational use cases
Because Elementary has access to the DWH logs, we can analyze them and create reports that go beyond lineage and health tracking.
Use cases we plan to support:
- Asset importance - Score datasets based on the number of dependencies, read queries, users, etc. This can help identify the most important datasets.
- Usage visibility:
  - Update activity - Report how frequently each dataset is updated, including large gaps (probable SLA breaches), trends, etc.
  - Usage activity - Report how frequently each dataset is read (read queries), including users, trends, etc.
- Cost and performance optimization:
  - Cleanup recommendations - Report datasets that are not used, to reduce storage costs and operational overhead.
  - Job performance - Report on the performance of recurring queries. This helps identify deteriorating queries and changes in resource consumption, so teams can prioritize development efforts and optimize operations.
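To make the asset importance idea above more concrete, here is a minimal sketch of one possible scoring approach. Everything in it is an assumption for illustration: the `DatasetStats` fields, the weights, and the log scaling are hypothetical and are not an existing Elementary API or a committed design.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class DatasetStats:
    """Hypothetical per-dataset counters derived from DWH query logs."""
    name: str
    downstream_dependencies: int
    read_queries: int
    distinct_users: int


# Illustrative weights only (assumption); real weights would need tuning.
WEIGHTS = {
    "downstream_dependencies": 0.5,
    "read_queries": 0.3,
    "distinct_users": 0.2,
}


def importance_score(stats: DatasetStats) -> float:
    """Weighted sum of log-scaled counters, so one very large counter does not dominate."""
    return sum(
        weight * math.log1p(getattr(stats, field))
        for field, weight in WEIGHTS.items()
    )


def rank_datasets(all_stats: list[DatasetStats]) -> list[tuple[str, float]]:
    """Return (name, score) pairs, most important first."""
    scored = [(s.name, importance_score(s)) for s in all_stats]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A report could then list the output of `rank_datasets(...)` as is, or bucket the scores into tiers such as high / medium / low.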
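The update activity and cleanup recommendation use cases could be sketched in a similar way. The example below assumes that update timestamps and last-read timestamps per dataset have already been extracted from the DWH logs. The function names, the `tolerance` factor, and the idle threshold are hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def find_update_gaps(
    update_times: list[datetime],
    expected_interval: timedelta,
    tolerance: float = 1.5,
) -> list[tuple[datetime, datetime]]:
    """Return (previous, next) update pairs spaced further apart than
    expected_interval * tolerance, i.e. probable SLA breaches."""
    ordered = sorted(update_times)
    threshold = expected_interval * tolerance
    return [
        (prev, nxt)
        for prev, nxt in zip(ordered, ordered[1:])
        if nxt - prev > threshold
    ]


def find_unused_datasets(
    last_read: dict[str, datetime | None],
    now: datetime,
    max_idle: timedelta,
) -> list[str]:
    """Return datasets never read, or not read within max_idle, sorted by name."""
    return sorted(
        name
        for name, ts in last_read.items()
        if ts is None or now - ts > max_idle
    )
```

For a dataset expected to update daily, `find_update_gaps(times, timedelta(days=1))` flags any pair of consecutive updates more than 36 hours apart.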
As a first step, the insights can be provided as CSV / JSON files or written into tables in the DWH.
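As a rough illustration of the file-based output, the sketch below serializes insight rows to CSV and JSON text using only the Python standard library. The row layout (`dataset`, `read_queries`) and the function names are assumptions, not a proposed schema.

```python
from __future__ import annotations

import csv
import io
import json


def insights_to_csv(rows: list[dict]) -> str:
    """Serialize insight rows to CSV text; columns are taken from the first row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def insights_to_json(rows: list[dict]) -> str:
    """Serialize insight rows to pretty-printed JSON text."""
    return json.dumps(rows, indent=2, sort_keys=True)
```

The same rows could be loaded into a DWH table instead, which would make the insights queryable alongside the data they describe.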
In the future, based on feedback and usage, we can add reports to the UI and create automated workflows.
Feedback
We would love to hear any feedback, comments, or requests about this feature! In particular, tell us which use cases are valuable to you, and whether there are others you would like us to address.