Brian Wylie
```
ds = DataSource(self.output_uuid, force_refresh=True)
ds.details()
ds.sample_df()
ds.quartiles()
ds.outliers()
ds.value_counts()
```
Right now the logic just takes the top 10 rows for each column, puts them together, and then calls drop_duplicates(). The issue with this is that when the same rows rank highest in several columns, you miss the opportunity to pull different outliers.
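To make the trade-off concrete, here is a minimal pandas sketch of the behavior described above, next to one possible alternative. Both function names, the `columns` argument, and the use of `nlargest` as the "top 10" criterion are assumptions for illustration, not the actual SageWorks implementation.

```python
import pandas as pd


def top_n_outliers_naive(df: pd.DataFrame, columns: list, n: int = 10) -> pd.DataFrame:
    """Take the n largest rows per column, concatenate, then drop duplicates.

    If the same rows rank highest in several columns, the result collapses
    to far fewer than n * len(columns) rows.
    """
    picks = [df.nlargest(n, col) for col in columns]
    return pd.concat(picks).drop_duplicates()


def top_n_outliers_distinct(df: pd.DataFrame, columns: list, n: int = 10) -> pd.DataFrame:
    """Hypothetical alternative: exclude rows that were already picked.

    Each column selects its n largest rows from the rows that remain,
    so every column contributes different outliers.
    Assumes df has a unique index.
    """
    chosen = []
    remaining = df
    for col in columns:
        pick = remaining.nlargest(n, col)
        chosen.append(pick)
        remaining = remaining.drop(index=pick.index)
    return pd.concat(chosen)
```

With two perfectly correlated columns, the naive version returns only 10 rows in total, while the distinct version returns 10 per column.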
The algorithm will have multiple options:
- add NEW columns with NaNs
- throw weird/bad values into NEW columns
- add/substitute NaNs into EXISTING columns
- throw weird/bad values into...
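The first three options could be sketched roughly as follows. The function names, the `frac` and `seed` parameters, and the particular set of "weird" values are all assumptions made for illustration. The issue does not specify them.

```python
import numpy as np
import pandas as pd


def add_nan_column(df, name="new_nan_col", frac=0.2, seed=0):
    """Add a NEW numeric column where roughly `frac` of the values are NaN."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    values = rng.normal(size=len(out))
    values[rng.random(len(out)) < frac] = np.nan
    out[name] = values
    return out


def add_weird_column(df, name="new_weird_col", seed=0):
    """Add a NEW column filled with extreme or sentinel-like values (assumed set)."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    weird = np.array([np.inf, -np.inf, 1e308, -999.0, 0.0])
    out[name] = rng.choice(weird, size=len(out))
    return out


def inject_nans(df, column, frac=0.2, seed=0):
    """Substitute NaNs into roughly `frac` of an EXISTING numeric column."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    mask = rng.random(len(out)) < frac
    # Cast to float so that integer columns can hold NaN
    out[column] = out[column].astype("float64").mask(mask)
    return out
```

Each helper returns a copy, so the original DataFrame stays available for comparing results before and after the corruption.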
We have some nice logic that generates an Athena query and then uses that query-id to hop to exactly the right query when clicking on the hyperlink in the datasource/featureset...
There are cases where we want to plot categorical data, so there may be approaches like 'swarm' or 'bee' plots that use jitter to show points that would otherwise be...
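As a rough illustration of the jitter idea, the sketch below maps each category to an integer position and adds a small random offset, so points in the same category spread out along the axis. The function name and the `width` and `seed` parameters are hypothetical. A true swarm plot places points deterministically to avoid collisions, which this simple version does not attempt.

```python
import numpy as np
import pandas as pd


def jittered_positions(categories, width=0.3, seed=0):
    """Return (x positions, category labels) for a jittered categorical axis.

    Each category gets an integer slot (0, 1, 2, ...) in sorted order, and
    every point is shifted by uniform noise in [-width, width).
    """
    codes, labels = pd.factorize(pd.Series(categories), sort=True)
    rng = np.random.default_rng(seed)
    offsets = rng.uniform(-width, width, size=len(codes))
    return codes + offsets, list(labels)
```

The returned positions can be passed as the x values of any scatter plot, with the labels used as tick text at positions 0, 1, 2, and so on.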
As part of our longer-term vision, we plan on having 'bespoke', highly customized applications. So this is a place where we're going to capture links/resources to the examples that...
Interesting way to do presentations, also a good example of badges/ci/cd/etc... https://pypi.org/project/manim/
https://python.langchain.com/en/latest/index.html
This project looks interesting. In which ways does it overlap with SageWorks, and in which ways is it complementary? :) https://github.com/runllm/aqueduct
Right now the SageWorks AWS Service Broker polls and pulls information about AWS services. That's working out fine, but in general polling seems suboptimal. Let's investigate a notification...