feat: Add 'Show raw data' checkbox to Explore widgets
Overview
Closes #168
Description of changes
This commit adds a 'Show raw data' checkbox to the Explore... widgets in the seismometer package.
When the checkbox is enabled, the underlying pandas.DataFrame used to produce the current visualization is displayed. The raw data output updates reactively when any widget controls (e.g., dropdowns, sliders, filters) change.
To achieve this, the following changes were made:
- The UpdatePlotWidget in src/seismometer/controls/explore.py was updated to include the 'Show raw data' checkbox.
- The ExplorationWidget in the same file was modified to handle the display of the raw data.
- The plot functions in src/seismometer/api/plots.py and src/seismometer/api/explore.py were updated to return a tuple of (HTML, pd.DataFrame).
- A new caching decorator disk_cached_html_and_df_segment was created to handle caching both the HTML and the DataFrame.
- Tests in tests/controls/test_explore.py were updated to reflect these changes.
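The caching change listed above can be pictured with a minimal sketch. This is not the actual disk_cached_html_and_df_segment implementation, which is not shown in this description; the decorator name `cache_html_and_frame`, the `cache_dir` parameter, and the pickle-based storage are all assumptions made for illustration. It only shows the general idea of caching an (HTML, DataFrame) pair on disk, keyed by the call arguments.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_html_and_frame(cache_dir=None):
    """Hypothetical decorator: cache a function's (html, frame) pair on disk.

    `cache_dir` is an assumed parameter; it defaults to a folder under the
    system temp directory.
    """
    base = Path(cache_dir) if cache_dir else Path(tempfile.gettempdir()) / "html_df_cache_sketch"

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a stable key from the function name and the repr of its arguments.
            raw_key = repr((func.__qualname__, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
            path = base / f"{key}.pkl"

            if path.exists():
                # Cache hit: return the stored (html, frame) tuple without recomputing.
                with path.open("rb") as fh:
                    return pickle.load(fh)

            # Cache miss: compute, store both parts together, then return them.
            html, frame = func(*args, **kwargs)
            base.mkdir(parents=True, exist_ok=True)
            with path.open("wb") as fh:
                pickle.dump((html, frame), fh)
            return html, frame

        return wrapper

    return decorator
```

Storing the pair in one file keeps the rendered HTML and the data it was built from in sync, so a cache hit can never return a plot with a mismatched frame. Pickle is used here only because it can serialize a pandas.DataFrame as well as plain Python objects.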
Author Checklist
- [x] Linting passes; run early with pre-commit hook.
- [x] Tests added for new code and issue being fixed.
- [x] Added type annotations and full numpy-style docstrings for new methods.
- [x] Draft your news fragment in new changelog/ISSUE.TYPE.rst files; see changelog/README.md.
First, so sorry for the delay in reviewing!
I'm struggling a little to understand the goal of the MR / deficiency-gap that is being closed.
What is the scenario where you'd want to check that "Show raw data" box?
A couple of the questions I'm trying to resolve:
- Is the raw data displaying the correct information?
  - If you look at ExploreCohortEvaluation, you'll actually see a large frame of various metrics per threshold. Is this the correct "raw" data?
- Is it in a useful form?
  - If you look at ExploreModelEvaluation in the example dataset, it attempts to display a 99340x39 dataframe, with most rows and columns hidden. Did the displayed (potentially PHI) data answer the original question, or was the answer suppressed in the hidden rows and columns? A number of these may be untouched by the visual itself.
- Should this option be a default for all explore controls, or is more targeted usage (and perhaps easy templated extension) more appropriate?
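To make the "useful form" question concrete, here is a rough sketch of a narrower display: showing only the columns a visual actually reads, capped at a few rows. The helper `frame_for_visual`, its parameters, and the synthetic frame are hypothetical and not part of seismometer; this is one possible shape for a more targeted option, not a proposal for the exact API.

```python
import pandas as pd


def frame_for_visual(frame, used_columns, max_rows=20):
    """Hypothetical helper: keep only the columns a visual reads, capped at `max_rows` rows."""
    missing = [col for col in used_columns if col not in frame.columns]
    if missing:
        raise KeyError(f"columns not in frame: {missing}")
    # Select the requested columns in the given order, then keep the first rows.
    return frame.loc[:, list(used_columns)].head(max_rows)


# Synthetic wide frame standing in for a large evaluation dataset (made-up data).
wide = pd.DataFrame({f"col_{i}": range(1000) for i in range(39)})

# Only two columns feed the imagined visual, so only those are shown.
preview = frame_for_visual(wide, ["col_0", "col_5"])
print(wide.shape, preview.shape)
```

A view like this avoids relying on the default truncated rendering of a very wide frame, where the columns of interest may be among the hidden ones.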
Alternate paths depending on the needs:
- I'm wondering if something more like the "show code" output would be useful? The idea being to return the data object so that manipulation can be done.
- Would more tabular-focused controls be helpful? Expanding on ExploreAnalyticsTable (potentially broken in my naive rebuild of your change) and/or the Fairness analysis.
- Is #149 helping close this gap with its debug log on transformation and filtering? Could it be extended for your needs?