
TableReport ENH: focus on insights found in EDA

Open sylvaincom opened this issue 1 year ago • 3 comments

Problem Description

Skrub's TableReport is great for EDA (Exploratory Data Analysis). When I do EDA, I scroll through the TableReport UI and find insights: for example, the target has some class imbalance, or a feature has missing values. But once I have explored the TableReport, there is no easy way for me to emphasize the insights I got from it.

Feature Description

I would like to emphasize the insights I found during EDA on the skrub TableReport.

For example, in this Kaggle dataset about flight delays, I want to emphasize the fact that the target dep_delayed_15min has some class imbalance:

[Screenshot 2024-11-27 at 11 47 54]

One solution could be that each subplot has a "select to emphasize" option I can click, and at the end I get a more concise TableReport containing only what I want to emphasize. It would be a kind of "select, but keep for later", exposed as a TableReportSummary or TableReportInsights.

An insight could also be a set of rows in the associations table that I want to emphasize, for example because they are counter-intuitive (so I would need to be able to click on rows, not just subplots).
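To make the idea more concrete, here is a minimal sketch of what such a selection could hold. Everything in it is hypothetical: the `Insights` class, its method names, and the `feature_a` / `feature_b` column names are illustrations only and are not part of skrub.

```python
from dataclasses import dataclass, field


@dataclass
class Insights:
    """Hypothetical container for things flagged during EDA (not a skrub API)."""

    columns: dict = field(default_factory=dict)       # column name -> note
    associations: dict = field(default_factory=dict)  # (col, col) pair -> note

    def emphasize_column(self, name, note=""):
        self.columns[name] = note
        return self

    def emphasize_association(self, left, right, note=""):
        # Sort the pair so that (a, b) and (b, a) end up in the same entry.
        self.associations[tuple(sorted((left, right)))] = note
        return self

    def selected_columns(self):
        """Every column mentioned by an insight, in first-seen order."""
        seen = dict.fromkeys(self.columns)
        for pair in self.associations:
            seen.update(dict.fromkeys(pair))
        return list(seen)


# Flag one column and one row of the associations table.
# "feature_a" and "feature_b" are placeholder column names.
insights = (
    Insights()
    .emphasize_column("dep_delayed_15min", "class imbalance")
    .emphasize_association("feature_b", "feature_a", "counter-intuitive")
)
```

With something like this, a concise report could in principle be built from `df[insights.selected_columns()]`, with the notes displayed next to the corresponding plots or rows.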

Alternative Solutions

For the class imbalance, with the current skore, users could select their feature of interest, click on it (copy-paste), then run

TableReport(df["dep_delayed_15min"])

but what I am talking about is more general: an insight could also be a set of rows in the associations table that I would like to emphasize.

Additional Context

No response

sylvaincom · Nov 27 '24 10:11

Thanks @sylvaincom, that's interesting! I can certainly see the use case: I've noticed something, I want to come back to it later or share it with someone, and I would like something better than a screenshot.

Do you have any thoughts about how the information should be emphasized, and how the user should specify what to emphasize?

jeromedockes · Nov 29 '24 12:11

A bit unrelated and just thinking about this now, should we also have an easy way to dump/save the TableReport? Currently you have to get the HTML and write it yourself, but I could easily see a simpler API for this. WDYT?
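For reference, the manual approach described here could look roughly like the sketch below. The helper name `save_report_html` is hypothetical, and the code only assumes that the report object exposes an `html()` method returning the page as a string.

```python
from pathlib import Path


def save_report_html(report, path):
    """Write a report's HTML to ``path`` and return the path.

    Hypothetical helper: ``report`` is assumed to expose an ``html()``
    method that returns a full HTML page as a string.
    """
    path = Path(path)
    path.write_text(report.html(), encoding="utf-8")
    return path
```

Usage would then be something like `save_report_html(TableReport(df), "report.html")`; a simpler API could wrap exactly this step.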

Vincent-Maladiere · Nov 29 '24 13:11

@Vincent-Maladiere why not! That could be a nice "good first issue" for the upcoming sprint.

jeromedockes · Nov 29 '24 13:11