Discussion: apply column filter to similarity table in TableReport
As @lesteve rightly pointed out, the fact that the column filter doesn't apply to the table of column similarities is surprising. This PR filters the similarity table: the row containing the similarity of column A and column B only appears if either A or B matches the column filter. There is also a special case for the "Columns with high similarity" filter: only rows corresponding to the high similarities appear
I also noted in the code there is a mix of "interactions", "associations" and "similarities" as our preferred term for this has evolved, I'll address that in a separate PR
I think that I would prefer graying out the selector on this tab than applying it. The reason is that I can see someone moving to the tab and not seeing columns that are highly similar
On Jul 24, 2024, 08:00, at 08:00, "Jérôme Dockès" @.***> wrote:
As @lesteve rightly pointed out, the fact that the column filter doesn't apply to the table of column similarities is surprising. This PR filters the similarity table: the row containing the similarity of column A and column B only appears if either A or B matches the column filter. There is also a special case for the "Columns with high similarity" filter: only rows corresponding to the high similarities appear
I also noted in the code there is a mix of "interactions", "associations" and "similarities" as our preferred term for this has evolved, I'll address that in a separate PR You can view, comment on, or merge this pull request online at:
https://github.com/skrub-data/skrub/pull/1009
-- Commit Summary --
- apply column filter to similarity table
-- File Changes --
M skrub/_reporting/_data/templates/dataframe-columns.css (2) M skrub/_reporting/_data/templates/dataframe-interactions.html (93) M skrub/_reporting/_data/templates/no-filter-matches.html (4) M skrub/_reporting/_data/templates/skrub-report.js (46) M skrub/_reporting/_summarize.py (2) M skrub/_reporting/js_tests/cypress/e2e/column-filters.cy.js (14)
-- Patch Links --
https://github.com/skrub-data/skrub/pull/1009.patch https://github.com/skrub-data/skrub/pull/1009.diff
-- Reply to this email directly or view it on GitHub: https://github.com/skrub-data/skrub/pull/1009 You are receiving this because you are subscribed to this thread.
Message ID: @.***>
@GaelVaroquaux is this what you have in mind?
I'm not sure which I like better, from the user's point of view it's not obvious why the filter they chose would be deactivated when they visit this tab.
OTOH applying the filter isn't super useful, I don't really see a use case for "all similarities involving a numeric column" for example
not seeing columns that are highly similar
I see what you mean but arguably that's precisely the goal of the filter: not to see columns that don't match it. so it might make sense to have it behave consistently across all 3 tabs. Note the default is to show all columns and this only changes if the user manually goes and selects something else.
maybe yet another option could be to gray out the columns that don't match the filter. I'm not sure I like it either: it might not be immediately obvious why some cells are grayed out, and we lose the benefit of saving space by not showing them
:thinking:
In the future we might add more tabs for which filtering columns doesn't make sense; that would be an argument for the "gray-out rather than apply" option.
also, instead of graying it out we could hide it altogether, that might hint better that it doesn't apply (rather than is disabled), and also if users don't see the dropdown they are less likely to notice it and wonder why it's disabled
instead of graying it out we could hide it altogether
you can check #1011 to see what that option would look like
here is the example.
I think this is my preferred option of the 3. WDYT @GaelVaroquaux ?
Personally as a very naive user that clicks around and see what happens, I agree that not having a dialog box at all (rather than one that is greyed out) seems the best option, i.e. https://github.com/skrub-data/skrub/pull/1011
I like the option of having the filter, it could be useful for (i) users that want the same columns they were seeing in the "Table preview" and "Column summaries" tabs when they go to "Column similarities" and (ii) users that have dataframes with many columns. If we're concerned about a user missing very similar columns, we could add a "beware: some very similar columns are not shown here" button
If we go with the other option of disabling this though, I agree with @lesteve that I'd rather have nothing than something greyed out
thanks for following up! ok let's merge the one that completely hides the dialog for now, and I'll leave this one open in case we want to discuss more / revisit when we add some more useful filters
it seems we're fine with the current status (hiding the select when in this tab) so i'll close this issue