Results 396 comments of Jérôme Dockès

I think we can merge this PR. If at some point we are happy enough with the online visualizer to add it to the docs we can do another PR....

> Failing test (a quick look suggests that it is related to the PR).

Yes, I'm on it :) I made a small change on how labels are rotated to...

> Awesome. I definitely see myself using this functionality

Any suggestion on a short description of those columns for the drop-down? "Columns with high similarity"?

Some examples of the kind of cleaning the TableVectorizer does:

```python
>>> import pandas as pd
>>> from skrub import TableVectorizer
>>> skrubber = TableVectorizer(
...     high_cardinality_transformer="passthrough",
...     low_cardinality_transformer="passthrough",
...     ...
```

Sounds good. Why do we need the `height: unset`?

As skrub has a focus on supervised learning, having an optional `target` parameter for the TableReport that causes it to show slightly different information might be a good idea, and...

The case where I have a list of formats (not just one) but the default list used by pandas is not adequate sounds a bit niche to warrant the added...

> The reason I thought of this was to address the case in which datetimes are using locale-specific formats, e.g. French day/month names, which I don't think are parsed...
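
To make the locale issue in the quote above concrete, here is a minimal sketch using only the Python standard library. It is not skrub's or pandas' implementation: the `FRENCH_MONTHS` mapping and both helper functions are hypothetical names introduced for illustration, and the sketch assumes dates written as "day month-name year". Under the default C locale, `datetime.strptime` with `%B` only recognizes English month names, so a French date needs either a locale change or an explicit translation step like the one below.

```python
from datetime import datetime

# Hypothetical lookup table (illustration only): French month name -> month number.
FRENCH_MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4,
    "mai": 5, "juin": 6, "juillet": 7, "août": 8,
    "septembre": 9, "octobre": 10, "novembre": 11, "décembre": 12,
}


def parse_french_date(text):
    """Parse a 'day month-name year' string such as '14 juillet 2023'."""
    day, month_name, year = text.split()
    return datetime(int(year), FRENCH_MONTHS[month_name.lower()], int(day))


def parses_with_strptime(text, fmt="%d %B %Y"):
    """Return True if strptime accepts `text` under the current locale."""
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # The result for the French string depends on the active LC_TIME locale.
    print(parses_with_strptime("14 July 2023"))
    print(parses_with_strptime("14 juillet 2023"))
    print(parse_french_date("14 juillet 2023"))
```

An explicit mapping avoids changing the process-wide locale, at the cost of having to maintain one table per language.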

> @GaelVaroquaux was suggesting to implement SquashingScaler directly instead of with the indirection through SingleColumnSquashingScaler. I'll need to check how easy that is...

That's an option too, I'll let you...

> Your point Jerome is about memory overhead?

Yes, because we go from columnar format, to one contiguous array, then back to columnar (dataframe format). That is the case even...