Simon Brugman

Results: 195 comments by Simon Brugman

Correlations between categorical-categorical and categorical-numerical variable pairs are already available (e.g. check the Phik correlation)! Regarding the interactions, however, there are indeed multiple options to extend the functionality.

Tracked this down to this line: https://github.com/ydataai/pandas-profiling/blob/develop/src/pandas_profiling/model/alerts.py#L103

The alert messages are currently defined in the file in the previous comment for JSON, while for the HTML report they are defined in the [alert templates](https://github.com/ydataai/pandas-profiling/tree/develop/src/pandas_profiling/report/presentation/flavours/html/templates/alerts). For consistency, the...

@josephramon Thanks for reporting this issue. With the latest version of the package and the code below, I could not reproduce the issue (Kendall's is disabled, and no warning). The...

[cuDF](https://github.com/rapidsai/cudf) would be a candidate backend for this. Since the codebase is already refactored to support multiple backends, and will soon support Spark, only the cuDF-specific operations need to be...

It's still computing (see *); it is just slow for these 28x28 = 784 plots. You can either turn this off or limit it to 28xn, where n = the number of target...
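The arithmetic behind that advice can be sketched as follows. This is only an illustration: the helper `interaction_plot_count` is a hypothetical name, not part of pandas-profiling, and it assumes one scatter plot per (column, target) pair.

```python
def interaction_plot_count(n_columns, n_targets=None):
    """Count scatter plots, assuming one plot per (column, target) pair.

    Hypothetical helper for illustration only; not a pandas-profiling function.
    """
    if n_targets is None:
        # No restriction: every column is paired with every column.
        n_targets = n_columns
    return n_columns * n_targets


full = interaction_plot_count(28)        # all 28 columns against all 28
limited = interaction_plot_count(28, 2)  # 28 columns against 2 chosen targets
print(full, limited)
```

Under that assumption, restricting the targets to two columns reduces the work from 784 plots to 56.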

@enesMesut This will be improved in the next version. For now, either turn the scatterplots off (`interactions={'continuous': False}`) or select the particular columns you are interested in (`interactions={'targets':['col1', 'col2']}`).
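A minimal sketch of how the two settings quoted above might be passed along. The keyword arguments are taken from the comment; the wrapper `build_report` and the constant names are hypothetical, and the exact behaviour may depend on the installed pandas-profiling version.

```python
# Settings quoted in the comment above, kept as plain dictionaries.
DISABLE_SCATTERPLOTS = {"interactions": {"continuous": False}}
LIMIT_TO_TARGETS = {"interactions": {"targets": ["col1", "col2"]}}


def build_report(df, **settings):
    """Hypothetical wrapper: forwards the settings to ProfileReport."""
    # Imported lazily so the settings can be inspected without the package installed.
    from pandas_profiling import ProfileReport

    return ProfileReport(df, **settings)
```

With a DataFrame `df` that contains columns named `col1` and `col2`, calling `build_report(df, **LIMIT_TO_TARGETS)` would then request scatterplots only for those targets.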

Could you provide a dataset to reproduce? Did you test against prior versions of this package?

What settings are you using? There is always a trade-off between performance and which statistics are generated. For instance, on a dataset similar to the one you mention, with the `minimal=True`...

@arita37 Sounds good. Would you be interested in contributing a PR and working out the sketched solution?