Add multiple graph option

Open lcrmorin opened this issue 9 months ago • 5 comments

I usually use individual optbinning plots for data exploration, generated through a for loop. I was wondering whether optbinning could offer a default multi-plot. This would become my default function for data exploration. Currently, the best option is the pandas hist (see below).

[Image: Screenshot 2024-05-01 at 08 19 05]

It would be very nice to have such a plot with binning and target dependency for data exploration.

lcrmorin avatar May 01 '24 06:05 lcrmorin

What do your loops look like? Are you using a BinningProcess, getting each underlying variable, and using the associated table's plot method? In which case, we could just add a plot method to BinningProcess that does that? Should OptimalBinning and friends directly expose a plot as an intermediate?

bmreiniger avatar May 01 '24 13:05 bmreiniger

Hi @lcrmorin. Indeed, it would be a nice addition. Would you be willing to work on this feature?

I was trying to do it myself. Ultimately, my problem has more to do with positioning the plots on the grid than with optbinning itself.
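For the grid-positioning part, here is a minimal sketch that is independent of optbinning: given the number of variables and a preferred number of columns, it computes the grid shape and maps each plot index to a (row, column) cell. The helper names `grid_shape` and `grid_position` are hypothetical, chosen only for illustration.

```python
import math


def grid_shape(n_plots, n_cols=3):
    """Return (n_rows, n_cols) for a grid large enough to hold n_plots."""
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    # Never allocate more columns than there are plots.
    n_cols = min(n_cols, n_plots)
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols


def grid_position(index, n_cols):
    """Map a 0-based plot index to its (row, col) cell, filling row by row."""
    return divmod(index, n_cols)


# Example: 7 variables laid out on 3 columns.
n_rows, n_cols = grid_shape(7, n_cols=3)
positions = [grid_position(i, n_cols) for i in range(7)]
```

With 7 plots on 3 columns the last row is only partly filled, so the leftover cells would need to be hidden or removed when drawing.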

lcrmorin avatar May 05 '24 10:05 lcrmorin

I think adding an ax parameter to the tables' plot method (either an existing matplotlib Axes object, or None to create a new one; IIRC this is how pandas and sklearn both implement many of their plotting utilities) would be an improvement in general, and would also make this easier. BinningProcess would then just have to create the subplots and iterate over zip(axes, _binning_variables)? If I have some time I'll give a PR a go, but I'm happy to let someone else try instead.
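A rough sketch of that ax pattern, using plain matplotlib rather than optbinning's actual code: `plot_bins` is a hypothetical stand-in for a binning table's plot method, `plot_grid` is a hypothetical stand-in for what a BinningProcess-level plot could do, and the counts and event rates are made-up illustrative numbers.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def plot_bins(counts, event_rates, title="", ax=None):
    """Hypothetical stand-in for a binning table's plot method."""
    if ax is None:
        # No Axes supplied: create a new figure, as a standalone plot would.
        _, ax = plt.subplots()
    positions = list(range(len(counts)))
    ax.bar(positions, counts, color="tab:blue")
    ax.set_ylabel("count")
    ax.set_title(title)
    # Second y-axis for the target dependency (event rate per bin).
    rate_ax = ax.twinx()
    rate_ax.plot(positions, event_rates, color="tab:orange", marker="o")
    rate_ax.set_ylabel("event rate")
    return ax


def plot_grid(tables, n_cols=2):
    """Draw one panel per variable on a shared grid of subplots."""
    n_rows = -(-len(tables) // n_cols)  # ceiling division
    fig, axes = plt.subplots(
        n_rows, n_cols, squeeze=False, figsize=(5 * n_cols, 3.5 * n_rows)
    )
    flat = axes.ravel()
    for ax, (name, (counts, rates)) in zip(flat, tables.items()):
        plot_bins(counts, rates, title=name, ax=ax)
    # Hide any cells left over when the grid is not completely filled.
    for ax in flat[len(tables):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axes


# Made-up per-bin counts and event rates for three variables.
tables = {
    "age": ([120, 300, 180], [0.05, 0.12, 0.30]),
    "income": ([200, 250, 150], [0.25, 0.10, 0.04]),
    "tenure": ([90, 310, 200], [0.20, 0.15, 0.08]),
}
fig, axes = plot_grid(tables, n_cols=2)
```

Because `plot_bins` falls back to creating its own figure when `ax` is None, existing single-plot calls would keep working unchanged under this design.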

bmreiniger avatar May 07 '24 22:05 bmreiniger

I've started looking more seriously at the code, and I don't really use the 2d or pw or streaming binners; should they all support plotting?

In a BinningProcess, should each relevant statistics table have build run first (with default parameters, or should we try to pass keywords through based on the column type)?

bmreiniger avatar May 15 '24 22:05 bmreiniger