sga() and create_binning() have different default nbins
Also, use different defaults for categorical (e.g. 4) and numerical (e.g. 8).
Hi I can work on this. This is my first contribution. A little guidance would be helpful.
Hi @aravs16 , thanks for volunteering, this issue is mostly about syncing up the default arguments in
https://github.com/zalando/expan/blob/dev/expan/core/experiment.py#L715
and
https://github.com/zalando/expan/blob/dev/expan/core/binning.py#L732
Ideally we could somehow distinguish between categorical and numerical bins, but if that's not possible, we may also completely leave out the argument nbins, since a random number probably would not make sense. Let us know if anything is unclear;-)
@jbao Thank you!
Can we make nbins = 8 in both places and then pass nbins//2 on this line: https://github.com/zalando/expan/blob/dev/expan/core/binning.py#L768 ?
hmm, I think we should make nbins transparent to the user, and not doing any post-processing within the function itself, what do you think?
Hi @jbao, in my opinion, we should set nbins=None in definitions of sga() and creat_binning(). And add post-processing ≈ if nbins == None: nbins = 8 if is_numeric else 4 followed with a warning.
Hi everyone,
I found using None as a flag for default (as opposite to undefined) value quite pervasive in ExpAn (for example, None in place of KPI names or feature names implies "do all"). Was it a deliberate design decision? What happens to those options where we want None to signalize the absence of information, rather than the intention to use default values?
I think we should specify the default value explicitly in the argument if we have one (e.g. in the case of nbins), and use None to imply that the function will take all possible input values (e.g. in the case of kpi_subset). What do you think?