Andreas Mueller
The problem with integer is that if you use the same CV class for regression and classification, you don't know whether you should bin or not.
By binning I meant fixed bin width. If you have 101 samples and put them into 20 bins, most bins will not have 5 samples in them. The sorting would...
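To make the fixed-width concern concrete, here is a rough sketch in plain Python. The helper names (`fixed_width_labels`, `equal_frequency_labels`) and the skewed toy target are made up for illustration and are not existing scikit-learn code. It just counts how many samples end up in each bin under the two schemes.

```python
import math
from collections import Counter


def fixed_width_labels(y, n_bins):
    """Assign each target to one of n_bins equal-width intervals over [min(y), max(y)].

    Assumes y is not constant (otherwise the bin width would be zero).
    """
    lo, hi = min(y), max(y)
    width = (hi - lo) / n_bins
    # The maximum lands exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in y]


def equal_frequency_labels(y, n_bins):
    """Assign bins by rank, so bin sizes differ by at most one sample."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // n
    return labels


# Hypothetical skewed regression target with 101 samples.
y = [math.exp(0.03 * i) for i in range(101)]

fixed_counts = Counter(fixed_width_labels(y, 20))
rank_counts = Counter(equal_frequency_labels(y, 20))

print("fixed width:    ", [fixed_counts.get(b, 0) for b in range(20)])
print("equal frequency:", [rank_counts.get(b, 0) for b in range(20)])
```

On skewed data like this, the fixed-width scheme crowds the low end and leaves the upper bins with fewer samples than a 5-fold split needs, while rank-based bins stay balanced by construction.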
Exactly. But adding a separate object for regression seems like a nice and easy extension.
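As a rough idea of what a separate regression object could look like, here is a minimal sketch. The class name `SortedStratifiedKFold` is hypothetical, and the round-robin-over-sorted-targets strategy is just one possible choice, not something settled in this thread. It only mirrors the `split(X, y)` convention of the existing splitters and ignores shuffling and groups.

```python
import math


class SortedStratifiedKFold:
    """Hypothetical splitter that stratifies a continuous target by rank."""

    def __init__(self, n_splits=5):
        self.n_splits = n_splits

    def split(self, X, y):
        n = len(y)
        order = sorted(range(n), key=lambda i: y[i])
        # Deal the sorted samples out round-robin, so every fold receives
        # one sample from each consecutive run of n_splits target values.
        fold_of = [0] * n
        for rank, i in enumerate(order):
            fold_of[i] = rank % self.n_splits
        for k in range(self.n_splits):
            test = [i for i in range(n) if fold_of[i] == k]
            train = [i for i in range(n) if fold_of[i] != k]
            yield train, test


# Hypothetical skewed regression target with 101 samples; X is unused here.
y = [math.exp(0.03 * i) for i in range(101)]
folds = list(SortedStratifiedKFold(n_splits=5).split(None, y))

for train, test in folds:
    print(len(train), len(test), round(sum(y[i] for i in test) / len(test), 2))
```

Keeping this in its own class means the classification splitter never has to guess whether an integer target should be binned.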
Helper functions are fine in `_split.py` if they are only used there.
@iFe1er looking for references. Feel free to do the literature search or provide benchmarks. I feel like it's a "sensible" thing to do, so I'm not sure what exactly our...
@jnothman I'm curious about the text-book reference. In a statistics book? Pretty sure not in an ML book (unless it was inspired by scikit-learn lol). Can you provide the argument...
> You don't think stratified K fold for classification is in textbooks?

Not in any machine learning text-book I know of, I think? Not in ESL, Bishop, Kuhn, Murphy, Barber,...
Check out:

- [scikit-optimize](https://scikit-optimize.github.io/)
- [spearmint](https://github.com/HIPS/Spearmint)
- [GPyOpt](https://sheffieldml.github.io/GPyOpt/)
- [RoBo](http://automl.github.io/RoBO/index.html)
- [SMAC3](http://www.ml4aad.org/algorithm-configuration/smac/)
I agree with @GaelVaroquaux above in that this is kind of an unsolved (and maybe unsolvable) problem. I think we should document the caveats but include the feature. You will...
I agree on not adding ``set_scorer``. Also agree with the rest of what @adrinjalali suggested.

> I would say yes but it means that we need to properly define what...