Adam Li
If it helps the discussion, I found this [old issue](https://github.com/microsoft/LightGBM/issues/2921) in LightGBM, which seems to reflect their docs (I'm unsure because I can't find a specific line mentioning how they...
I imagine there is some overhead due to the potential to query an RNG many times. Though this would also potentially be an edge case, since it implies there are...
I think a warning would be nice, but an error message might be overkill, because when a NaN pops up in the testing dataset (but not in training), ideally our use...
Responding to @glemaitre

> In terms of an application use case, I'm also wondering if we should not error/warn if a user starts to provide missing values at test time...
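To make the suggestion concrete, here is a minimal sketch of warning rather than erroring when missing values first appear at predict time. This is not scikit-learn's actual implementation; the helper name and the `seen_missing_during_fit` flag are hypothetical.

```python
import warnings

import numpy as np


def check_unexpected_missing(X, seen_missing_during_fit):
    """Hypothetical helper: warn when NaNs show up only at predict time."""
    X = np.asarray(X, dtype=float)
    if not seen_missing_during_fit and np.isnan(X).any():
        warnings.warn(
            "X contains missing values at predict time, but none were seen "
            "during fit; missing values will be routed by a fallback rule.",
            UserWarning,
        )
```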
Relevant paper showing empirical evidence that sending samples to the majority child node is not as good as "random" when the sample contains an unseen category during training. In...
I spoke with @betatim today about this issue, and to summarize, I think a good strategy is the following (hopefully he agrees :p): if no missing values are encountered during training,...
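As a rough illustration of the "flip a coin" rule, here is a sketch only; the routing function is made up for the example and is not scikit-learn's tree traversal code.

```python
import numpy as np

rng = np.random.default_rng(0)


def go_left_when_missing(rng):
    # "Flip a coin": route a sample whose feature value is missing to the
    # left or right child with equal probability, used when the split saw
    # no missing values during training.
    return rng.random() < 0.5


print([go_left_when_missing(rng) for _ in range(5)])
```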
> > If no missing values are encountered during training, then flip a coin and set the missing value traversal to be random.
>
> Is there literature that backs this...
I'm happy to help address this issue. For the scikit-learn core devs, can someone educate me on why we should store proportions? I see a few paths forward: 1. when...
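For context on the counts-versus-proportions question, here is a minimal sketch, assuming a scikit-learn version where `tree_.value` stores per-node class proportions, of how (weighted) counts could still be recovered via `tree_.weighted_n_node_samples`:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Assuming tree_.value stores class proportions (shape: n_nodes, n_outputs,
# n_classes), rescaling by the weighted number of samples reaching each node
# recovers the weighted class counts.
proportions = clf.tree_.value
counts = proportions * clf.tree_.weighted_n_node_samples[:, np.newaxis, np.newaxis]
print(counts[0])  # root node: roughly the class distribution of y, [50., 50., 50.]
```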
> * first fix the example to have a consistent explanation and make `value` in `plot_tree` the same as `tree_.value` (you probably need fixed-width formatting similar to `{tree_.value:.3f}` to...
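On the formatting point, `plot_tree` already exposes a `precision` parameter that controls how many digits are shown in the node boxes; a usage sketch (not the fix itself) would look like:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

fig, ax = plt.subplots(figsize=(8, 5))
# precision sets the number of digits used for value, impurity and threshold,
# which helps keep the displayed `value` consistent with `tree_.value`.
plot_tree(clf, precision=3, ax=ax)
plt.show()
```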
We can wait to see the sparse refactor then. I think a warning message is still warranted when the user calls the relevant functions (not during import).
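A minimal sketch of "warn on use, not on import" (the function name and message are hypothetical, not an actual scikit-learn API):

```python
import warnings


def use_sparse_feature(X):
    # Hypothetical example: emit the warning only when the relevant function
    # is actually called, rather than at module import time.
    warnings.warn(
        "Sparse input support is pending a refactor; behavior may change.",
        FutureWarning,
    )
    return X
```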