Jonas Mueller comments

Results: 180 comments of Jonas Mueller

threshold for classifier

If you would like to use a different threshold, please use `predict_proba()` and then apply your own threshold to decide how to map these predicted class-probabilities into predicted class-labels.
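A minimal sketch of the idea above, assuming a binary problem where you have already pulled the positive-class probabilities out of the `predict_proba()` output. The helper name `labels_from_proba` and the example numbers are hypothetical, not part of any library:

```python
import numpy as np

def labels_from_proba(proba_positive, threshold=0.5):
    """Map positive-class probabilities to 0/1 labels using a custom threshold."""
    proba_positive = np.asarray(proba_positive, dtype=float)
    # Predict the positive class when its probability reaches the threshold.
    return (proba_positive >= threshold).astype(int)

# Hypothetical probabilities, e.g. the positive-class column of predict_proba() output.
example_proba = np.array([0.10, 0.35, 0.50, 0.80])
default_labels = labels_from_proba(example_proba)       # threshold 0.5
strict_labels = labels_from_proba(example_proba, 0.7)   # stricter threshold 0.7
```

Raising the threshold trades recall for precision on the positive class, so it is usually worth choosing it on held-out data rather than on the training set.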

threshold for classifier

@rxjx I think we haven't adjusted the threshold internally simply due to lack of bandwidth :) If you'd like to help contribute this, that would be awesome! The steps needed...

Awesome to hear you're interested 👍 I agree graphical representations would be nice to show. Note that we do already offer confusion-matrix functionality inside our `evaluate_predictions()` function. You can also...

threshold for classifier

Yes, I think an initial PR can just add this functionality for binary classification only, since it's more straightforward and the most common use-case. For other problem-types you could...

Plotting for model performance metrics (ROC curve, AUC, Precision-Recall)?

There are no built-in methods for this. But you can easily do it via sklearn: https://stackoverflow.com/questions/25009284/how-to-plot-roc-curve-in-python
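As a rough sketch of the sklearn route, assuming you already have true binary labels and positive-class scores (the toy values below are made up for illustration), `roc_curve` and `auc` from `sklearn.metrics` give the points to plot. The `plot_roc` helper is hypothetical and assumes matplotlib is installed:

```python
from sklearn.metrics import auc, roc_curve

# Toy labels and positive-class scores, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# False/true positive rates at each score threshold, and the area under the curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)

def plot_roc(fpr, tpr, roc_auc, path="roc.png"):
    """Save a ROC plot to `path` (hypothetical helper; needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, label=f"ROC (AUC = {roc_auc:.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_roc(fpr, tpr, roc_auc)` writes the figure to disk. For a real model, `y_score` would be the predicted probability of the positive class.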

Plotting for model performance metrics (ROC curve, AUC, Precision-Recall)?

@schinto I'm not sure which visualizers will work directly with autogluon-tabular, which would require minor changes, and which would require major changes. Note that autogluon does implement the key...

Time-based crossvalidation in AutoGluon Tabular?

@Innixma time-based crossvalidation or train/val split does not necessarily correspond to time-series. It may still be useful for regular supervised learning where rows are independent, just not identically distributed over...
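To make the time-based train/val split concrete, here is a minimal sketch that holds out the most recent rows for validation. It assumes each row carries a sortable timestamp. The function name `time_based_split` and the `val_fraction` parameter are hypothetical and not part of AutoGluon:

```python
import numpy as np

def time_based_split(timestamps, val_fraction=0.2):
    """Return (train_idx, val_idx) with the most recent rows held out for validation."""
    timestamps = np.asarray(timestamps)
    # Row indices ordered from oldest to newest.
    order = np.argsort(timestamps, kind="stable")
    # Hold out at least one row.
    n_val = max(1, int(round(len(order) * val_fraction)))
    return order[:-n_val], order[-n_val:]

# Toy timestamps: the rows stamped 4 and 5 are the newest.
train_idx, val_idx = time_based_split([5, 1, 4, 2, 3], val_fraction=0.4)
```

Unlike a random split, every validation row here is newer than every training row, which better reflects deployment when the data distribution drifts over time.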

Align predict_proba behavior on regression problems

I’m fine with either way. I think it’s not that confusing if predict_proba just calls predict for regression, and unlikely to introduce bugs unless the user didn’t even realize they’re...

Length of labels in Quick start codeblock

Hi, I've just merged a PR to address this confusion. Hopefully that will clarify things a bit! https://github.com/cleanlab/cleanlab/pull/331

Length of labels in Quick start codeblock

Labels should have length `N`, where `N` = the number of examples in your dataset. Each example should have one label, which is an integer in {0,1,...,K-1} for a dataset...
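The format described above can be checked with a small sketch like the following. It assumes `K` classes encoded as the integers 0 to K-1. The helper `check_labels` is hypothetical and not a cleanlab function:

```python
import numpy as np

def check_labels(labels, n_examples, n_classes):
    """Return True if labels is a length-N vector of integers in {0, ..., K-1}."""
    labels = np.asarray(labels)
    # One label per example.
    if labels.ndim != 1 or len(labels) != n_examples:
        return False
    # Labels must be integer class indices.
    if not np.issubdtype(labels.dtype, np.integer):
        return False
    return bool(((labels >= 0) & (labels < n_classes)).all())

ok = check_labels([0, 2, 1, 1], n_examples=4, n_classes=3)   # valid labels
bad = check_labels([0, 3, 1], n_examples=4, n_classes=3)     # wrong length, label out of range
```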