spark-sklearn
Multiple scorers
The newest scikit-learn docs (http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) say that:
scoring: For evaluating multiple metrics, either give a list of (unique) strings or a dict with names as keys and callables as values.
However, spark-sklearn does not seem to accept the scoring parameter in the form of a list.
Is there an easy way to accomplish that (submitting several scorers, e.g. scoring = ['accuracy', 'f1', 'roc_auc', 'average_precision'])? Or is it perhaps on your roadmap? If not, approximately how much time would the implementation take for somebody new to the project (like me)?
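For reference, this is roughly the behaviour the docs describe in plain scikit-learn. It is only a minimal sketch, assuming scikit-learn 0.19 or later; the toy data, estimator, and parameter grid are made up for illustration, and nothing here uses spark-sklearn.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Toy binary classification data (illustrative only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Several metrics given as a list of unique scorer names.
scoring = ['accuracy', 'f1', 'roc_auc', 'average_precision']

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={'C': [0.1, 1.0, 10.0]},
    scoring=scoring,
    refit='accuracy',  # with multiple metrics, refit must name one metric (or be False)
    cv=3,
)
search.fit(X, y)

# cv_results_ holds one 'mean_test_<name>' entry per requested metric.
for name in scoring:
    print(name, search.cv_results_['mean_test_' + name])
```

With multiple metrics, scikit-learn reports each one under its own key in cv_results_, and refit decides which metric picks the best estimator.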
Hi @spaszek, thanks for reporting this potential improvement. I'm afraid we have very limited bandwidth to work on spark-sklearn, though it would be good to know if others need this feature so that we can prioritize.
@jkbradley Fortunately I have my own bandwidth to spare for a good cause. Could I implement this feature? :)
I would also really like this feature.
@spaszek Is there any update on this? Have you already implemented it? Thanks.
Dear @jkbradley,
I'm pretty sure that many of us really need this feature, since we want to measure model performance using multiple metrics. This makes sense for classification, where you usually want to measure accuracy, precision, and recall at once. I hope you will reconsider giving this a higher priority. Thank you!
@hadyan-tvlk Unfortunately no. I'm still waiting for @jkbradley to respond on whether Databricks even accepts other people's PRs... ;P
I would really like this feature, too. Is there any plan to include it soon?