WIP Use score in tree hyperparameter notebook

Open glemaitre opened this issue 2 years ago • 0 comments

This PR isolates the call to `score` in the hyperparameter notebook. It is linked to this comment: https://github.com/INRIA/scikit-learn-mooc/pull/464#discussion_r768893865

In this PR, we should therefore address the concern of @ogrisel:

I don't see the point of measuring the scores only on the training set. Here we speak about hyper-parameter tuning, so it would be confusing to display only the training score. I think this notebook needs to be reworked to do a train-test split, and the plots should display both training and test errors, or neither.

Maybe the plots should be duplicated to each do 2 subplots: one with the prediction function displayed on top of a scatter plot of the samples of the training set (with the training score in the title) and another with the same prediction function displayed on top of a scatter plot of the samples of the testing set (with the testing score in the title).

And then we should comment on those scores to summarize the impact of the hyper-parameters in terms of the overfitting / underfitting trade-off.
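
To make the proposal concrete, here is a minimal sketch of the two-subplot layout. It assumes a synthetic 1D regression problem and a hypothetical helper named `plot_train_test`; neither comes from the notebook, which uses its own dataset and variable names. The score is the R2 returned by `DecisionTreeRegressor.score`.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1D data: an assumption for illustration, not the notebook's dataset.
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.randn(200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def plot_train_test(max_depth):
    """Hypothetical helper: fit a tree, then plot it over both data splits."""
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
    tree.fit(X_train, y_train)
    train_score = tree.score(X_train, y_train)
    test_score = tree.score(X_test, y_test)

    # Evaluate the prediction function on a regular grid to draw it as a line.
    grid = np.linspace(X.min(), X.max(), 500).reshape(-1, 1)
    fig, axes = plt.subplots(ncols=2, figsize=(10, 4), sharey=True)
    splits = [
        ("Training", X_train, y_train, train_score),
        ("Testing", X_test, y_test, test_score),
    ]
    for ax, (name, X_part, y_part, score) in zip(axes, splits):
        ax.scatter(X_part[:, 0], y_part, alpha=0.5)
        ax.plot(grid[:, 0], tree.predict(grid), color="tab:orange")
        ax.set_title(f"{name} set, max_depth={max_depth}, R2={score:.2f}")
    return fig, train_score, test_score


# One figure per hyperparameter value; compare the two scores in each title.
for depth in (1, 3, 30):
    fig, train_score, test_score = plot_train_test(depth)
    print(f"max_depth={depth}: train={train_score:.2f}, test={test_score:.2f}")
```

A training score that stays close to the testing score while both are low would suggest underfitting, whereas a training score far above the testing score would suggest overfitting. Call `plt.show()` to display the figures outside a notebook.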

glemaitre avatar Jan 06 '22 08:01 glemaitre