Regularization and Out-of-Sample Error
Hey there!
I was just curious about the question mark under $E_{out}$. Are we saying it is undetermined because it will depend on the extent of the change in $E_{in}$ and $|E_{out} - E_{in}|$, or can we confidently say that if the model was already overfitting the data then $E_{out}$ will go down, and if it was already underfitting the data then $E_{out}$ will go up?
In other words, can we make any claims about the extent of the change in $E_{in}$ and $|E_{in} - E_{out}|$ based on whether the model was already overfitting or underfitting?
Thanks!
Are we saying it is undetermined because it will depend on the extent of the change in $E_{in}$ and $|E_{out} - E_{in}|$
This is correct.
or can we confidently say that if the model was already overfitting the data then $E_{out}$ will go down and if it was already underfitting the data then $E_{out}$ will go up?
This is not necessarily correct. For example, you could currently be overfitting and so choose to add regularization. But you might add too much, overshoot, and end up underfitting "worse" than you were previously overfitting, resulting in a higher out-of-sample error even though you moved $\lambda$ in the right direction.
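A quick numerical sketch of that overshoot (not from the thread itself; it uses ridge regression on synthetic data, and the polynomial degree, noise level, and $\lambda$ values are all arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Noisy 1-D target: y = sin(pi * x) + Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def features(x, degree=10):
    # Degree-10 polynomial features: prone to overfit a small sample.
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: w = (X^T X + lam I)^{-1} X^T y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

x_tr, y_tr = make_data(15)    # small training set -> overfitting at lam = 0
x_te, y_te = make_data(2000)  # large held-out set as a proxy for E_out
X_tr, X_te = features(x_tr), features(x_te)

def eout(lam):
    # Test-set mean squared error for a given regularization strength.
    w = ridge_fit(X_tr, y_tr, lam)
    return np.mean((X_te @ w - y_te) ** 2)

for lam in [0.0, 0.01, 1.0, 1e4]:
    print(f"lambda={lam:g}  Eout={eout(lam):.3f}")
```

Sweeping $\lambda$ like this typically traces out a U-shaped curve for $E_{out}$: no regularization overfits, a moderate $\lambda$ helps, and a very large $\lambda$ (here $10^4$) shrinks the weights toward zero and underfits, so $E_{out}$ rises again even though each increase of $\lambda$ was "in the right direction" relative to the original overfitting.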
That makes sense, thank you!